An overview of secure data services offered through EOSC-hub, by Abdulrahman Azab, Francesca Iozzi and Antti Pursula
Research often relies on personal data as a basis for scientific analysis. A particular challenge is to use these data resources without violating privacy. This requires secure digital infrastructures that comply with both national and European regulations.
The EOSC-hub project is working on services for sensitive data through two partners: Sigma2 / University of Oslo in Norway, and CSC in Finland.
CSC offers the ePouta secure cloud infrastructure, which provides customer organisations with a virtual private cloud connected to the customer's own infrastructure through a secure virtual private network. This private area of the ePouta cloud is accessible only via a customer-provided endpoint, giving the customer full control and the flexibility to bring in its own software environment. CSC ePouta has 8000 hyperthreading compute cores across about 130 servers. Most of the servers have 256 or 384 GB of RAM, but there are also high-memory servers with up to 1.5 TB of RAM, as well as GPU nodes. Altogether there is close to 2 PB of storage within the same infrastructure.
The University of Oslo provides TSD, the Norwegian e-Infrastructure for sensitive data storage and management. TSD provides sensitive data services directly to researchers and groups in the form of SaaS (Software as a Service) and PaaS (Platform as a Service). Each TSD project has a separate VLAN and is accessed through two-factor authentication. Currently there are 507 projects in TSD. TSD supports data storage, web forms, high-performance computing (HPC), audio/video streaming and analysis, and software management for Windows and Linux platforms.
TSD and ePouta can be used for secure data processing and for reliably sharing data with collaborators. The work within the EOSC-hub project aims to extend these services to data discovery through non-sensitive metadata, increasing the possibilities for data reuse.
Sensitive data services in practice
A research team may want to record videos of interviews, for example, to study both the spoken language and the gestures and facial expressions of the interviewees. This data cannot be shared in open access, as individuals can be directly identified from the recordings. The team can upload the data to ePouta or TSD, and invite collaborators from other institutes to access the data on the same secure platform. In the future the researchers could also publish metadata through the EUDAT B2SHARE and B2FIND services, allowing other researchers to discover the data set and request access permissions from the original data providers.
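The idea of publishing only non-sensitive metadata while the recordings stay on the secure platform can be sketched roughly as follows. This is a minimal illustration, not the B2SHARE or B2FIND API: the record layout, the field names and the `public_metadata` helper are all hypothetical and chosen only to show the separation between discoverable and restricted information.

```python
from dataclasses import dataclass, asdict

# Hypothetical allow-list of fields considered safe to publish.
# A real project would derive this from its own data management plan.
PUBLIC_FIELDS = {"title", "description", "language", "contact_email", "access_policy"}


@dataclass
class DatasetRecord:
    """Hypothetical description of a sensitive data set (illustration only)."""
    title: str
    description: str
    language: str
    contact_email: str
    access_policy: str
    participant_names: list  # sensitive: directly identifies individuals
    storage_path: str        # sensitive: location inside the secure platform


def public_metadata(record: DatasetRecord) -> dict:
    """Return only the allow-listed fields; everything else stays private."""
    return {key: value for key, value in asdict(record).items() if key in PUBLIC_FIELDS}


# Example values are invented for this sketch.
record = DatasetRecord(
    title="Interview recordings: speech and gesture",
    description="Video interviews for studying spoken language and gestures.",
    language="en",
    contact_email="data-owner@example.org",
    access_policy="Access on request to the original data providers",
    participant_names=["Participant A", "Participant B"],
    storage_path="/secure/project/interviews",
)

published = public_metadata(record)
```

Using an allow-list rather than a block-list means that a newly added field stays private until someone explicitly marks it as publishable, which is the safer default for sensitive data.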