This is the website of an H2020 project that concluded in 2019 and established the core elements of EOSC. The project's results live on at www.eosc-portal.eu and www.egi.eu

Data processing & Data analysis

EGI Jupyter Notebooks tutorial

Materials for the tutorial that was given in Taipei on Apr 2, 2019. The tutorial was 3x90 minutes long.

The webpage and abstract of the tutorial are available at https://indico4.twgrid.org/indico/event/8/session/9/?slotId=0#20190402

ECAS Training Repositories

Repository for training/demo materials

Training on the INDIGO/DEEP/XDC Services

The DEEP-Hybrid-DataCloud project aims to promote the integration of specialized and expensive hardware under a Hybrid Cloud platform, so that researchers from different communities can use it on demand.

The XDC project aims to address high-level topics ranging from the federation of storage resources with standard protocols, policy-driven data management based on Quality of Service, data lifecycle management, metadata handling and manipulation, and data preprocessing and encryption during ingestion, to smart caching solutions among remote locations.

This training session will provide a practical overview of the solutions implemented at the IaaS, PaaS and SaaS levels within the projects INDIGO-DataCloud, eXtreme DataCloud (XDC) and DEEP-HybridDataCloud.

EOSC-hub Data Platforms for data processing and solutions for publishing and archiving scientific data - Part II

The main objective of this session is to demonstrate how end-users can perform data analysis on large volumes of data and produce reusable results following the FAIR principles. During this training track, the latest features of the EGI DataHub will also be introduced, including its interoperability with the EGI Jupyter Notebooks and the EUDAT B2Handle and B2Find services.

This training track is relevant for researchers, IT support people, and service providers who operate services for Open Science.

Jupyterhub Deployment - hands-on training

Jupyter provides a powerful environment for expressing research ideas as notebooks, where code, text and visualizations are easily combined on an interactive web frontend. JupyterHub makes it possible to deploy a multi-user service where users can store and run their own notebooks without installing anything on their computers. This is the technology behind the EGI Notebooks service and other similar Jupyter-based services for research.

In this training we will demonstrate how to deploy a JupyterHub instance for your users on top of Kubernetes and explore some of the customisations that can improve the service for your users, such as integration with authentication services or with external storage systems. After this training, attendees will be able to deploy their own instance of JupyterHub on their facilities.

Target audience: Resource Center/e-Infrastructure operators willing to provide a Jupyter environment for their users.

Pre-requisites: basic knowledge of command-line interface on Linux.

The Elastic Cloud Computing Cluster (EC3)

Elastic Cloud Computing Cluster (EC3) is a tool to create elastic virtual clusters on top of Infrastructure as a Service (IaaS) providers, either public (such as Amazon Web Services, Google Cloud or Microsoft Azure) or on-premises (such as OpenNebula and OpenStack). We offer recipes to deploy TORQUE (optionally with MAUI), SLURM, SGE, HTCondor, Mesos, Nomad and Kubernetes clusters that can be self-managed with CLUES: it starts with a single-node cluster and working nodes will be dynamically deployed and provisioned to fit increasing load (number of jobs at the LRMS). Working nodes will be undeployed when they are idle. This introduces a cost-efficient approach for Cluster-based computing.
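
The elasticity rule described above (deploy working nodes as the LRMS queue grows, undeploy them when idle) can be illustrated with a small Python sketch. This is not the CLUES implementation: the ClusterState fields, the plan_scaling function, and the jobs_per_node and max_nodes parameters are hypothetical names chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of a cluster; not a CLUES data structure."""
    queued_jobs: int  # jobs waiting in the LRMS queue
    busy_nodes: int   # working nodes currently running jobs
    idle_nodes: int   # working nodes that are up but running nothing


def plan_scaling(state, jobs_per_node=1, max_nodes=10):
    """Return (nodes_to_deploy, nodes_to_undeploy) for one decision cycle.

    Assumed policy: idle nodes absorb queued jobs first, extra nodes are
    deployed for the remainder (up to max_nodes in total), and idle nodes
    that are still not needed are undeployed.
    """
    current = state.busy_nodes + state.idle_nodes
    needed = math.ceil(state.queued_jobs / jobs_per_node)
    absorbed = min(needed, state.idle_nodes)
    to_deploy = max(min(needed - absorbed, max_nodes - current), 0)
    to_undeploy = state.idle_nodes - absorbed
    return to_deploy, to_undeploy


if __name__ == "__main__":
    # Single-node cluster with no workers yet and five queued jobs.
    print(plan_scaling(ClusterState(queued_jobs=5, busy_nodes=0, idle_nodes=0)))
    # Empty queue with three idle workers.
    print(plan_scaling(ClusterState(queued_jobs=0, busy_nodes=1, idle_nodes=3)))
```

A real deployment would also have to account for node boot time and for how long a node must stay idle before it is removed, both of which this sketch ignores.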

BioExcel Summer School on Biomolecular Simulations 2019

On this page, you can find the links to the HADDOCK and metadynamics (using Gromacs) tutorials given during the BioExcel Summer School on Biomolecular Simulations 2019 in Pula, Italy.

The first tutorial demonstrates the use of cross-linking data from mass spectrometry to guide protein-protein docking in HADDOCK.

The second tutorial illustrates how metadynamics can be used to sample conformations of a binding pocket; these conformations are subsequently used for docking a ligand with HADDOCK. The conformational sampling follows the EDES approach described in the following publication:

DisVis web server Tutorial

DisVis is software developed in our lab to visualise and quantify the information content of distance restraints between macromolecular complexes. It is open-source and available for download from our GitHub repository. To facilitate its use, we have developed a web portal for it.

This tutorial demonstrates the use of the DisVis web server. The server makes use of either local resources on our cluster, using the multi-core version of the software, or GPGPU-accelerated grid resources of the EGI to speed up the calculations. It only requires a web browser to work and benefits from the latest developments in the software based on a stable and tested workflow. Next to providing an automated workflow around DisVis, the web server also summarises the DisVis output highlighting relevant information and providing a first overview of the interaction space between the two molecules with images autogenerated in UCSF Chimera.

The case we will be investigating is the interaction between two proteins of the 26S proteasome of S. pombe, PRE5 (UniProtKB: O14250) and PUP2 (UniProtKB: Q9UT97). For this complex, seven experimentally determined cross-links (4 ADH & 3 ZL) are available (Leitner et al., 2014). We added two false positive restraints - it is your task to try to identify these! To do this, we use DisVis to filter out the false positive restraints while assessing the true interaction space between the two chains. We will then use the interaction analysis feature of DisVis, which allows for a more complete analysis of the residues putatively involved in the interaction between the two molecules. To do so, we will extract all accessible residues of the two partners and give the list of residues to DisVis. Finally, we will show how the restraints can be provided to HADDOCK in order to model the 3D interaction between the two partners.
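
To make the idea of a restraint being consistent with a given arrangement of the two partners more concrete, here is a minimal Python sketch. It is not the DisVis algorithm, which systematically searches the interaction space; it only checks, for a single hypothetical pose, which distance restraints fall within their allowed range. The function name, the residue numbers, the coordinates and the distance bounds are all invented for illustration and are unrelated to the PRE5/PUP2 data.

```python
import math


def check_restraints(coords_a, coords_b, restraints):
    """Report which distance restraints a single pose satisfies.

    coords_a, coords_b: dicts mapping a residue number to (x, y, z) in
        angstroms, one dict per partner (hypothetical input format).
    restraints: list of (res_a, res_b, min_dist, max_dist) tuples.
    Returns a dict mapping (res_a, res_b) to True or False.
    """
    results = {}
    for res_a, res_b, min_dist, max_dist in restraints:
        distance = math.dist(coords_a[res_a], coords_b[res_b])
        results[(res_a, res_b)] = min_dist <= distance <= max_dist
    return results


if __name__ == "__main__":
    # Invented toy coordinates for two residues on each partner.
    partner_a = {10: (0.0, 0.0, 0.0), 25: (5.0, 0.0, 0.0)}
    partner_b = {7: (0.0, 0.0, 20.0), 40: (5.0, 0.0, 60.0)}
    # Invented restraints, each allowing a distance of 0 to 30 angstroms.
    toy_restraints = [(10, 7, 0.0, 30.0), (25, 40, 0.0, 30.0)]

    outcome = check_restraints(partner_a, partner_b, toy_restraints)
    for pair, ok in outcome.items():
        print(pair, "satisfied" if ok else "violated")
```

A restraint that is violated in most of the poses that satisfy the other restraints is a natural candidate for a false positive, which is the kind of reasoning the tutorial applies using the DisVis output.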

HADDOCK2.4 basic protein-protein docking tutorial

This tutorial will demonstrate the use of HADDOCK for predicting the structure of a protein-protein complex from NMR chemical shift perturbation (CSP) data. Namely, we will dock two E. coli proteins involved in glucose transport: the glucose-specific enzyme IIA (E2A) and the histidine-containing phosphocarrier protein (HPr).

The structures in the free form have been determined using X-ray crystallography (E2A) (PDB ID 1F3G) and NMR spectroscopy (HPr) (PDB ID 1HDN). The structure of the native complex has also been determined with NMR (PDB ID 1GGR).

These NMR experiments have also provided us with an array of data on the interaction itself (chemical shift perturbations, intermolecular NOEs, residual dipolar couplings, and simulated diffusion anisotropy data), which will be useful for the docking. For this tutorial, we will only make use of interface residues identified from NMR chemical shift perturbation data as described in Wang et al., EMBO J (2000).

How to apply bioinformatics to metallo-proteins

Table of Contents

• Sequence patterns and protein domains
• MetalPDB and related tools
• Structural Databases
• Structure refinement and protein dynamics
