
v2.0.1

Home

Resources

Source code
- MedCo Softwares
- Glowing Bear MedCo (forked from The Hyve)
- Documentation
Packages
- GitHub Packages

Contact

For assistance with deploying MedCo or any other technical questions, send an email to medco@epfl.ch or to any of the contributors.

Mickaël Misbach (Privacy and Security Software Engineer, EPFL)
Francesco Marino (Privacy and Security Software Engineer, EPFL) - francesco.marino@epfl.ch
Joao Andre Sa (Privacy and Security Software Engineer, EPFL) - joao.gomesdesaesousa@epfl.ch
Jean Louis Raisaro (Data Protection Specialist, CHUV) - jean.raisaro@chuv.ch
Juan Troncoso-Pastoriza (Post-Doctoral Researcher, EPFL) - juan.troncoso-pastoriza@epfl.ch
Jean-Pierre Hubaux (Professor, EPFL) - jean-pierre.hubaux@epfl.ch

License

MedCo is licensed under an End User Software License Agreement (‘EULA’) for non-commercial use. If you need more information, please contact us.

Releases

v2.0.1 - 14th April 2021

Bug fixes, quality of life improvements and UI polishing.

v2.0.0 - 24th March 2021

New major version of MedCo, it includes:

Explore new features:
- numeric range queries
- text queries
- i2b2 modifiers
- auto-selection of result type
- saving of cohorts
- patient list download
Analysis: add support for survival analysis
Many bug fixes and small improvements

v1.0.0 - 31st March 2020

First stable version
Externally reviewed and pen-tested
Bug fixes and enhancements

v0.3.1 - 6th March 2020

Several bug fixes and enhancements

v0.3.0 - 11th February 2020

Many corrections to comply with security reviews
Architecture changes: removal of PICSURE, new implementation of genomic annotations querying
Keycloak auto-provisioning
Support of several identity providers per MedCo node
Rework of Glowing Bear MedCo
Various smaller improvements, bug fixes and better stability

v0.2.1 - 15th August 2019

Implementation of additional query types (patient list, locally obfuscated count, global count, shuffled count)
Implementation of OIDC-based authorization model
Implementation of CLI client
Timers improvements
Upgrade of dependencies
Various smaller improvements and bug fixes

v0.2.0 - 3rd May 2019

Architecture revisit
Replaced medco-i2b2-cell by medco-connector
Upgrades (IRCT v1.4 to PICSURE v2.0, Onet suite v3, Keycloak, Nginx, PHP)
Consolidation of deployment
Many smaller fixes and enhancements

v0.1.1 - 23rd January 2019

Deployment for test purposes on several machines
Enhancements of documentation and deployment infrastructure
Nginx reverse proxy with HTTPS support
Keycloak update

v0.1.0 - 1st December 2018

First public release of MedCo, running with i2b2 v1.7, PIC-SURE/IRCT v1.4 and centralized OpenID Connect authentication. Deployment for development and test purpose on a single machine.

For System Administrators

Requirements

We recommend the following specifications for running MedCo:

Network Bandwidth: >100 Mbps (ideal), >10 Mbps (minimum), symmetrical
Ports Opening and IP Restrictions: see
Hardware
- CPU: 8 cores (ideal), 4 cores (minimum)
- RAM: >16 GB (ideal), >8GB (minimum)
- Storage: dependent on data loaded, >100GB
Software
- OS: Any flavor of Linux, physical or virtualized (tested with Ubuntu 16.04, 18.04, Fedora 29-33)
- Git
- OpenSSL
- Docker, version >= 18.09.1
- Docker-Compose, version >= 1.23.2
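
The minimum versions listed above can be compared against an installed version with a small helper. The following is a minimal sketch, assuming GNU sort with its version-ordering option (-V) is available; version_ge is a hypothetical helper name and is not part of MedCo.

```shell
# version_ge INSTALLED MINIMUM
# Succeeds (exit status 0) when INSTALLED is greater than or equal to MINIMUM.
# Relies on the version ordering of GNU sort (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage (the installed version string is supplied by you):
#   version_ge "20.10.5" "18.09.1" && echo "requirement met"
```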

Deployment

These pages explain how to deploy MedCo in different scenarios.

Each deployment scenario corresponds to a deployment profile, as described below. All these instructions use the deployment scripts from the repository.

If you are new to MedCo…

… and want to try to deploy the system on a single machine to test it, you should follow the guide.

… and want to create or join a MedCo network, you should follow the guide.

… and want to develop around MedCo, you should follow the guide.

Deployment Profiles

A deployment profile is composed of two things:

deployment files medco/deployments/<profile name>/: docker-compose file and parameters like ports to expose, log level, etc.
configuration files medco/deployments/<profile name>/configuration/: files mounted in the docker containers, containing the cryptographic keys, the certificates, etc.

Some profiles are provided by default, for development or testing purposes. Those should not be used in a production scenario with real data, as the private keys are set by default, thus not private. Other types of profiles must be generated using the script in medco/scripts/network-profile-tool/.

The different profiles are the following:

test-local-3nodes

for test on a single machine (used by the MedCo live demo)
3 nodes on any host
using the latest release of the source codes
no debug logging
profile pre-generated

network

for test or production deployment on several different hosts
a single node on a host part of a MedCo network
using the latest release of the source codes
no debug logging
profile must be generated prior to use with the provided scripts

dev-local-3nodes

for software development
3 nodes on the local host
using development version of source codes
debug logging enabled
profile pre-generated

Local Test Deployment

Deployment of profile test-local-3nodes.

This deployment profile comes with default pre-generated keys and default passwords. It is not meant to contain any real data nor be used in production. If you wish to do so, use instead the Network Deployment (network) deployment profile.

This test profile deploys 3 MedCo nodes on a single machine for test purposes. It can be used either on your local machine, or on any other machine to which you have access. The docker images used are the latest released versions. This profile is, for example, used for the MedCo public demo.

MedCo Stack Deployment

First step is to get the MedCo latest release and download the docker images. Adapt ${MEDCO_SETUP_DIR} to where you wish to install MedCo.

export MEDCO_SETUP_DIR=~/medco MEDCO_SETUP_VER=v2.0.1
git clone --depth 1 --branch ${MEDCO_SETUP_VER} https://github.com/ldsec/medco.git ${MEDCO_SETUP_DIR}
cd "${MEDCO_SETUP_DIR}/deployments/test-local-3nodes"
make pull

The default configuration of the deployment is suitable if the stack is deployed on your local host, and if you do not need to modify the default passwords. To change the default passwords check out this page. For the other settings, modify the file ${MEDCO_SETUP_DIR}/deployments/test-local-3nodes/.env to reflect your configuration. For example:

MEDCO_NODE_HOST=medco-demo.epfl.ch
MEDCO_NODE_HTTP_SCHEME=https

MEDCO_NODE_HOST should be the fully qualified domain name of the host, MEDCO_NODE_HTTP_SCHEME should be http or https.
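
The edit of the .env file described above can also be scripted. The following is a minimal sketch, assuming GNU sed (for in-place editing with -i); set_env_var is a hypothetical helper and not part of the MedCo deployment scripts.

```shell
# set_env_var FILE KEY VALUE
# Replaces the line "KEY=..." in FILE, or appends it if KEY is not present.
# Assumption: VALUE contains no '|', '&' or backslash (they are special to sed here).
set_env_var() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (adapt the values to your host):
#   env_file="${MEDCO_SETUP_DIR}/deployments/test-local-3nodes/.env"
#   set_env_var "$env_file" MEDCO_NODE_HOST medco-demo.epfl.ch
#   set_env_var "$env_file" MEDCO_NODE_HTTP_SCHEME https
```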

If you enable HTTPS, follow HTTPS Configuration to set up the needed certificates.

Final step is to run the nodes, all three will run simultaneously:

cd "${MEDCO_SETUP_DIR}/deployments/test-local-3nodes"
make up

Wait for the initialization of the containers to complete, that is until the message “i2b2-medco-srv… - Started x of y services (z services are lazy, passive or on-demand)” appears. This can take up to 10 minutes; subsequent runs start faster. To stop the containers, hit Ctrl+C in the active window.

You can use the command docker-compose up -d instead to run MedCo in the background and thus not keeping the console captive. In that case use docker-compose stop to stop the containers.
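
When the stack runs in the background, the wait for initialization can be scripted with a generic retry loop. The following is a minimal sketch; wait_until is a hypothetical helper, and probing the web front end with curl is only an assumed readiness signal (it does not prove that every i2b2 service has started).

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds after each failure.
# Succeeds as soon as COMMAND succeeds; fails if every attempt fails.
wait_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage: poll for up to 10 minutes (60 tries, 10 seconds apart).
#   wait_until 60 10 curl -fsS -o /dev/null "http://localhost"
```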

Keycloak Configuration

Only needed if you are deploying somewhere else than your local host. Otherwise the default configuration will work fine.

Follow the instructions for configuring the MedCo OpenID Connect client in Keycloak to be able to log in to Glowing Bear.

Test the deployment

In order to test that the local test deployment of MedCo is working, access Glowing Bear in your web browser at http(s)://${MEDCO_NODE_HOST} and use the default credentials specified in Keycloak user management. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video. You can also use the CLI client to perform tests.

By default MedCo loads a specific test dataset; refer to Description of the default test data for expected results to queries. To load a dataset, follow the guide Loading Data. To load some additional test data with a simple data loading, you can execute the following:

make load_test_data

Network Deployment

Deployment of profile test-network.

This profile deploys an arbitrary set of MedCo nodes independently on different machines that together form a MedCo network. This deployment assumes each node is deployed on a single dedicated machine. All the machines have to be reachable from each other. Nodes should agree on a network name and individual indexes beforehand (to be assigned a unique ID).

The next set of steps must be fully executed individually by each node of the network.

Pre-requisites

First step is to get the MedCo Deployment latest release at each node. Adapt ${MEDCO_SETUP_DIR} to where you wish to install MedCo.

export MEDCO_SETUP_DIR=~/medco MEDCO_SETUP_VER=v2.0.1
git clone --depth 1 --branch ${MEDCO_SETUP_VER} https://github.com/ldsec/medco.git ${MEDCO_SETUP_DIR}

Generation of the deployment Profile

Next the compose and configuration profiles must be generated using a script, executed in two steps.

Step 1: each node generates its keys and certificates, and shares its public information with the other nodes
Step 2: each node collects the public keys and certificates of all the other nodes

Step 1

For step 1, the network name ${MEDCO_SETUP_NETWORK_NAME} should be common to all the nodes. ${MEDCO_SETUP_NODE_DNS_NAME} corresponds to the machine domain name where the node is being deployed. As mentioned before the different parties should have agreed beforehand on the members of the network, and assigned an index ${MEDCO_SETUP_NODE_IDX} to each different node to construct its UID (starting from 0, to n-1, n being the total number of nodes).

export MEDCO_SETUP_NETWORK_NAME=example \
    MEDCO_SETUP_NODE_IDX=0 \
    MEDCO_SETUP_NODE_DNS_NAME=medconode0.example.com
cd "${MEDCO_SETUP_DIR}/scripts/network-profile-tool"
bash step1.sh ${MEDCO_SETUP_NETWORK_NAME} ${MEDCO_SETUP_NODE_IDX} ${MEDCO_SETUP_NODE_DNS_NAME}

This script will generate the compose profile and part of the configuration profile, including a file srv${MEDCO_SETUP_NODE_IDX}-public.tar.gz. This file should be shared with the other nodes, and all of them need to place it in their configuration profile folder (${MEDCO_SETUP_DIR}/deployments/test-network-${MEDCO_SETUP_NETWORK_NAME}-node${MEDCO_SETUP_NODE_IDX}/configuration).
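
The names produced by step 1 follow directly from the network name and the node index. The following sketch only reproduces the naming pattern stated above; profile_name and public_archive_name are hypothetical helpers, not functions of the profile tool.

```shell
# profile_name NETWORK_NAME NODE_IDX
# Name of the generated profile folder under ${MEDCO_SETUP_DIR}/deployments.
profile_name() {
    printf 'test-network-%s-node%s\n' "$1" "$2"
}

# public_archive_name NODE_IDX
# Name of the archive holding the public information of the given node.
public_archive_name() {
    printf 'srv%s-public.tar.gz\n' "$1"
}

# Usage:
#   config_dir="${MEDCO_SETUP_DIR}/deployments/$(profile_name example 0)/configuration"
```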

Step 2

Before proceeding to this step, you need to have gathered all the files srv${MEDCO_SETUP_NODE_IDX}-public.tar.gz from the persons deploying MedCo on the other nodes.
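
Whether all archives have been gathered can be checked before running step 2. The following is a minimal sketch, assuming the archives were placed in the configuration profile folder and that the total number of nodes is known; missing_public_archives is a hypothetical helper.

```shell
# missing_public_archives CONFIG_DIR NODE_COUNT
# Prints the name of every srv<idx>-public.tar.gz (idx from 0 to NODE_COUNT-1)
# that is absent from CONFIG_DIR. Fails if at least one archive is missing.
missing_public_archives() {
    local dir="$1" count="$2" idx=0 missing=0
    while [ "$idx" -lt "$count" ]; do
        if [ ! -f "${dir}/srv${idx}-public.tar.gz" ]; then
            echo "srv${idx}-public.tar.gz"
            missing=1
        fi
        idx=$((idx + 1))
    done
    [ "$missing" -eq 0 ]
}

# Usage, for a network of 3 nodes:
#   missing_public_archives "<configuration profile folder>" 3 && echo "ready for step 2"
```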

Once all nodes have shared their srv${MEDCO_SETUP_NODE_IDX}-public.tar.gz file with all other nodes, step 2 can be executed:

cd "${MEDCO_SETUP_DIR}/scripts/network-profile-tool"
bash step2.sh ${MEDCO_SETUP_NETWORK_NAME} ${MEDCO_SETUP_NODE_IDX}

At this point, it is possible to edit the default configuration generated in ${MEDCO_SETUP_DIR}/deployments/test-network-${MEDCO_SETUP_NETWORK_NAME}-node${MEDCO_SETUP_NODE_IDX}/.env. This is needed in order to modify the default passwords. When editing this file, be careful to change only the passwords and not the other values.

The deployment profile is now ready to be used.

MedCo Stack Deployment

Next step is to download the docker images and run the node:

cd "${MEDCO_SETUP_DIR}/deployments/test-network-${MEDCO_SETUP_NETWORK_NAME}-node${MEDCO_SETUP_NODE_IDX}"
make pull
make up

Wait some time for the initialization of the containers to be done, this can take up to 10 minutes. For the subsequent runs, the startup will be faster. You can use make stop to stop the containers and make down to delete them.

Keycloak Configuration

You will need to follow two sets of instructions to make Keycloak functional and be able to log in. Access the Keycloak administration interface and then:

Update the MedCo OIDC client
Update the Keycloak realm keys

Test the deployment

Note that by default the certificates generated by the script are self-signed and thus, when using Glowing Bear, the browser will issue a security warning. To use your own valid certificates, see HTTPS Configuration. If you wish anyway to use the self-signed certificates, you will need to visit individually the page of Glowing Bear of all nodes in your browser, and select to trust the certificate.

The database is pre-loaded with some encrypted test data using a key that is pre-generated from the combination of all the participating nodes’ public keys. For the network deployment profile this data will not be correctly encrypted, since the public key of each node is generated independently, and, as such, the data must be re-loaded before being able to test the system successfully.

Run first the MedCo loader (see Loading Data) to load some data and be able to test this deployment. Or to load some test data by performing a simple data loading you can execute the following:

make load_test_data

Then access Glowing Bear in your web browser at https://${MEDCO_SETUP_NODE_DNS_NAME} and use the default credentials specified in Keycloak user management. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video. You can also use the CLI client to perform tests.

Configuration

This set of pages provides configuration instructions for MedCo. Note that not all of them are always needed; follow one of the deployment instructions to know which ones are.

Passwords

It is important to choose strong unique passwords before a deployment, even more so if it contains real data or if it is exposed to the internet.

Passwords Configuration

In each compose profile you will find a .env file containing configuration options. Among them are the passwords to be set. Note that most of those passwords configured that way will only work on a fresh database. Example:

POSTGRES_PASSWORD=postgres_password
PGADMIN_PASSWORD=pgadmin_password
KEYCLOAK_PASSWORD=keycloak_password
I2B2_WILDFLY_PASSWORD=i2b2_wildfly_password
I2B2_SERVICE_PASSWORD=i2b2_service_password
I2B2_USER_PASSWORD=i2b2_user_password
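
Strong random values for these variables can be generated rather than chosen by hand. The following is a minimal sketch, assuming /dev/urandom and the standard head, tr and cut utilities; gen_password and print_password_lines are hypothetical helpers, and the output still has to be copied into the .env file of a fresh deployment.

```shell
# gen_password [LENGTH]
# Prints a random alphanumeric string (default: 32 characters).
# Reads 1024 random bytes, which is ample for lengths up to about 100.
gen_password() {
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${1:-32}"
}

# print_password_lines
# Prints one KEY=VALUE line for each password variable of the example above.
print_password_lines() {
    local key
    for key in POSTGRES_PASSWORD PGADMIN_PASSWORD KEYCLOAK_PASSWORD \
               I2B2_WILDFLY_PASSWORD I2B2_SERVICE_PASSWORD I2B2_USER_PASSWORD; do
        printf '%s=%s\n' "$key" "$(gen_password)"
    done
}
```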

PostgreSQL administration user

POSTGRES_PASSWORD configures the password for the postgres administration user of the PostgreSQL database.

PgAdmin user

PGADMIN_PASSWORD configures the password for the admin user of the PgAdmin web interface. Note that it is necessary to set it only if your deployment profile deploys this tool.

Keycloak administration user

KEYCLOAK_PASSWORD configures the password for the keycloak administration user of the default master realm of Keycloak.

As of v1.0.0, the provisioning of the configuration of Keycloak has changed and this setting is not effective. After the initial deployment, you must login to the administration interface with the default password (keycloak) and change it.

I2b2 Wildfly administration user

I2B2_WILDFLY_PASSWORD configures the password for the admin user of the wildfly instance hosting i2b2.

I2b2 service user

I2B2_SERVICE_PASSWORD configures the password for the AGG_SERVICE_ACCOUNT user of i2b2, used to operate background automated tasks by the i2b2 services.

I2b2 default user

I2B2_USER_PASSWORD configures the password for the default i2b2 and demo users used by MedCo.

Keycloak

Here follow some MedCo-specific instructions for the administration of Keycloak. For anything else, please refer to the Keycloak Server Administration Guide. Not all of these instructions need to be followed for every deployment; refer to the deployment guide to know which ones are important.

For a production deployment, it is crucial to change the default keys and credentials.

Accessing the web administration interface

You can access the Keycloak administration interface at http(s)://<node domain name>/auth/admin. For example, if MedCo is deployed on your local host, you can access it at http://localhost/auth/admin. Use the admin default credentials if you have just deployed MedCo.

User Management

Default users

The default configuration shipped with the MedCo deployments comes with several users.

Admin user

The default admin user has full admin access to Keycloak, but no access rights to MedCo. Its credentials are:

User keycloak
Password keycloak (unless configured otherwise through the .env file)

Test users

They all have the password test and have different authorizations that are obvious from their names.

User test: this user has all the authorizations to run all types of MedCo explore queries. It defaults to the highest authorization, patient_list.
User test_explore_count_global
User test_explore_count_global_obfuscated
User test_explore_count_per_site
User test_explore_count_per_site_obfuscated
User test_explore_count_per_site_shuffled
User test_explore_count_per_site_shuffled_obfuscated
User test_explore_patient_list
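
The relation between the names of the default test users and their authorizations can be made explicit. The following sketch simply strips the test_explore_ prefix; authorization_of_test_user is a hypothetical helper, and the actual authorizations remain those configured in Keycloak.

```shell
# authorization_of_test_user USERNAME
# Prints the explore query authorization suggested by the name of a default test user.
authorization_of_test_user() {
    case "$1" in
        test_explore_*) printf '%s\n' "${1#test_explore_}" ;;
        test)           echo "patient_list" ;;  # holds all authorizations, the highest applies
        *)              return 1 ;;             # not one of the default test users
    esac
}

# Usage:
#   authorization_of_test_user test_explore_count_global
```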

Add a user

Go to the configuration panel Users, click on Add user.
Fill the Username field, toggle to ON the Email Verified button and click Save.
In the next window, click on Credentials, enter twice the user’s password, toggle to OFF the Temporary button if desired and click Reset Password.

Give query permissions to a user

Go to the configuration panel Users, search for the user you want to give authorization to and click on Edit.
Go to the Role Mappings tab, and select medco (or another client ID set up for the MedCo OIDC client) in the Client Roles.
Add the roles you wish to give the user, each of the roles maps to a query type.

MedCo Default Settings

medco OpenID Connect client

The default Keycloak configuration provides an example of a fully working configuration for deployments on your local host. In other cases, you will need to modify this configuration.

Access the configuration panel of the MedCo client by going to the Clients tab, and click on the medco client. Then, in the Settings tab, fill Valid Redirect URIs to reflect the following table (you can delete the existing entries):

| Deployment Profile | Valid Redirect URIs |
| --- | --- |
| test-local-3nodes | http(s)://<node domain name>/* |
| test-network + prod-network | https://<node domain name>/* |
| dev-local-3nodes | http://localhost:4200/* |

In the same tab, fill Web Origins with + and save.
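
The mapping above between deployment profiles and redirect URIs can be written down as a lookup. The following is a minimal sketch; redirect_uri_for_profile is a hypothetical helper that only prints the value to enter in Keycloak, and the default scheme for test-local-3nodes is an assumption.

```shell
# redirect_uri_for_profile PROFILE NODE_DOMAIN [SCHEME]
# Prints the Valid Redirect URI to configure for the given deployment profile.
# SCHEME (http or https) is only used for test-local-3nodes; assumed default: http.
redirect_uri_for_profile() {
    local profile="$1" domain="$2" scheme="${3:-http}"
    case "$profile" in
        test-local-3nodes)           printf '%s://%s/*\n' "$scheme" "$domain" ;;
        test-network*|prod-network*) printf 'https://%s/*\n' "$domain" ;;
        dev-local-3nodes)            echo 'http://localhost:4200/*' ;;
        *)                           echo "unknown profile: $profile" >&2; return 1 ;;
    esac
}
```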

Securing a production deployment

Changing default passwords

Both keycloak and test users come with default passwords. For a production deployment they need to be changed:

Go to the configuration panel Users, click on View all users.
For each of the users you want to change the password of:
- Click on Edit, then go to the Credentials tab.
- Enter the new password of the user.
- Optionally toggle to OFF the Temporary button; if ON, the user will need to update their password at the next login.
- Click on Reset Password.

Changing default realm keys

The example configuration comes with default keys. They have to be changed for a network deployment where there are several Keycloak instances.

Go to the configuration panel Realm Settings, then to the Keys tab and Providers subtab.
Click on Add keystore... and add the three following providers:
- aes-generated
  - Console Display Name: aes-medco
  - Priority: 100
- hmac-generated
  - Console Display Name: hmac-medco
  - Priority: 100
- rsa-generated
  - Console Display Name: rsa-medco
  - Priority: 100
Finally, delete all the other key providers listed that you did not just add. They should be named xxx-generated. Note that it is normal if you get logged out during the operation, just log back in and continue the process.

Enabling brute force detection

Go to the configuration panel Realm Settings, then to the Security Defenses tab and Brute Force Detection subtab.
Toggle to ON the Enabled button.
Fill the following:
- Max Login Failures: 3
- Wait Increment: 30 Seconds
Save the configuration.

Setting Authorizations

This page guides you on how to set authorizations to users through Keycloak.

You will find below the documentation for each authorization available in MedCo. Follow this section to know how to modify those authorizations for your users.

Authorizations

REST API Authorizations

Those authorizations allow the user to interact with API endpoints of the MedCo connector.

The minimum set of authorizations needed for users to use MedCo is the following:

medco-network
medco-explore
medco-genomic-annotations
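
A list of roles can be checked against this minimum set. The following is a minimal sketch; has_minimum_authorizations is a hypothetical helper working on role names passed as arguments, and it does not query Keycloak.

```shell
# has_minimum_authorizations ROLE...
# Succeeds when the given roles include the three minimum REST API authorizations.
has_minimum_authorizations() {
    local required role found
    for required in medco-network medco-explore medco-genomic-annotations; do
        found=0
        for role in "$@"; do
            if [ "$role" = "$required" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            return 1
        fi
    done
    return 0
}
```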

medco-network

This covers the calls related to the network metadata: list of nodes, keys, URLs, etc.

medco-explore

This covers the calls related to the building and requesting of explore queries and cohort saving. Note that an additional authorization among the explore query authorizations is needed to be able to make explore queries.

medco-genomic-annotations

This covers the genomic annotations auto-completion and the querying of genomic variants.

medco-survival-analysis

This covers the calls needed for requesting computations of survival curves.

Explore Query Authorizations

Those authorizations set the types of result users will be able to get when making an explore query.

Those authorizations are ordered according to their precedence. This means that if a user has several of them, the authorization with the highest level will be selected.

patient_list: exact counts and list of patient identifiers from all sites
count_per_site: exact counts from all sites
count_per_site_obfuscated: obfuscated counts from all sites
count_per_site_shuffled: exact counts from all sites, but without knowing which count came from which site
count_per_site_shuffled_obfuscated: obfuscated counts from all sites, but without knowing which count came from which site
count_global: exact aggregated global count
count_global_obfuscated: obfuscated (at the site level) aggregated global count
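
The precedence rule can be illustrated with a short function. The following is a minimal sketch, assuming the list above is ordered from the highest to the lowest authorization (patient_list being the highest); highest_explore_authorization is a hypothetical helper and not the selection code of MedCo.

```shell
# highest_explore_authorization AUTHORIZATION...
# Prints the authorization with the highest precedence among the given ones.
# Fails if none of the arguments is a known explore query authorization.
highest_explore_authorization() {
    local candidate held
    for candidate in patient_list count_per_site count_per_site_obfuscated \
                     count_per_site_shuffled count_per_site_shuffled_obfuscated \
                     count_global count_global_obfuscated; do
        for held in "$@"; do
            if [ "$held" = "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1
}
```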

HTTPS Configuration

HTTPS is supported for the profiles test-local-3nodes and test-network.

Certificate

The certificates are held in the configuration profile folder (e.g, ${MEDCO_SETUP_DIR}/deployments/test-local-3nodes/configuration):

certificate.key: private key
certificate.crt: certificate of own node
srv0-certificate.crt, srv1-certificate.crt, …: certificates of all nodes of the network

Enable HTTPS for the Local Test Deployment

To enable HTTPS for the profile test-local-3nodes, replace the files certificate.key and certificate.crt from the configuration profile folder with your own versions. Such a certificate can be obtained for example through Let’s Encrypt.

Then edit the file .env from the compose profile, replace the http with https, and restart the deployment.
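
Before restarting, it can be useful to confirm that the replaced certificate.key and certificate.crt belong together. The following is a minimal sketch using the openssl command line tool and PEM files; cert_matches_key is a hypothetical helper and not part of MedCo.

```shell
# cert_matches_key CERT_FILE KEY_FILE
# Succeeds when the public key embedded in the certificate equals the public
# key derived from the private key (both files in PEM format).
cert_matches_key() {
    local cert_pub key_pub
    cert_pub="$(openssl x509 -in "$1" -noout -pubkey)" || return 1
    key_pub="$(openssl pkey -in "$2" -pubout)" || return 1
    [ "$cert_pub" = "$key_pub" ]
}

# Usage, from the configuration profile folder:
#   cert_matches_key certificate.crt certificate.key && echo "certificate and key match"
```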

Configure HTTPS for the Network Deployment

For this profile, HTTPS is mandatory. The profile generation script generates and uses default self-signed certificates for each node. Those are perfectly fine to be used, but because they are self-signed, an HTTPS warning will be displayed to users in their browser when accessing one of the Glowing Bear instances.

There is currently only one way of avoiding this warning: configuring the browsers of your users to trust this certificate. This procedure is specific to the browsers and operating systems used at your site.

Configuring SwitchAAI Authentication

This guide walks you through the process of configuring Keycloak as a Service Provider to one or more SwitchAAI identity provider(s), in order for MedCo to rely on SwitchAAI for user authentication.

Prerequisites

A MedCo network is up and running, with one or more functional Keycloak instances within the network.
One or several identity provider(s) part of the SwitchAAI federation is/are chosen to be used as user source.
The institution at which the Keycloak of MedCo is deployed is ready to accept being registered as the home organization.
You have access to the SwitchAAI Resource Registry.

Right now the SwitchAAI WAYF (Where Are You From) mechanism is not supported (i.e. the web UI used to select which institution the user wishes to log in with). This means that you will need to register in Keycloak each identity provider you wish to support.

The process described in this guide will need to be repeated for each instance of Keycloak deployed, if there are more than one in the MedCo network.

Configure the identity provider(s) in Keycloak

The following instructions are to be executed on the administration UI of Keycloak, e.g. https://medco-demo.epfl.ch/auth/admin.

The behavior of Keycloak during the very first login of users through the identity provider is highly customisable. We propose below an example of a working flow, but this can be changed to fit your needs.

Navigate to Authentication > Flows, select First Broker Login and make a Copy of it. Name it for example SwitchAAI-Test Demo IdP First Broker Login.
Change the list of executions to make it look like the following image.

Add the identity provider

In the Identity Providers menu, choose Add provider... > SAML v2.0
Specify an Alias. Note this will not be changeable later without redoing the whole process. Example: SwitchAAI-Test.
Specify a Display Name, which will be displayed to the user in the login page. Example: SwitchAAI-Test Demo IdP.
Specify the Single Sign-On Service URL of the identity provider you are linking with. Example: https://aai-demo-idp.switch.ch/idp/profile/SAML2/POST/SSO.
Specify the First Login Flow previously configured to use. Example: SwitchAAI-Test Demo IdP First Broker Login.
Toggle to ON the following buttons:
- Enabled
- Trust Email
- HTTP-POST Binding Response
- HTTP-POST Binding for AuthnRequest
- Validate Signature
Specify the NameID Policy Format as Persistent.
Add the certificate(s) (PEM format, separated by commas if there are several of them) of the identity provider you are linking with in Validating X509 Certificates.
Save the changes.

Add the username mapper

We need to import a unique but intelligible username in Keycloak from the identity provider. For this we use the SwitchAAI mandatory attribute swissEduPersonUniqueID.

Open the Mappers tab and click Create.
Fill the field as:
- Name: SwitchAAI Unique ID.
- Mapper Type: Username Template Importer.
- Template: ${ATTRIBUTE.swissEduPersonUniqueID}
Save the changes.

Setup a certificate

A certificate compliant with the SwitchAAI federation needs to be generated and configured. First follow this SwitchAAI guide to generate a self-signed certificate that meets their requirements. You will need from the Keycloak instance:

Its FQDN (fully-qualified domain name). Example: medco-demo.epfl.ch.
Its SAML entityID, that you can find out in the XML descriptor from the Export tab of the previously configured Keycloak identity provider. Example: https://medco-demo.epfl.ch/auth/realms/master.

Once you have generated the certificate, set it up in Keycloak:

Navigate to the settings page Realm Settings > Keys > Providers and select Add Keystore... > rsa.
Specify a name in Console Display Name. Example: rsa-switchaaitest.
Specify a Priority higher than any other RSA key. Example: 150.
In Private RSA Key and X509 Certificate fields, copy/paste the respective PEM parts of both the private key and the certificate that were previously generated.

Register Keycloak instance as a Service Provider in SwitchAAI

The following instructions are to be executed in the AAI Resource Registry. As a result, a Keycloak instance will be registered as a service provider linked to a home organization in the SwitchAAI federation.

Register new resource

Click Add a Resource Description and fill the 7 categories of information according to the following instructions. Note that if some fields are not listed in this documentation, their values are not important for the registration of the Keycloak instance and can be set according to the explanations provided by the resource registry.

1. Basic Resource Information

Entity ID: the same SAML entityID you used to generate the certificate. Example: https://medco-demo.epfl.ch/auth/realms/master.
Home Organization: the organization that hosts the Keycloak instance currently being registered. The responsible persons of the organization specified here will need to approve the registration. This will typically be the institution where the MedCo node is deployed. For the purpose of our test we are using AAI Demo Home Organization (aai-demo-idp.switch.ch, AAI Test).
Home URL: the address of the MedCo node, at which the UI Glowing Bear can be accessed. Example: https://medco-demo.epfl.ch/.

2. Descriptive Information

3. Contacts

4. Service Locations

SAML2 HTTP POST binding (x2): the URL at which the SwitchAAI infrastructure will communicate with the Keycloak instance. You will find it in the configuration page of the configured identity provider in Keycloak under Redirect URI. Example: https://medco-demo.epfl.ch/auth/realms/master/broker/SwitchAAI-Test/endpoint
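
The example entityID and Redirect URI in this guide follow a common pattern. The sketch below only reproduces that pattern from the examples, assuming the master realm; saml_entity_id and broker_endpoint are hypothetical helpers, and the authoritative values remain the ones displayed by Keycloak.

```shell
# saml_entity_id FQDN [REALM]
# Prints the SAML entityID pattern used in the examples (default realm: master).
saml_entity_id() {
    printf 'https://%s/auth/realms/%s\n' "$1" "${2:-master}"
}

# broker_endpoint FQDN ALIAS [REALM]
# Prints the Redirect URI pattern of the identity provider with the given alias.
broker_endpoint() {
    printf '%s/broker/%s/endpoint\n' "$(saml_entity_id "$1" "${3:-master}")" "$2"
}
```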

5. Certificates

Copy/paste in this field the PEM part of the certificate that was previously generated. Note that in the example shown below the certificate has already been validated through a separate channel.

6. Requested Attributes

Set at least the following attributes to Required. Note that the release of attributes needs to have a justification.

E-mail (email). Example reason: Identify user for being able to assign them specific authorizations.
Unique ID (swissEduPersonUniqueID). Example reason: Get a unique ID of user.

7. Intended Audience and Interfederation

Get the new resource approved

Once submitted, the responsible persons from the home organization will need to approve the new resource and validate the fingerprint of the certificate submitted. This is a manual process that will most likely be done through email.

Once this is done, the setup should be functional, and the users will be able to select the configured identity provider to log in. Don't forget that this covers only users' authentication; their authorization needs to be handled manually through Keycloak after they have logged in at least once.

Data Loading

There are two ways of loading data into MedCo. The first, using the provided loader, encrypts the data and loads the encrypted data into the MedCo database. The second loads pre-generated data directly into the database without encrypting it.

Load pre-generated data

Pre-generated cleartext synthetic data following the SPO (Swiss Personalized Oncology) ontology is available.

Encrypt and load data with the loader

The current version of the loader offers two different loading alternatives: (v0) loading of clinical and genomic data based on MAF datasets; and (v1) loading of generic i2b2 data. Currently these two loaders support each one dataset:

v0: a genomic dataset (tcga_cbio publicly available in )
v1: the .

Future releases of this software will allow for other arbitrary data sources, given that they follow a specific structure (e.g. BAM format).

Pre-Requisite: Download test data

Execute the download script to download the test datasets.

Dummy Generation

The provided example data set files come with dummy data pre-generated. Those data are random dummy entries whose purpose is to prevent frequency attacks. In a future release, the generation will be done dynamically by the loader.

Synthetic SPO Data

This page will guide you through loading example synthetic data that follows the SPO (Swiss Personalized Oncology) ontology.

Pre-Requisite: Download test data

Execute the download script to download the test datasets.

Load the data into MedCo

A script is available to load the data in a simple way. Example of how to use it with a test-local-3nodes deployment running on your localhost, adapt it to your own use case:

v0 (Genomic Data)

The v0 loader expects an ontology, with mutation and clinical data in the MAF format. As the ontology data you must use ${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio/clinical_data.csv and ${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio/mutation_data.csv. For clinical data you can keep using the same two files or a subset of the data (e.g. 8_clinical_data.csv). More information about how to generate sample data files can be found below. After the following script is executed all the data is encrypted and deterministically tagged in compliance with the MedCo data model.

How to use

Ensure you have downloaded the data before proceeding to the loading.

The following examples show you how to load data into a running MedCo deployment. Adapt the commands to your use case accordingly.

Examples

Loading the three nodes on the dev-local-3nodes profile

Loading one node on a network-test profile

Explanation of the command's arguments

Test that the loading was successful

To check that it is working you can query for:

-> MedCo Genomic Ontology -> Gene Name -> BRPF3

For the small dataset 8_xxxx you should obtain 3 matching subjects (one at each site).

v1 (I2B2 Demodata)

The v1 loader expects an already existing i2b2 database (in .csv format) that will be converted in a way that is compliant with the MedCo data model. This involves encrypting and deterministically tagging some of the data.

List of input (‘original’) files:

all i2b2metadata files (e.g. i2b2.csv)
dummy_to_patient.csv
patient_dimension.csv
visit_dimension.csv
concept_dimension.csv
modifier_dimension.csv
observation_fact.csv
table_access.csv
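
Before running the v1 loader, it can help to confirm that the input files listed above are present. The following is a minimal sketch, not part of MedCo: it assumes (hypothetically) that all files sit in one flat directory, whereas the real locations are whatever the files.toml configuration points to. The function name missing_input_files is illustrative, and the i2b2metadata files are not checked because their names vary per dataset.

```python
from pathlib import Path

# Fixed file names copied from the list above. The i2b2metadata files
# (e.g. i2b2.csv) are left out because their names depend on the dataset.
REQUIRED_FILES = [
    "dummy_to_patient.csv",
    "patient_dimension.csv",
    "visit_dimension.csv",
    "concept_dimension.csv",
    "modifier_dimension.csv",
    "observation_fact.csv",
    "table_access.csv",
]


def missing_input_files(data_dir):
    """Return the required file names that are absent from data_dir.

    Assumption: all input files live directly inside data_dir.
    """
    directory = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]
```

An empty list means every fixed-name input file was found.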

How to use

Ensure you have downloaded the data before proceeding to the loading.

The following examples show you how to load data into a running MedCo deployment. Adapt the commands to your use case accordingly.

Examples

Loading the three nodes on the dev-local-3nodes profile

export MEDCO_SETUP_DIR=~/medco \
    MEDCO_DEPLOYMENT_PROFILE=dev-local-3nodes
cd "${MEDCO_SETUP_DIR}/deployments/${MEDCO_DEPLOYMENT_PROFILE}"
docker-compose -f docker-compose.tools.yml run medco-loader-srv0 v1 \
    --sen /data/i2b2/sensitive.txt \
    --files /data/i2b2/files.toml
docker-compose -f docker-compose.tools.yml run medco-loader-srv1 v1 \
    --sen /data/i2b2/sensitive.txt \
    --files /data/i2b2/files.toml
docker-compose -f docker-compose.tools.yml run medco-loader-srv2 v1 \
    --sen /data/i2b2/sensitive.txt \
    --files /data/i2b2/files.toml

Loading one node on a network-test profile

export MEDCO_SETUP_DIR=~/medco \
    MEDCO_DEPLOYMENT_PROFILE=test-network-xxx-node0
cd "${MEDCO_SETUP_DIR}/deployments/${MEDCO_DEPLOYMENT_PROFILE}"
docker-compose -f docker-compose.tools.yml run medco-loader v1 \
    --sen /data/i2b2/sensitive.txt \
    --files /data/i2b2/files.toml

Explanation of the command's arguments

NAME:
    medco-loader v1 - Convert existing i2b2 data model

USAGE:
    medco-loader v1 [command options] [arguments...]

OPTIONS:
    --group value, -g value               UnLynx group definition file
    --entryPointIdx value, --entry value  Index (relative to the group definition file) of the collective authority server to load the data
    --sensitive value, --sen value        File containing a list of sensitive concepts
    --dbHost value, --dbH value           Database hostname
    --dbPort value, --dbP value           Database port (default: 0)
    --dbName value, --dbN value           Database name
    --dbUser value, --dbU value           Database user
    --dbPassword value, --dbPw value      Database password
    --files value, -f value               Configuration toml with the path of the all the necessary i2b2 files
    --empty, -e                           Empty patient and visit dimension tables (y/n)

Test that the loading was successful

To check that it is working you can query for:

-> Diagnoses -> Neoplasm -> Benign neoplasm -> Benign neoplasm of breast

You should obtain 2 matching subjects.

Command-Line Interface (CLI)

MedCo provides a client command-line interface (CLI) to interact with the medco-connector APIs.

Prerequisites

To use the CLI, you must first follow one of the deployment guides. However, the version of the CLI documented here is the one shipped with the Local Development Deployment.

How to use

To show the CLI manual, run:

export MEDCO_SETUP_DIR=~/medco
cd ${MEDCO_SETUP_DIR}/deployments/dev-local-3nodes/
docker-compose -f docker-compose.tools.yml run medco-cli-client --user [USERNAME] --password [PASSWORD] --help

NAME:
   medco-cli-client - Command-line query tool for MedCo.

USAGE:
   medco-cli-client [global options] command [command options] [arguments...]

VERSION:
   dev

COMMANDS:
   concept-children, con-c     Get the concept children (both concepts and modifiers)
   modifier-children, mod-c    Get the modifier children
   concept-info, con-i         Get the concept info
   modifier-info, mod-i        Get the modifier info
   query, q                    Query the MedCo network
   ga-get-values, ga-val       Get the values of the genomic annotations of type *annotation* whose values contain *value*
   ga-get-variant, ga-var      Get the variant ID of the genomic annotation of type *annotation* and value *value*
   survival-analysis, srva     Run a survival analysis
   get-saved-cohorts, getsc    get cohorts
   add-saved-cohorts, addsc    Create a new cohort.
   update-saved-cohorts, upsc  Updates an existing cohort.
   remove-saved-cohorts, rmsc  Remove a cohort.
   help, h                     Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --user value, -u value        OIDC user login
   --password value, -p value    OIDC password login
   --token value, -t value       OIDC token
   --disableTLSCheck             Disable check of TLS certificates
   --outputFile value, -o value  Output file for the result. Printed to stdout if omitted.
   --help, -h                    show help
   --version, -v                 print the version

To get started, you can use the credentials of the default user (username: test, password: test).

concept-children

You can use this command to browse the MedCo ontology by getting the children of a concept, both concepts and modifiers.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-children --help
NAME:
   medco-cli-client concept-children - Get the concept children (both concepts and modifiers)

USAGE:
   medco-cli-client concept-children conceptPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-children /E2ETEST/e2etest/ 
PATH    TYPE
/E2ETEST/e2etest/1/    concept
/E2ETEST/e2etest/2/    concept
/E2ETEST/e2etest/3/    concept
/E2ETEST/modifiers/    modifier_folder

modifier-children

You can use this command to browse the MedCo ontology by getting the children of a modifier.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-children --help
NAME:
   medco-cli-client modifier-children - Get the modifier children

USAGE:
   medco-cli-client modifier-children modifierPath appliedPath appliedConcept

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-children /E2ETEST/modifiers/ /e2etest/% /E2ETEST/e2etest/1/
PATH    TYPE
/E2ETEST/modifiers/1/    modifier

concept-info

You can use this command to get information about a MedCo concept, including the associated metadata.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-info --help
NAME:
   medco-cli-client concept-info - Get the concept info

USAGE:
   medco-cli-client concept-info conceptPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-info /E2ETEST/e2etest/1/ 
  <ExploreSearchResultElement>
      <Code>ENC_ID:1</Code>
      <DisplayName>E2E Concept 1</DisplayName>
      <Leaf>true</Leaf>
      <MedcoEncryption>
          <Encrypted>true</Encrypted>
          <ID>1</ID>
      </MedcoEncryption>
      <Metadata>
          <ValueMetadata>
              <ChildrenEncryptIDs></ChildrenEncryptIDs>
              <CreationDateTime></CreationDateTime>
              <DataType></DataType>
              <EncryptedType></EncryptedType>
              <EnumValues></EnumValues>
              <Flagstouse></Flagstouse>
              <NodeEncryptID></NodeEncryptID>
              <Oktousevalues></Oktousevalues>
              <TestID></TestID>
              <TestName></TestName>
              <Version></Version>
          </ValueMetadata>
      </Metadata>
      <Name>E2E Concept 1</Name>
      <Path>/E2ETEST/e2etest/1/</Path>
      <Type>concept</Type>
  </ExploreSearchResultElement>

modifier-info

You can use this command to get information about a MedCo modifier, including the associated metadata.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-info --help
NAME:
   medco-cli-client modifier-info - Get the modifier info

USAGE:
   medco-cli-client modifier-info modifierPath appliedPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-info /E2ETEST/modifiers/1/ /e2etest/1/
  <ExploreSearchResultElement>
      <Code>ENC_ID:5</Code>
      <DisplayName>E2E Modifier 1</DisplayName>
      <Leaf>true</Leaf>
      <MedcoEncryption>
          <Encrypted>true</Encrypted>
          <ID>5</ID>
      </MedcoEncryption>
      <Metadata>
          <ValueMetadata>
              <ChildrenEncryptIDs></ChildrenEncryptIDs>
              <CreationDateTime></CreationDateTime>
              <DataType></DataType>
              <EncryptedType></EncryptedType>
              <EnumValues></EnumValues>
              <Flagstouse></Flagstouse>
              <NodeEncryptID></NodeEncryptID>
              <Oktousevalues></Oktousevalues>
              <TestID></TestID>
              <TestName></TestName>
              <Version></Version>
          </ValueMetadata>
      </Metadata>
      <Name>E2E Modifier 1</Name>
      <Path>/E2ETEST/modifiers/1/</Path>
      <Type>modifier</Type>
  </ExploreSearchResultElement>

query

You can use this command to query the MedCo network.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test query --help

NAME:
   medco-cli-client query - Query the MedCo network

USAGE:
   medco-cli-client query [command options] [-t timing] query_string

OPTIONS:
   --timing value, -t value  Query timing: any|samevisit|sameinstancenum (default: "any")

This is the syntax of an example query using the pre-loaded default test data.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test query enc::1 AND enc::2 OR enc::3

You will get something like this:

node_name,count,patient_list,patient_set_id,DDTRequestTime,KSRequestTime,KSTimeCommunication,KSTimeExec,TaggingTimeCommunication,TaggingTimeExec,medco-connector-DDT,medco-connector-i2b2-PDO,medco-connector-i2b2-PSM,medco-connector-local-agg,medco-connector-local-patient-list-masking,medco-connector-overall,medco-connector-unlynx-key-switch-count,medco-connector-unlynx-key-switch-patient-list
0,1,[2],10,4236,311,307,0,1657,10,4266,3972,25472,1,153,34834,469,491
1,1,[2],10,584,89,75,0,474,78,677,4717,61325,16,3,66991,140,104
2,1,[2],10,669,55,45,0,576,49,709,3134,63371,0,8,67358,68,63

Query terms can be composed using the logical operators NOT, AND and OR.

Note that, in queries, the OR operator has the highest precedence, so 1 AND NOT 2 OR 3 AND 2 is factorised as (1) AND (NOT (2 OR 3)) AND (2).

To each group of OR-ed terms you can also add a timing option ("any", "samevisit", "sameinstancenum") that overrides the globally set timing option. For example:

1 AND NOT 2 OR 3 samevisit AND 2
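
The grouping rule can be made concrete with a small sketch. This is not the CLI's actual parser: the function name group_query and the dictionary keys are illustrative assumptions, and the sketch expects operators to be written in upper case and separated by single spaces. It splits a query on AND first, so that each resulting group contains only OR-ed terms, then picks up an optional leading NOT and an optional trailing timing keyword.

```python
TIMINGS = {"any", "samevisit", "sameinstancenum"}


def group_query(query, default_timing="any"):
    """Split a query string into AND-ed groups of OR-ed terms.

    Because OR binds tighter than AND, splitting on AND first yields
    groups that contain only OR-ed terms.
    """
    panels = []
    for chunk in query.split(" AND "):
        chunk = chunk.strip()
        # A leading NOT negates the whole OR group.
        negated = chunk.startswith("NOT ")
        if negated:
            chunk = chunk[len("NOT "):]
        # A trailing timing keyword overrides the global timing.
        timing = default_timing
        head, _, last = chunk.rpartition(" ")
        if head and last in TIMINGS:
            chunk, timing = head, last
        terms = [term.strip() for term in chunk.split(" OR ")]
        panels.append({"not": negated, "terms": terms, "timing": timing})
    return panels


for panel in group_query("1 AND NOT 2 OR 3 samevisit AND 2"):
    print(panel)
```

Running it on the example above yields three groups: the term 1, the negated group of 2 and 3 with the samevisit timing, and the term 2.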

Each query term is composed of two mandatory fields, the type field and the content field, and an optional field, the constraint field, all separated by ::.

type::content[::constraint]

Possible values of the type field are: enc, clr, file.

When the type field is equal to enc, the content field contains the concept ID. The constraint field is not present in this case.
When the type field is equal to clr, the content field contains the concept field (containing the concept path) and, possibly, the modifier field, which in turn contains the modifier key and applied path fields, all separated by :. The optional constraint field can be present, containing the operator, type and value fields separated by :. The constraint field applies either to the concept or, if the modifier field is present, to the modifier. The possible types are NUMBER and TEXT. The possible operators for numbers are: EQ (equals), NE (not equal), GT (greater than), LT (less than), GE (greater than or equal), LE (less than or equal), BETWEEN (between, in this case the value field is in the format "x and y"). The possible operators for TEXT are LIKE[exact], LIKE[begin], LIKE[contains] and LIKE[end].
When the type field is equal to file, the content field contains the path of the file containing the query terms, one per row. The query terms contained in the same file are OR-ed together. Besides enc, clr, and file query terms, a file can also contain genomic query terms, each of which is composed of 4 comma-separated values.
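
The field layout described above can be sketched as a small parser. This is an illustration only, not the CLI's implementation: the function name parse_term and the dictionary keys are assumptions, the sketch follows the three-field constraint format (operator, type, value) literally, and it assumes that file terms never carry a constraint, which the text does not state.

```python
def parse_term(term):
    """Split one query term into its fields, following the description above."""
    parts = term.split("::")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected type::content[::constraint], got {term!r}")
    kind, content = parts[0], parts[1]
    if kind not in ("enc", "clr", "file"):
        raise ValueError(f"unknown term type {kind!r}")
    parsed = {"type": kind, "content": content}
    if kind != "clr":
        # Assumption: only clr terms may carry a constraint.
        if len(parts) == 3:
            raise ValueError("only clr terms accept a constraint in this sketch")
        return parsed
    # clr content: concept path, optionally followed by modifier key and applied path.
    fields = content.split(":")
    if len(fields) not in (1, 3):
        raise ValueError(f"expected concept[:modifierKey:appliedPath], got {content!r}")
    parsed["concept"] = fields[0]
    if len(fields) == 3:
        parsed["modifier"] = {"key": fields[1], "applied_path": fields[2]}
    if len(parts) == 3:
        # Constraint: operator, type and value separated by ':'.
        operator, value_type, value = parts[2].split(":", 2)
        parsed["constraint"] = {"operator": operator, "type": value_type, "value": value}
    return parsed


print(parse_term("clr::/E2ETEST/e2etest/1/::BETWEEN:NUMBER:5 and 25"))
```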

ga-get-values

You can use this command to get the values of the genomic annotations that MedCo nodes make available for queries.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-values --help

NAME:
   medco-cli-client ga-get-values - Get the values of the genomic annotations of type *annotation* whose values contain *value*

USAGE:
   medco-cli-client ga-get-values [command options] [-l limit] annotation value

OPTIONS:
   --limit value, -l value  Maximum number of returned values (default: 0)

To do some tests, you may want to load some data first.

Then, for example, if you want to know which genomic annotations of type "protein_change" containing the string "g32" are available, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-values protein_change g32

You will get:

G325R
G32E

The matching is case-insensitive and it is not possible to use wildcards. At the moment, with the loader v0, only three types of genomic annotations are available: variant_name, protein_change and hugo_gene_symbol.

ga-get-variant

You can use this command to get the variant ID of a certain genomic annotation.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-variant --help

NAME:
   medco-cli-client ga-get-variant - Get the variant ID of the genomic annotation of type *annotation* and value *value*

USAGE:
   medco-cli-client ga-get-variant [command options] [-z zygosity] [-e] annotation value

DESCRIPTION:
   zygosity can be either heterozygous, homozygous, unknown or a combination of the three separated by |
If omitted, the command will execute as if zygosity was equal to "heterozygous|homozygous|unknown|"

OPTIONS:
   --zygosity value, -z value  Variant zygosysty
   --encrypted, -e             Return encrypted variant id

To do some tests, you may want to load some data first.

Then, for example, if you want to know the variant ID of the genomic annotation "HTR5A" of type "hugo_gene_symbol" with zygosity "heterozygous" or "homozygous", you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-variant -z "heterozygous|homozygous" hugo_gene_symbol HTR5A

You will get:

-7039476204566471680
-7039476580443220992

The matching is case-insensitive and it is not possible to use wildcards. If you request the ID of an annotation that is not available (e.g., "HTR5" in the previous example), you will get an error message. At the moment, with the loader v0, only three types of genomic annotations are available: variant_name, protein_change and hugo_gene_symbol.

get-saved-cohorts

You can run this command to get the cohorts that have been previously saved.

NAME:
   medco-cli-client get-saved-cohorts - get cohorts

USAGE:
   medco-cli-client get-saved-cohorts [command options] [-l limit]

DESCRIPTION:
   Gets the list of cohorts.

OPTIONS:
   --limit value, -l value  Limits the number of retrieved cohorts. 0 means no limit. (default: 10)

You can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test getsc

You will get:

node_index,cohort_name,cohort_id,query_id,creation_date,update_date,query_timing,panels
0,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"
1,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"
2,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"

add-saved-cohorts

You can run this command to save a new cohort. The patient set IDs come from a previous explore request; more precisely, they are taken from the patient_set_id column of the explore results. The list of IDs must be comma-separated, without spaces. There must be as many IDs as there are nodes.

NAME:
   medco-cli-client add-saved-cohorts - Create a new cohort.

USAGE:
   medco-cli-client add-saved-cohorts [command options] -c cohortName -p patientSetIDs

DESCRIPTION:
   Creates a new cohort with given name. The patient set IDs correspond to explore query result IDs.

OPTIONS:
   --patientSetIDs value, -p value  List of patient set IDs, there must be one per node
   --cohortName value, -c value     Name of the new cohort

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test addsc -c testCohort2 -p 10,10,10
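
The value passed to -p can be derived from the CSV printed by the query command. The sketch below is a hedged illustration, not a MedCo tool: the function name patient_set_ids_argument is hypothetical, the sample input keeps only the first four columns of the query output shown earlier, and two assumptions are made that the documentation does not confirm, namely that the IDs must be ordered by node index and that the patient_list column contains no unquoted commas.

```python
import csv
import io


def patient_set_ids_argument(query_csv, node_count):
    """Build the comma-separated -p value from the CSV output of a query.

    Assumptions: one row per node, IDs ordered by node index, and no
    unquoted commas inside the patient_list column.
    """
    rows = list(csv.DictReader(io.StringIO(query_csv)))
    if len(rows) != node_count:
        raise ValueError(f"expected {node_count} rows, got {len(rows)}")
    rows.sort(key=lambda row: int(row["node_name"]))
    return ",".join(row["patient_set_id"] for row in rows)


# First four columns of the example query output shown earlier.
sample = """node_name,count,patient_list,patient_set_id
0,1,[2],10
1,1,[2],10
2,1,[2],10
"""
print(patient_set_ids_argument(sample, node_count=3))
```

For the sample rows, the result is the same 10,10,10 value used in the command above.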

update-saved-cohorts

You can run this command to update an existing cohort. The patient set IDs come from a previous explore request, in the same manner as for the add-saved-cohorts command.

NAME:
   medco-cli-client update-saved-cohorts - Updates an existing cohort.

USAGE:
   medco-cli-client update-saved-cohorts [command options] -c cohortName -p patientSetIDs

DESCRIPTION:
   Updates a new cohort with given name. The patient set IDs correspond to explore query result IDs.

OPTIONS:
   --patientSetIDs value, -p value  List of patient set IDs, there must be one per node
   --cohortName value, -c value     Name of the existing cohort

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test upsc -c testCohort2 -p 9,9,9

remove-saved-cohorts

You can run this command to remove an existing cohort.

NAME:
   medco-cli-client remove-saved-cohorts - Remove a cohort.

USAGE:
   medco-cli-client remove-saved-cohorts [command options] -c cohortName

DESCRIPTION:
   Removes a cohort for a given name. If the user does not have a cohort with this name in DB, an error is sent.

OPTIONS:
   --cohortName value, -c value  Name of the cohort to remove

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test rmsc -c testCohort2

This command removes the cohort from the node servers, and this action cannot be reverted.

cohorts-patient-list

You can run this command to get the list of patients belonging to a cohort. The cohort is identified by its name.

NAME:
   medco-cli-client cohorts-patient-list - Retrieve patient list belonging to the cohort

USAGE:
   medco-cli-client cohorts-patient-list [command options] -c cohortName [-d timer dump file]

DESCRIPTION:
   Retrieve the encrypted patient list for a given cohort name and locally decrypt it.

OPTIONS:
   --cohortName value, -c value  Name of the new cohort
   --dumpFile value, -d value    Output file for the timers CSV. Printed to stdout if omitted.

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test cpl -c testCohort

You will get something like:

Node idx 0
1137,1138,1139,1140
Node idx 1
1137,1138,1139,1140
Node idx 2
1137,1138,1139,1140
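
If the patient lists need further processing, output of this shape can be read into a per-node mapping. This is a minimal sketch that assumes the output looks exactly like the example above (a "Node idx N" line followed by one comma-separated line of patient numbers); the function name parse_patient_lists is illustrative and the real output format may differ.

```python
def parse_patient_lists(output):
    """Map each node index to its list of patient numbers.

    Assumption: the text alternates between 'Node idx N' lines and
    comma-separated lines of integers, as in the example above.
    """
    lists = {}
    node = None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Node idx "):
            node = int(line[len("Node idx "):])
            lists[node] = []
        elif node is not None:
            lists[node].extend(int(value) for value in line.split(","))
    return lists


example_output = """Node idx 0
1137,1138,1139,1140
Node idx 1
1137,1138,1139,1140
Node idx 2
1137,1138,1139,1140
"""
print(parse_patient_lists(example_output))
```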

survival-analysis

You can run this command to get information useful to run survival analysis. The relative time points are computed as the difference between absolute dates of start concept and end concept.

NAME:
   medco-cli-client survival-analysis - Run a survival analysis

USAGE:
   medco-cli-client survival-analysis [command options] -l limit [-g granularity] [-c cohortID] -s startConcept [-x startModifier] -e endConcept [-y endModifier]

DESCRIPTION:
   Returns the points of the survival curve

OPTIONS:
   --limit value, -l value          Max limit of survival analysis. (default: 0)
   --granularity value, -g value    Time resolution, one of [day, week, month, year] (default: "day")
   --cohortID value, -c value       Cohort identifier (default: -1)
   --startConcept value, -s value   Survival start concept
   --startModifier value, -x value  Survival start modifier (default: "@")
   --endConcept value, -e value     Survival end concept
   --endModifier value, -y value    Survival end modifier (default: "@")

The start and end concepts are each identified by the name of the access table concatenated with the full path of the concept.

The default cohort identifier -1 corresponds to test data loaded for end-to-end testing. All future cohort identifiers will be positive integers. Cohorts can be created after a successful MedCo Explore query.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test srva -l 2000 -g week -c 1 -s /SPHN/SPHNv2020.1/FophDiagnosis/ -e /SPHN/SPHNv2020.1/DeathStatus/ -y 126:1
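
The relative time points described at the start of this section, i.e. the difference between the absolute dates of the start and end concepts, can be illustrated as follows. This is a sketch under stated assumptions, not MedCo's implementation: the unit lengths (7, 30 and 365 days) and the choice to round up to the next whole unit are assumptions made for the example, and relative_time is a hypothetical name.

```python
import math
from datetime import date

# Assumed unit lengths in days; the actual conversion used by MedCo is not documented here.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def relative_time(start, end, granularity="day"):
    """Return the time between two dates, in whole units of the given granularity.

    Assumption: partial units are rounded up.
    """
    days = (end - start).days
    if days < 0:
        raise ValueError("end event precedes start event")
    return math.ceil(days / DAYS_PER_UNIT[granularity])


print(relative_time(date(2020, 1, 1), date(2020, 1, 16), "week"))
```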


Network Architecture

External Entities

Entities that need to connect to a machine running MedCo can be categorized as follows:

System administrators: Persons administrating the MedCo node. Likely to remain inside the clinical site internal network.
End-users: Researchers using MedCo to access the shared data. Likely to remain inside the clinical site internal network.
Other MedCo nodes: MedCo nodes belonging to other clinical sites of the network.

Firewall Ports Opening

The following ports should be accessible by the listed entities, which makes IP address white-listing possible:

Port 22, 5432 (TCP): System Administrators
Port 80 (TCP): End-Users (HTTP automatic redirect to HTTPS (443))
Port 443 (TCP): System Administrators, End-Users, Other MedCo Nodes
Ports 2000-2001 (TCP): Other MedCo Nodes
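
After opening the firewall, reachability can be checked from a machine that plays the role of one of these entities. The sketch below only attempts plain TCP connections using Python's standard socket module; the helper names port_is_open and check_node and the host name in the comment are hypothetical, and the port table is copied from the list above.

```python
import socket

# Ports each kind of entity should be able to reach, copied from the list above.
ALLOWED_PORTS = {
    "System Administrators": [22, 443, 5432],
    "End-Users": [80, 443],
    "Other MedCo Nodes": [443, 2000, 2001],
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host, entity):
    """Return {port: reachable} for the ports the given entity should reach."""
    return {port: port_is_open(host, port) for port in ALLOWED_PORTS[entity]}


# Example (hypothetical host name):
# print(check_node("medco-node.example.org", "Other MedCo Nodes"))
```

A False value only shows that the connection failed from the machine running the check; it does not say whether the firewall or the service itself is the cause.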

Common Problems

Changing the Docker default address pool

If, after deploying MedCo, you notice connectivity problems on your machine or, conversely, the running containers have connectivity problems, check for potential conflicts between your machine's networks and Docker's virtual networks (e.g. with ifconfig). If you find such conflicts, you can edit Docker's configuration to set the addresses it uses. Example:
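
As an illustration of such a check, the following sketch uses Python's standard ipaddress module to detect overlapping ranges and prints a candidate configuration. The network values are hypothetical placeholders, not recommendations, and the helper names are illustrative. The default-address-pools key (with base and size entries) is the Docker daemon option for this purpose; on Linux it is usually set in /etc/docker/daemon.json, and the Docker daemon must be restarted for it to take effect.

```python
import ipaddress
import json


def conflicting_networks(host_networks, docker_pools):
    """Return (host_network, docker_pool) pairs whose address ranges overlap."""
    conflicts = []
    for host in host_networks:
        for pool in docker_pools:
            if ipaddress.ip_network(host).overlaps(ipaddress.ip_network(pool)):
                conflicts.append((host, pool))
    return conflicts


def daemon_config(base, size):
    """Render a daemon.json fragment that sets Docker's default address pool."""
    return json.dumps({"default-address-pools": [{"base": base, "size": size}]}, indent=2)


# Hypothetical values: a site network that collides with a Docker bridge range.
print(conflicting_networks(["172.17.5.0/24"], ["172.17.0.0/16"]))
print(daemon_config("10.10.0.0/16", 24))
```

Here base is the address range Docker may allocate from and size is the prefix length of each network carved out of it.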

Using Docker as non-root user

If you get such an error message while trying to execute commands as a non-root user:

You will need to follow .

Corrupt deployment after interrupting the very first loading

The very first time a deployment is started, an initialization phase takes place, which can take some time (2-10 minutes depending on the machine). If the deployment is stopped during this initialization, the database is left in a corrupt state. To reset the database, delete the corresponding Docker volume:

Note that the name of the volume in this example is valid only for the dev-local-3nodes deployment. In other cases, use docker volume ls to retrieve the name of the volume containing the database, usually in the format <deploymentprofile>_medcodb.

For Developers

Local Development Deployment

Deployment profile dev-local-3nodes.

This deployment profile comes with default pre-generated keys and passwords. It is not meant to contain any real data or to be used in production. For those purposes, use the Network Deployment (network) deployment profile instead.

This deployment profile deploys 3 MedCo nodes on a single machine for development purposes. It is meant to be used only on your local machine, i.e. localhost. The tags of the docker images used are all dev, i.e. the ones built from the development version of the different source codes. They are available either through Docker Hub, or built locally from the sources of each component.

MedCo Stack Deployment (except Glowing Bear)

The first step is to clone the medco repository with the correct branch. This example clones it into the home directory of the current user, but any other location can be used.

cd ~
git clone -b dev https://github.com/ldsec/medco.git

Next step is to build the docker images defined in the medco repository:

cd ~/medco/deployments/dev-local-3nodes
make build

Note that this will build the docker images defined locally. Because those are development versions, there is no guarantee that they will work at any point in time.

The next step is to run the nodes. They run simultaneously, and the logs of the running containers keep the console occupied. No configuration changes are needed in this scenario before running the nodes. To run them:

make up

Deploy new docker image of a running service

To deploy new code in the running deployment, it is enough to stop and restart the running container(s). Example:

make stop
make up

Docker confirms that a new version of the image has been deployed by printing "Recreating container ..." to the console.

Glowing Bear Deployment

First step is to clone the glowing-bear-medco repository with the correct branch.

cd ~
git clone -b dev https://github.com/ldsec/glowing-bear-medco.git

Glowing Bear is deployed separately for development, as we use its convenient live development server:

cd ~/glowing-bear-medco/deployment
./dev-server.sh

Note that the first run will take a significant time in order to build everything.

In order to stop the containers, simply hit Ctrl+C in all the active windows.

Test the deployment

In order to test that the development deployment of MedCo is working, access Glowing Bear in your web browser at http://localhost:4200/glowing-bear/ and use the default credentials specified in Keycloak user management. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video. You can also use the CLI client to perform tests.

make load_test_data

System Architecture

Containers

medco-connector

Component orchestrating the MedCo query at the clinical site. Implements the resource-side of the PIC-SURE API. It communicates with medco-unlynx to execute the distributed cryptographic protocols. Sources on GitHub.

medco-unlynx

The software executing the distributed cryptographic protocols, based on Unlynx. Sources on GitHub.

glowing-bear-medco

Nginx web server serving Glowing Bear and the javascript crypto module. Sources on GitHub.

medco-loader

ETL tool to encrypt and load data into MedCo. Sources on GitHub.

i2b2

The i2b2 stack (all the cells). Project website.

keycloak

OpenID Connect identity provider, providing user management and their authentication to MedCo. Project website.

postgresql

The SQL database used by all other services, contains all the data. Project website.

pg-admin

A web-based administration tool for the PostgreSQL database. Project website.

nginx

Web server and (HTTPS-enabled) reverse proxy. Project website.

Description of the default test data

MedCo Explore

The default data loaded in MedCo is a small artificially generated dataset, appropriate to test a fresh deployment. The same data is replicated on all the nodes. Note that as of now it is encrypted using the test keys, i.e. the ones used for deployment profiles dev-local-3nodes and test-local-3nodes.

It contains 4 patients: 1 (real), 2 (real), 3 (real), 4 (dummy).

3 encrypted concepts: 1, 2, 3.

1 clear concept folder: /E2ETEST/e2etest/.

3 clear concepts: /E2ETEST/e2etest/1/, /E2ETEST/e2etest/2/, /E2ETEST/e2etest/3/.

1 modifier folder: /E2ETEST/modifiers/.

5 modifiers: /E2ETEST/modifiers/1/, /E2ETEST/modifiers/2/, /E2ETEST/modifiers/3/, /E2ETEST/modifiers/2text/, /E2ETEST/modifiers/3text/.

The observation fact contains the following entries:

patient 1, concept 1
patient 1, concept /E2ETEST/e2etest/1/ (val=10), modifier /E2ETEST/modifiers/1/ (val=10)
patient 1, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='bcde')
patient 1, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='ab')
patient 2, concept 1
patient 2, concept 2
patient 2, concept /E2ETEST/e2etest/1/ (val=20), modifier /E2ETEST/modifiers/ (val=20)
patient 2, concept /E2ETEST/e2etest/2/ (val=50), modifier /E2ETEST/modifiers/2/ (val=5)
patient 2, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='abc')
patient 2, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='def')
patient 3, concept 2
patient 3, concept 3
patient 3, concept /E2ETEST/e2etest/1/ (val=30), modifier /E2ETEST/modifiers/ (val=15), modifier /E2ETEST/modifiers/1/ (val=15)
patient 3, concept /E2ETEST/e2etest/2/ (val=25), modifier /E2ETEST/modifiers/ (val=30), modifier /E2ETEST/modifiers/2/ (val=15)
patient 3, concept /E2ETEST/e2etest/3/ (val=77), modifier /E2ETEST/modifiers/ (val=66), modifier /E2ETEST/modifiers/3/ (val=88)
patient 3, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='de')
patient 3, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='abcdef')
patient 4, concept 1
patient 4, concept 2
patient 4, concept 3
patient 4, concept /E2ETEST/e2etest/3/ (val=20), modifier /E2ETEST/modifiers/3/ (val=10)

Example queries and expected results (per node):

enc::1 AND enc::2: 1 (patient 2)
enc::2 AND enc::3: 1 (patient 3)
enc::1 AND enc::2 AND enc::3: 0
enc::1 OR enc::2: 3 (patients 1, 2 and 3)
enc::1 OR enc::3 AND enc::2: 2 (patients 2 and 3)
clr::/E2ETEST/e2etest/1/: 3 (patients 1, 2 and 3)
clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/:/e2etest/%: 3 (patients 1, 2 and 3)
clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/1/:/e2etest/1/: 2 (patients 1 and 3)
enc::1 AND clr::/E2ETEST/e2etest/2/: 1 (patient 2)
enc::1 OR clr::/E2ETEST/e2etest/3/: 3 (patients 1, 2 and 3)
clr::/E2ETEST/e2etest/1/::EQ:NUMBER:10: 1 (patient 1)
clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/1/:/e2etest/1/::EQ:NUMBER:15: 1 (patient 3)
clr::/E2ETEST/e2etest/1/::BETWEEN:NUMBER:5 and 25: 2 (patients 1 and 2)
enc::1 OR clr::/E2ETEST/e2etest/2/::GE:NUMBER:25 AND clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2/:/e2etest/2/::LT:NUMBER:21: 2 (patients 2 and 3)
clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2text/:/e2etest/2/::IN:TEXT:\'abc\'\,\'de\': 2 (patients 2 and 3)
clr::/E2ETEST/e2etest/3/:/E2ETEST/modifiers/3text/:/e2etest/3/::LIKE[begin]:TEXT:ab: 2 (patients 1 and 3)
clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2text/:/e2etest/2/::LIKE[contains]:TEXT:cd: 1 (patient 1)
clr::/E2ETEST/e2etest/3/:/E2ETEST/modifiers/3text/:/e2etest/3/::LIKE[end]:TEXT:bc: 0
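
The expected results for the encrypted-concept queries can be reproduced by hand from the observation facts listed above. The sketch below is a cleartext simulation for illustration only: it does not reflect how MedCo evaluates queries on encrypted data, and it simply skips the dummy patient, which is a simplification of how dummies are neutralised in the real count. Queries are given as AND-ed groups of OR-ed concept IDs, following the precedence rule that OR binds tighter than AND; the name matching_patients is hypothetical.

```python
# Encrypted concepts observed for each patient, copied from the facts above.
ENCRYPTED_FACTS = {1: {1}, 2: {1, 2}, 3: {2, 3}, 4: {1, 2, 3}}
DUMMY_PATIENTS = {4}


def matching_patients(panels):
    """Return the real patients matching a query.

    panels is a list of OR-groups (lists of concept IDs) that are AND-ed
    together, e.g. enc::1 OR enc::3 AND enc::2 becomes [[1, 3], [2]].
    """
    matches = []
    for patient, concepts in ENCRYPTED_FACTS.items():
        if patient in DUMMY_PATIENTS:
            continue  # simplification: dummy patients never count
        if all(any(concept in concepts for concept in group) for group in panels):
            matches.append(patient)
    return matches


print(matching_patients([[1], [2]]))
print(matching_patients([[1, 3], [2]]))
```

With this data, enc::1 AND enc::2 matches only patient 2, while enc::1 OR enc::3 AND enc::2 matches patients 2 and 3.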

MedCo Analysis - survival analysis

Default data for survival analysis consists of 228 fake patients. Each patient has observations relative to:

Death status (/SPHNv2020.1/DeathStatus/ and /DeathStatus-status/death in metadata)
Gender (/I2B2/Demographics/Gender/Female/ and /I2B2/Demographics/Gender/Male/ in metadata)
A diagnosis (/SPHNv2020.1/FophDiagnosis/ in metadata)

All patients have the same diagnosis. It is used as the start event for computing relative times.

Death status has two possible values: death or unknown (respectively 126:0 and 126:1 in the modifier_cd column of observation_fact). 165 patients are deceased; the remaining 63 have the unknown status. Death status is used as the end event for relative times. For patients whose death is not recorded, the unknown-status observation is the latest one available, so it also serves as the end event for right-censored patients.
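
To show why the distinction between recorded deaths and right-censored patients matters, here is a sketch of the Kaplan-Meier estimator, a common way to turn such events into a survival curve. Whether MedCo computes its curve exactly this way is not stated here, so treat this as a generic illustration. The function name kaplan_meier and the toy event list are hypothetical and are not taken from the test data.

```python
def kaplan_meier(events):
    """Return [(time, survival_probability)] at each time where a death occurs.

    events is a list of (time, died) pairs; died=False marks a right-censored
    patient, who leaves the at-risk group without counting as a death.
    """
    at_risk = len(events)
    survival = 1.0
    curve = []
    for time in sorted({t for t, _ in events}):
        deaths = sum(1 for t, died in events if t == time and died)
        leaving = sum(1 for t, _ in events if t == time)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((time, survival))
        at_risk -= leaving
    return curve


# Toy data: three deaths and one patient censored at time 2.
print(kaplan_meier([(1, True), (2, False), (3, True), (4, True)]))
```

The censored patient lowers the number of patients at risk from time 3 onwards without producing a step in the curve.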

Gender observations are useful for testing the grouping feature. There are 138 female patients and 90 male patients.

A survival analysis query requires a cohort saved by the user. The tables explore_query_results and saved_cohorts are preloaded with the patient_num of the 228 fake patients. The cohort identifier is -1, which is the default argument of the command.

Survival analysis example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test srva -l 2000 -g day -s /SPHN/SPHNv2020.1/FophDiagnosis/ -e /SPHN/SPHNv2020.1/DeathStatus/ -y 126:1

Database

Administration with PgAdmin

PgAdmin can be accessed through http(s)://<node domain name>/pgadmin with username admin and password admin (by default). To access the test database just create a server with the name MedCo, the address postgresql, username postgres and password postgres (by default). Note that PgAdmin is not included in the production deployments.

Managing Large Databases: Data Loading

Database modifications

-- structure
ALTER TABLE i2b2demodata_i2b2.observation_fact
    ALTER COLUMN instance_num TYPE bigint,
    ALTER COLUMN text_search_index TYPE bigint;
    
-- settings
ALTER SYSTEM SET maintenance_work_mem TO '32GB';
SELECT pg_reload_conf();

Data generation

All the following operations are implemented through PL/pgSQL functions present in the MedCo-i2b2 database.

Duplication

The parameter of the following functions corresponds to the number of times the existing observation_fact table should be added to itself. For example, with 3, the number of rows of the table will be multiplied by 4.
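The arithmetic behind this parameter can be sketched in Python as follows. The helper name rows_after_duplication is hypothetical and only models the row count; it does not call the database functions.

```python
def rows_after_duplication(current_rows, times):
    """Row count after adding the table to itself `times` times."""
    if times < 0:
        raise ValueError("times must be non-negative")
    # The original rows stay in place, plus `times` additional copies.
    return current_rows * (times + 1)


print(rows_after_duplication(1000, 3))  # 4 times the original row count

# Chained duplications multiply: parameters 100, 2 and 3 give 101 * 3 * 4.
factor = 1
for times in (100, 2, 3):
    factor = rows_after_duplication(factor, times)
print(factor)
```

Under the assumption that successive runs are simply chained, parameters of 100, 2 and 3 give an overall factor of 1212, which would be consistent with the x1212 dataset described in the backup section.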

Method 1 (double for loop)

SELECT i2b2demodata_i2b2.obs_fact_duplication_method_1(3);

Method 2 (temporary table)

SELECT i2b2demodata_i2b2.obs_fact_duplication_method_2(3);

Reduction

The parameter of the following function corresponds to the number of rows the resulting observation_fact table will have.

SELECT i2b2demodata_i2b2.obs_fact_reduction(2370000000);

Indexes

The following command builds only the i2b2 indexes needed by MedCo. By default, i2b2 provides more indexes, which enable features not currently supported by MedCo.

SELECT i2b2demodata_i2b2.obs_fact_indexes();

MedCo setup

PostgreSQL database files

In the docker-compose.common.yml of the running profile, add in the postgresql service the following key, pointing to where the database files are:

volumes:
  - PATH_TO_DATABASE_FILES/_data:/var/lib/postgresql/data

Timeouts

Depending on the size of the database, you might need to increase several timeouts, notably the following:

I2B2_WAIT_TIME_SECONDS
SERVER_HTTP_WRITE_TIMEOUT_SECONDS
CLIENT_QUERY_TIMEOUT_SECONDS
ALL_TIMEOUTS_SECONDS

Backed Up Data

Backup Server

SSH: icsil1-ds-49.epfl.ch ; accounts: root / lca1
SSH Key of Mickaël and Joao set up (ask one of them for access)
20TB storage in /srv/, publicly accessible read-only through rsync daemon
10Gbps link with IC Clusters through interface 10.90.48.141

Data Stored in `/srv/`

`/srv/databases`

pg_volume_nodeX_x1212_indexes: Databases of 3 nodes (X=0,1,2), with approximately 150k patients and 28.5B records in total (50k patients and 9.5B records per node). It is a x1212 duplication of the original tcga_bio dataset and also contains the built indexes. The size of each is around 3TB, which could possibly be reduced by running a PostgreSQL VACUUM FULL.
pg_volume_node0_XB_indexes: These databases are reductions of the pg_volume_node0_x1212_indexes database, with X=4.75, 3.17 and 2.37 billion records. These numbers were calculated to keep a total of 28.5B rows with 6, 9 and 12 nodes respectively.
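The per-node sizes of the reduced databases follow from dividing the 28.5B total by the number of nodes. A minimal Python sketch of that calculation (the names TOTAL_ROWS and rows_per_node are made up for this example):

```python
TOTAL_ROWS = 28_500_000_000  # total number of records across the network


def rows_per_node(total_rows, nodes):
    """Rows each node holds when the total is split evenly (integer division)."""
    return total_rows // nodes


for nodes in (6, 9, 12):
    per_node = rows_per_node(TOTAL_ROWS, nodes)
    print(f"{nodes} nodes: {per_node} rows per node ({per_node / 1e9:.3f}B)")
```

For 12 nodes the exact quotient is 2.375B, so the 2.37B figure used for the reduction appears to be truncated rather than rounded.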

`/srv/deployments`

icclusters-deployment-backup-11-07-19: Contains all the deployment profiles and keys used for the pg_volume_nodeX_x1212_indexes databases.
postgresql-deployment: Local deployment of PostgreSQL and PgAdmin, in order to explore or modify the databases.
nebulaexp-alldeployments: Contains all the deployment profiles and keys used for the pg_volume_node0_XB_indexes databases.

`/srv/logs`

duplications-nodeX: Logs of data duplication (x100, x2, x3) and index building for the pg_volume_nodeX_x1212_indexes databases.
reductions-node0: Logs of data reduction (to 2.37B records) and index building for the pg_volume_node0_x1212_indexes database.

Copying data to/from IC-Cluster machines

Enabling rsync daemon on a linux machine

Using the rsync daemon allows for easier data copy between machines. It is already enabled on the backup server (read-only), but in some cases it can be useful on other machines.

In the file /etc/default/rsync set the variable: RSYNC_ENABLE=true. Create the file /etc/rsyncd.conf with the following content, adapted to your case:

uid = root
gid = root

[disk]
path = /disk/
comment = MedCo data (read-only)
read only = yes

Then start the daemon with service rsync start.

Copy data with rsync

Example from an IC Cluster machine: rsync -a -v rsync://10.90.48.141/srv/databases/pg_volume_node0_x1212_indexes /disk/pg_volume

Live Demo

The live demo of MedCo is available at https://medco-demo.epfl.ch. The profile test-local-3nodes with a custom configuration is used.

Information about the machine and deployment

Any newly added administrator should belong to the groups sudo, docker and lds, and have their SSH key set up.
The files of the deployment (/opt/medco) have the group lds set.
System updates should be applied regularly.
After a reboot, the MedCo deployment will not persist and must be started manually.
Although certificate renewal can be automated with Let's Encrypt, this is not set up in the MedCo deployment, so the certificate must be renewed manually every 2-3 months.
The Keycloak of the deployment is set up to demonstrate the connection with SwitchAAI. This includes having some keys configured in Keycloak, so be careful not to wipe the database in order not to lose those keys; otherwise the configuration will have to be redone.

Update demo version

Connect with SSH to the machine (you should have your SSH key set up there).

Get the latest version

Ensure the configuration specific to the MedCo demo is preserved (see next section).

cd /opt/medco/deployments/test-local-3nodes
make down
git pull

Configuration update

Update the configuration according to the following examples. For the passwords, use the same as defined in the previous deployments.

/opt/medco/deployments/test-local-3nodes/.env

MEDCO_NODE_HOST=medco-demo.epfl.ch
MEDCO_NODE_HTTP_SCHEME=https
POSTGRES_PASSWORD=xxx
PGADMIN_PASSWORD=xxx
KEYCLOAK_PASSWORD=xxx
I2B2_WILDFLY_PASSWORD=xxx
I2B2_SERVICE_PASSWORD=xxx
I2B2_USER_PASSWORD=xxx

/opt/medco/deployments/test-local-3nodes/docker-compose.yml

  glowing-bear-medco:
    environment:
      - "GB_FOOTER_TEXT=Disclaimer: This demo complies with the EPFL regulations and guidelines regarding the storage and use of personal data: https://www.epfl.ch/about/overview/overview/regulations-and-guidelines/"

Start deployment

cd /opt/medco/deployments/test-local-3nodes
make up

Load synthetic demo data

Get them from the Google Drive folder and execute the script.

cd /opt/medco
./test/data/download.sh spo_synthetic

./scripts/load-spo-i2b2-data.sh test/data/spo-synthetic/node_0 localhost i2b2medcosrv0 medcoconnectorsrv0
./scripts/load-spo-i2b2-data.sh test/data/spo-synthetic/node_1 localhost i2b2medcosrv1 medcoconnectorsrv1
./scripts/load-spo-i2b2-data.sh test/data/spo-synthetic/node_2 localhost i2b2medcosrv2 medcoconnectorsrv2

Update certificate

The certificate is provided by Let's Encrypt and is valid for 3 months, so it needs to be renewed regularly. First ensure that the configuration located in /etc/letsencrypt/renewal/medco-demo.epfl.ch.conf is correct.

Then with sudo rights renew the certificate:

certbot renew
cp /etc/letsencrypt/live/medco-demo.epfl.ch/fullchain.pem \
    /opt/medco/deployments/test-local-3nodes/configuration/certificate.crt
cp /etc/letsencrypt/live/medco-demo.epfl.ch/privkey.pem \
    /opt/medco/deployments/test-local-3nodes/configuration/certificate.key

And finally restart the stack:

cd /opt/medco/deployments/test-local-3nodes
make stop
make up

Release a new version

This is a small guide to make a new MedCo release.

Before releasing

Update dependencies

Glowing Bear MedCo

Update the Angular version and ensure the Node version in the Docker image is compatible with Angular
NPM packages: npm update and review with npm outdated
- keycloak-js: has to be the same version as Keycloak (set in a Dockerfile in medco-deployment)
- typescript: has to be compatible with the angular version used

MedCo

Dockerfile go base image version: FROM golang:1.13
go.mod go version: go 1.13
Go modules (pay particular attention to onet): go get -u ./... and go mod tidy

Perform tests

Check that the whole CI/CD pipeline on GitHub passes (this tests the whole backend with the profile dev-local-3nodes)
Deploy locally test-local-3nodes to manually test Glowing Bear MedCo
Deploy locally on several machines test-network to manually test the deployment over several machines, and the generation of its configuration

To change the version of the docker images used, update the .env file in the deployment folder:

MEDCO_VERSION=<docker_tag>
GLOWING_BEAR_MEDCO_VERSION=<docker_tag>

Making a release

Manual updates

Update the version of Glowing Bear MedCo GB_VERSION in the Makefile to point to the correct Docker tag that will be released (e.g. v1.0.0)

Release on GitHub

Version numbers follow semantic versioning, and both codebases should have the same version. For both codebases:

Out of the dev branch, create a new release (and the associated tag) with the semantic version (e.g. v1.0.0)
Ensure the CI/CD pipeline correctly builds the new release

Update documentation

In GitBook

Create a new variant named after the MedCo version being released (e.g. v1.0.0). When merged, it will be pushed as a branch on GitHub.
Add a new entry on the Releases page.
Update version numbers in the guides' download scripts (e.g. docker-compose pull).
Review the documentation to ensure the guides are up-to-date, notably the deployment and loading guides.
Make the documentation variant the new main variant on GitBook.
On GitHub, set the branch corresponding to the new version as the new default branch.

After the release

Ensure on GitHub that all images have been built correctly with the proper versioning.
Update the live demo on medco-demo.epfl.ch to the new version.
Update the medco.epfl.ch website with the new version and update the roadmap.
Communicate about the new release (notably on Twitter).

For users

MedCo Live Demo Tutorial

July 2021

This tutorial covers version 2.0.1 of MedCo and uses a synthetic dataset that simulates observations for 1500 patients in three hospitals. The synthetic data are derived from the ontology of the Swiss Personalized Oncology (SPO) program, which is supported by the Swiss Personalized Health Network (SPHN) initiative.

Background

MedCo is a privacy-preserving federated-analytics platform for performing secure distributed analytics on data from several clinical sites. MedCo is a project supported by the Swiss Personalized Health Network (SPHN) [1] and Data Protection for Personalized Health (DPPH) [2] initiatives and developed at École Polytechnique Fédérale de Lausanne (EPFL) and Centre Hospitalier Universitaire Vaudois (CHUV). This project is also made possible by the joint efforts of the Hôpitaux Universitaires de Genève (HUG) and Inselspital. Some terminologies available in this demo are in French, as the demo deployment is built from metadata available at CHUV. Other versions can be installed depending on the language of the clinical site or researcher. MedCo offers cross-compatibility between languages. This tutorial does not explain the technical details of the underlying technology or the deployment process. Please refer to the MedCo website [3] for publications and technical details.

To demonstrate its wide range of possibilities, this tutorial describes a few relevant use cases of MedCo [4]. All the data in this demo are synthetic and do not belong to any company or institution. We will illustrate the Explore and Analysis functionalities:

Make a simple age-constrained cohort selection
Try a realistic oncology query and use the MedCo Analysis feature for survival curves

The MedCo Live Demo client is available at the following address: https://medco-demo.epfl.ch. The initial screen asks for credentials. For this demo, you can use the following:

Username: test / Password: test

A Simple Age Query

On the left side of the user interface is an ontology browser. The ontology browser enables you to explore the variables that might be contained in the database and to identify those that you would like to use for your query. Variables in the ontology browser are organized in a tree-like fashion and can be explored as a file system made of folders and files. Most of the time, variables and hierarchies are taken from standard medical terminologies and classifications. The purpose is to drag-and-drop criteria for cohort selection into the right-side panel called inclusion criteria.

1. In the MedCo Explore query parameters, select the option "Selected groups occur in the same instance". It allows you to manually specify dependencies between criteria. This option will be explained later.

2. On the left sidebar, expand the SPHN-SPO ontology, then expand the Birth Datetime group to reveal Birth Year-value. Drag-and-drop this Birth Year-value element to the right panel.

3. Below, an input field will appear: select greater than.

4. Next to this field, type 1980.

When you drag-and-drop an item, you can drop it in three different zones:

The Replace zone, to change the element
The Or zone, to create an Or condition with the new element
The And zone, to create an And condition with the new element

5. For the next step, on the left sidebar, expand the Drug group, then Drug_code, and finally ATC. Drag-and-drop the Nervous system drugs (N) element to the right panel in the And zone.

6. On the left sidebar, expand the SPOConcepts group, then the Somatic Variant Found group. Drag-and-drop the Gene_name element to the right panel in the And zone.

7. An input field will appear: select exactly matches.

8. Next to this field, type HRAS.

9. On the left sidebar, also in the Somatic Variant Found group, drag-and-drop the Hgvs_c element to the right panel in the And zone.

10. An input field will appear: select contains.

11. Next to this field, type 6994.

The “Selected groups occur in the same instance” option we selected earlier enables a checkbox next to each criterion. All criteria with a checked box must refer to the same observation. Conversely, criteria with an unchecked box refer to independent observations.

In this case, we require that the mutation on the gene HRAS and the mutation at position 6994 are the same object.

12. So we need to uncheck the Same instance boxes for Birth Year-value and Nervous system drugs (N), as in the screenshot below. Indeed, these are distinct and independent objects from the Somatic Variant Found observation.

When all the Inclusion criteria have been added, the right panel should look like this:

Now that we have selected the Inclusion criteria, we can add the Exclusion criteria:

1. On the left sidebar, expand the Consent group, then Status. Drag-and-drop the Refused element to the right panel in the Exclusion area. Uncheck the option Same instance.

2. On the left sidebar, also in the Consent group, expand the Type group. Drag-and-drop the Waiver element to the right panel in the Exclusion’s Or zone.

3. Click Run. After a few seconds of loading, the number of subjects will be displayed at the top.

Survival Analysis

In this part, we first build and run a query to obtain some subjects, then run an analysis on them. Finally, we look at the results of the analysis and change some visualization parameters.

1. Browse the ontology panel again to expand SPHN-SPO ontology, then expand SPOConcepts. Drag-and-drop the Oncology Drug Treatment element to the right panel in the Inclusion area.

2. Click Run at the top. After a few seconds of loading, the number of subjects will be displayed.

3. Name the Cohort as had_treatment and click on Save. The Cohort will appear on the left panel, below Saved Cohorts.

4. The Cohort is now saved; click on it to select it.

5. Click on the Analysis tab at the top of the page, then on Survival.

6. Open the Settings panel and set a Time Limit of 20 years.

7. Drag-and-drop the Oncology Drug Treatment element in the right panel under Start Event.

8. Browse the ontology panel again to expand Death Status, then expand Status. Drag-and-drop the Death element to the right panel under End Event.

9. Open the Subgroups panel and set the first subgroup name as surgery.

10. Drag-and-drop the Oncology Surgery element in the right panel in the Inclusion criteria area. Click on Save to save the subgroup.

11. Set the second subgroup name to no_surgery.

12. Drag-and-drop the Oncology Surgery element in the right panel in the Exclusion criteria area. Click on Save to save the subgroup.

13. Click on Run.

14. After a few seconds of loading, the result will be displayed. By opening the Input parameters panel, you can find more details about the query.

15. Click on the cog icon to open the panel to edit the Confidence interval of the graphical representation. In this panel, you can change various parameters to alter the graphical representation. When you have finished, close this panel.

16. On the right side, you can also choose the tabular scores to show.

17. Finally, you can download the results in PDF form by clicking on the download icon.

Conclusion

Only a few of the exploration and analysis features available in MedCo have been presented in this document; more are available, and all can be combined with no limitations. No adaptations were made to the data, except for the tabular vs. graph representation. In particular, no links were lost or tampered with. Every edge in the semantic graphs (e.g., every relation between a patient and their diagnosis or treatment) is preserved. The MedCo database uses visit (encounter) identifiers, patient pseudo-identifiers, and instance (observation) identifiers; they are not shown to the user. Therefore, using MedCo does not inherently add any usability penalty compared to the original clinical data.

References

[1] “Swiss personalized health network,” , accessed: 2021-02-26.

[2] “Data protection in personal health,” , accessed: 2021-02-26.

[3] , accessed: 2021-02-26.

[4] J. L. Raisaro, J. R. Troncoso-Pastoriza, M. Misbach, J. S. Sousa, S. Pradervand, E. Missiaglia, O. Michielin, B. Ford, and J.-P. Hubaux, “MedCo: Enabling secure and privacy-preserving exploration of distributed clinical and genomic data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 16, no. 4, pp. 1328–1341, 2018.

Command-Line Interface (CLI)

MedCo provides a client command-line interface (CLI) to interact with the medco-connector APIs.

Prerequisites

To use the CLI, you must first follow one of the deployment guides. However, the version of the CLI documented here is the one shipped with the Local Development Deployment.

How to use

To show the CLI manual, run:

export MEDCO_SETUP_DIR=~/medco
cd ${MEDCO_SETUP_DIR}/deployments/dev-local-3nodes/
docker-compose -f docker-compose.tools.yml run medco-cli-client --user [USERNAME] --password [PASSWORD] --help

NAME:
   medco-cli-client - Command-line query tool for MedCo.

USAGE:
   medco-cli-client [global options] command [command options] [arguments...]

VERSION:
   dev

COMMANDS:
   concept-children, con-c     Get the concept children (both concepts and modifiers)
   modifier-children, mod-c    Get the modifier children
   concept-info, con-i         Get the concept info
   modifier-info, mod-i        Get the modifier info
   query, q                    Query the MedCo network
   ga-get-values, ga-val       Get the values of the genomic annotations of type *annotation* whose values contain *value*
   ga-get-variant, ga-var      Get the variant ID of the genomic annotation of type *annotation* and value *value*
   survival-analysis, srva     Run a survival analysis
   get-saved-cohorts, getsc    get cohorts
   add-saved-cohorts, addsc    Create a new cohort.
   update-saved-cohorts, upsc  Updates an existing cohort.
   remove-saved-cohorts, rmsc  Remove a cohort.
   help, h                     Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --user value, -u value        OIDC user login
   --password value, -p value    OIDC password login
   --token value, -t value       OIDC token
   --disableTLSCheck             Disable check of TLS certificates
   --outputFile value, -o value  Output file for the result. Printed to stdout if omitted.
   --help, -h                    show help
   --version, -v                 print the version

To get started, you can use the credentials of the default user (username: test, password: test).

concept-children

You can use this command to browse the MedCo ontology by getting the children of a concept, both concepts and modifiers.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-children --help
NAME:
   medco-cli-client concept-children - Get the concept children (both concepts and modifiers)

USAGE:
   medco-cli-client concept-children conceptPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-children /E2ETEST/e2etest/ 
PATH    TYPE
/E2ETEST/e2etest/1/    concept
/E2ETEST/e2etest/2/    concept
/E2ETEST/e2etest/3/    concept
/E2ETEST/modifiers/    modifier_folder

modifier-children

You can use this command to browse the MedCo ontology by getting the children of a modifier.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-children --help
NAME:
   medco-cli-client modifier-children - Get the modifier children

USAGE:
   medco-cli-client modifier-children modifierPath appliedPath appliedConcept

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-children /E2ETEST/modifiers/ /e2etest/% /E2ETEST/e2etest/1/
PATH    TYPE
/E2ETEST/modifiers/1/    modifier

concept-info

You can use this command to get information about a MedCo concept, including the associated metadata.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-info --help
NAME:
   medco-cli-client concept-info - Get the concept info

USAGE:
   medco-cli-client concept-info conceptPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test concept-info /E2ETEST/e2etest/1/ 
  <ExploreSearchResultElement>
      <Code>ENC_ID:1</Code>
      <DisplayName>E2E Concept 1</DisplayName>
      <Leaf>true</Leaf>
      <MedcoEncryption>
          <Encrypted>true</Encrypted>
          <ID>1</ID>
      </MedcoEncryption>
      <Metadata>
          <ValueMetadata>
              <ChildrenEncryptIDs></ChildrenEncryptIDs>
              <CreationDateTime></CreationDateTime>
              <DataType></DataType>
              <EncryptedType></EncryptedType>
              <EnumValues></EnumValues>
              <Flagstouse></Flagstouse>
              <NodeEncryptID></NodeEncryptID>
              <Oktousevalues></Oktousevalues>
              <TestID></TestID>
              <TestName></TestName>
              <Version></Version>
          </ValueMetadata>
      </Metadata>
      <Name>E2E Concept 1</Name>
      <Path>/E2ETEST/e2etest/1/</Path>
      <Type>concept</Type>
  </ExploreSearchResultElement>

modifier-info

You can use this command to get information about a MedCo modifier, including the associated metadata.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-info --help
NAME:
   medco-cli-client modifier-info - Get the modifier info

USAGE:
   medco-cli-client modifier-info modifierPath appliedPath

For example:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test modifier-info /E2ETEST/modifiers/1/ /e2etest/1/
  <ExploreSearchResultElement>
      <Code>ENC_ID:5</Code>
      <DisplayName>E2E Modifier 1</DisplayName>
      <Leaf>true</Leaf>
      <MedcoEncryption>
          <Encrypted>true</Encrypted>
          <ID>5</ID>
      </MedcoEncryption>
      <Metadata>
          <ValueMetadata>
              <ChildrenEncryptIDs></ChildrenEncryptIDs>
              <CreationDateTime></CreationDateTime>
              <DataType></DataType>
              <EncryptedType></EncryptedType>
              <EnumValues></EnumValues>
              <Flagstouse></Flagstouse>
              <NodeEncryptID></NodeEncryptID>
              <Oktousevalues></Oktousevalues>
              <TestID></TestID>
              <TestName></TestName>
              <Version></Version>
          </ValueMetadata>
      </Metadata>
      <Name>E2E Modifier 1</Name>
      <Path>/E2ETEST/modifiers/1/</Path>
      <Type>modifier</Type>
  </ExploreSearchResultElement>

query

You can use this command to query the MedCo network.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test query --help

NAME:
   medco-cli-client query - Query the MedCo network

USAGE:
   medco-cli-client query [command options] [-t timing] query_string

OPTIONS:
   --timing value, -t value  Query timing: any|samevisit|sameinstancenum (default: "any")

This is the syntax of an example query using the pre-loaded default test data.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test query enc::1 AND enc::2 OR enc::3

You will get something like this:

node_name,count,patient_list,patient_set_id,DDTRequestTime,KSRequestTime,KSTimeCommunication,KSTimeExec,TaggingTimeCommunication,TaggingTimeExec,medco-connector-DDT,medco-connector-i2b2-PDO,medco-connector-i2b2-PSM,medco-connector-local-agg,medco-connector-local-patient-list-masking,medco-connector-overall,medco-connector-unlynx-key-switch-count,medco-connector-unlynx-key-switch-patient-list
0,1,[2],10,4236,311,307,0,1657,10,4266,3972,25472,1,153,34834,469,491
1,1,[2],10,584,89,75,0,474,78,677,4717,61325,16,3,66991,140,104
2,1,[2],10,669,55,45,0,576,49,709,3134,63371,0,8,67358,68,63

Query terms can be composed using the logical operators NOT, AND and OR.

Note that, in the queries, the OR operator has the highest priority, so 1 AND NOT 2 OR 3 AND 2 is factorised as (1) AND (NOT (2 OR 3)) AND (2).

To each group of OR-ed terms you can also add a timing option ("any", "samevisit", "sameinstancenum") that will override the globally set timing option. For example:

1 AND NOT 2 OR 3 samevisit AND 2
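The grouping rules described above can be sketched in Python as follows. This is an illustrative model and not the CLI's actual parser: the function name parse_query and the dictionary keys "not", "terms" and "timing" are assumptions, and the sketch expects operators to be written in upper case and surrounded by single spaces.

```python
TIMINGS = {"any", "samevisit", "sameinstancenum"}


def parse_query(query, default_timing="any"):
    """Split a query string into AND-ed groups of OR-ed terms."""
    groups = []
    # OR binds tighter than AND, so splitting on AND yields the OR groups.
    for chunk in query.split(" AND "):
        chunk = chunk.strip()
        # A leading NOT negates the whole group of OR-ed terms.
        negated = chunk.startswith("NOT ")
        if negated:
            chunk = chunk[len("NOT "):]
        # An optional trailing timing keyword overrides the global timing.
        timing = default_timing
        head, _, last = chunk.rpartition(" ")
        if last in TIMINGS:
            chunk, timing = head, last
        terms = [term.strip() for term in chunk.split(" OR ")]
        groups.append({"not": negated, "terms": terms, "timing": timing})
    return groups


for group in parse_query("1 AND NOT 2 OR 3 samevisit AND 2"):
    print(group)
```

With the example above, the second group is the negated pair of terms 2 and 3 with the samevisit timing, while the two other groups keep the global timing.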

Each query term is composed of two mandatory fields, the type field and the content field, and an optional field, the constraint field, all separated by ::.

type::content[::constraint]

Possible values of the type field are: enc, clr, file.

When the type field is equal to enc, the content field contains the concept ID. The constraint field is not present in this case.
When the type field is equal to clr, the content field contains the concept field (containing the concept path) and, possibly, the modifier field, which in turn contains the modifier key and applied path fields, all separated by :. The optional constraint field can be present, containing the operator, type and value fields separated by :. The constraint field applies either to the concept or, if the modifier field is present, to the modifier. The possible types are NUMBER and TEXT. The possible operators for numbers are: EQ (equals), NE (not equal), GT (greater than), LT (less than), GE (greater than or equal), LE (less than or equal), BETWEEN (between, in this case the value field is in the format "x and y"). The possible operators for TEXT are LIKE[exact], LIKE[begin], LIKE[contains] and LIKE[end].
When the type field is equal to file, the content field contains the path of the file containing the query terms, one for each row. The query terms contained in the same file are OR-ed together. Besides enc, clr, and file query terms, a file can also contain genomic query terms, each of which is composed of 4 comma-separated values.
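As a rough illustration of the term syntax described above, the following Python sketch splits a single query term into its fields. It is not the CLI's parser: the function name parse_term and the dictionary keys are hypothetical. Since the test queries include constraints both with and without the type field (for example EQ:10 and EQ:NUMBER:15), the sketch accepts both forms and sets the type to None when it is absent, which is an assumption of this example.

```python
def parse_term(term):
    """Split a 'type::content[::constraint]' query term into its fields."""
    parts = term.split("::")
    if len(parts) not in (2, 3):
        raise ValueError(f"malformed query term: {term!r}")
    kind, content = parts[0], parts[1]
    result = {"type": kind}

    if kind in ("enc", "file"):
        if len(parts) == 3:
            raise ValueError(f"{kind} terms take no constraint")
        # enc: the content is the concept ID; file: the path of the file.
        result["concept_id" if kind == "enc" else "path"] = content
    elif kind == "clr":
        # Content: concept path, optionally followed by the modifier key
        # and the applied path, all separated by ':'.
        fields = content.split(":")
        if len(fields) not in (1, 3):
            raise ValueError(f"malformed clr content: {content!r}")
        result["concept_path"] = fields[0]
        if len(fields) == 3:
            result["modifier_key"], result["applied_path"] = fields[1:]
        if len(parts) == 3:
            # Constraint: operator[:type]:value
            pieces = parts[2].split(":", 2)
            if len(pieces) == 3:
                operator, value_type, value = pieces
            elif len(pieces) == 2:
                (operator, value), value_type = pieces, None
            else:
                raise ValueError(f"malformed constraint: {parts[2]!r}")
            result["constraint"] = {
                "operator": operator, "type": value_type, "value": value}
    else:
        raise ValueError(f"unknown term type: {kind!r}")
    return result


print(parse_term("enc::1"))
print(parse_term("clr::/E2ETEST/e2etest/1/::BETWEEN:NUMBER:5 and 25"))
```

A value that itself contains a colon and comes without a type field would be misread by this sketch, so a real parser would need a stricter grammar.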

ga-get-values

You can use this command to get the values of the genomic annotations that MedCo nodes make available for queries.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-values --help

NAME:
   medco-cli-client ga-get-values - Get the values of the genomic annotations of type *annotation* whose values contain *value*

USAGE:
   medco-cli-client ga-get-values [command options] [-l limit] annotation value

OPTIONS:
   --limit value, -l value  Maximum number of returned values (default: 0)

To do some tests, you may want to load some data first.

Then, for example, if you want to know which genomic annotations of type "protein_change" containing the string "g32" are available, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-values protein_change g32

You will get:

G325R
G32E

ga-get-variant

You can use this command to get the variant ID of a certain genomic annotation.

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-variant --help

NAME:
   medco-cli-client ga-get-variant - Get the variant ID of the genomic annotation of type *annotation* and value *value*

USAGE:
   medco-cli-client ga-get-variant [command options] [-z zygosity] [-e] annotation value

DESCRIPTION:
   zygosity can be either heterozygous, homozygous, unknown or a combination of the three separated by |
If omitted, the command will execute as if zygosity was equal to "heterozygous|homozygous|unknown|"

OPTIONS:
   --zygosity value, -z value  Variant zygosysty
   --encrypted, -e             Return encrypted variant id

To do some tests, you may want to load some data first.

Then, for example, if you want to know the variant ID of the genomic annotation "HTR5A" of type "hugo_gene_symbol" with zygosity "heterozygous" or "homozygous", you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test ga-get-variant -z "heterozygous|homozygous" hugo_gene_symbol HTR5A

You will get:

-7039476204566471680
-7039476580443220992

get-saved-cohorts

You can run this command to get the cohorts that have been previously saved.

NAME:
   medco-cli-client get-saved-cohorts - get cohorts

USAGE:
   medco-cli-client get-saved-cohorts [command options] [-l limit]

DESCRIPTION:
   Gets the list of cohorts.

OPTIONS:
   --limit value, -l value  Limits the number of retrieved cohorts. 0 means no limit. (default: 10)

You can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test getsc

You will get:

node_index,cohort_name,cohort_id,query_id,creation_date,update_date,query_timing,panels
0,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"
1,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"
2,testCohort,-1,-1,2020-08-25T13:57:00Z,2020-08-25T13:57:00Z,any,"{panels:[{items:[{encrypted:false,queryTerm:/E2ETEST/SPHNv2020.1/DeathStatus/}],not:false,panelTiming:any}]}"

add-saved-cohorts

NAME:
   medco-cli-client add-saved-cohorts - Create a new cohort.

USAGE:
   medco-cli-client add-saved-cohorts [command options] -c cohortName -p patientSetIDs

DESCRIPTION:
   Creates a new cohort with given name. The patient set IDs correspond to explore query result IDs.

OPTIONS:
   --patientSetIDs value, -p value  List of patient set IDs, there must be one per node
   --cohortName value, -c value     Name of the new cohort

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test addsc -c testCohort2 -p 10,10,10
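
Because `-p` expects exactly one patient set ID per node, it can help to check the list before calling the client. The sketch below assumes a three-node network, as in the examples on this page; `check_patient_set_ids` is a hypothetical helper and only performs a loose syntactic check (digits, commas and minus signs). It prints the command instead of running it.

```shell
# Hypothetical helper: succeed only if $1 is a comma-separated list
# containing exactly $2 entries (one per node).
check_patient_set_ids() {
    ids=$1
    node_count=$2
    # Reject empty input, unexpected characters, and empty entries.
    case $ids in
        ''|*[!0-9,-]*|,*|*,|*,,*) return 1 ;;
    esac
    # Count the entries by letting awk split the list on commas.
    id_count=$(printf '%s\n' "$ids" | awk -F',' '{ print NF }')
    [ "$id_count" -eq "$node_count" ]
}

ids="10,10,10"
if check_patient_set_ids "$ids" 3; then
    # Print the command that would be run.
    echo "docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test addsc -c testCohort2 -p $ids"
else
    echo "expected one patient set ID per node" >&2
fi
```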

update-saved-cohorts

You can run this command to update an existing cohort. The patient set IDs come from a previous explore request, in the same manner as for the add-saved-cohorts command.

NAME:
   medco-cli-client update-saved-cohorts - Updates an existing cohort.

USAGE:
   medco-cli-client update-saved-cohorts [command options] -c cohortName -p patientSetIDs

DESCRIPTION:
   Updates an existing cohort with the given name. The patient set IDs correspond to explore query result IDs.

OPTIONS:
   --patientSetIDs value, -p value  List of patient set IDs, there must be one per node
   --cohortName value, -c value     Name of the existing cohort

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test upsc -c testCohort2 -p 9,9,9

remove-saved-cohorts

You can run this command to remove an existing cohort.

NAME:
   medco-cli-client remove-saved-cohorts - Remove a cohort.

USAGE:
   medco-cli-client remove-saved-cohorts [command options] -c cohortName

DESCRIPTION:
   Removes a cohort for a given name. If the user does not have a cohort with this name in DB, an error is sent.

OPTIONS:
   --cohortName value, -c value  Name of the cohort to remove

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test rmsc -c testCohort2

This command removes the cohort from the node servers, and it is not possible to revert this action.

cohorts-patient-list

You can run this command to get the list of patients belonging to a cohort. The cohort is identified by its name.

NAME:
   medco-cli-client cohorts-patient-list - Retrieve patient list belonging to the cohort

USAGE:
   medco-cli-client cohorts-patient-list [command options] -c cohortName [-d timer dump file]

DESCRIPTION:
   Retrieve the encrypted patient list for a given cohort name and locally decrypt it.

OPTIONS:
   --cohortName value, -c value  Name of the cohort
   --dumpFile value, -d value    Output file for the timers CSV. Printed to stdout if omitted.

For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test cpl -c testCohort

You will get something like:

Node idx 0
1137,1138,1139,1140
Node idx 1
1137,1138,1139,1140
Node idx 2
1137,1138,1139,1140
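
The per-node lists above can be summarized with a short script. This is a minimal sketch, assuming the output alternates a "Node idx N" line with a comma-separated list of patient IDs as in the example; `count_patients` is a hypothetical helper, not part of medco-cli-client.

```shell
# Sample output copied from the example above.
patient_list='Node idx 0
1137,1138,1139,1140
Node idx 1
1137,1138,1139,1140
Node idx 2
1137,1138,1139,1140'

# Hypothetical helper: print the number of patient IDs listed for each node.
count_patients() {
    awk -F',' '
        /^Node idx / { node = $0; next }  # remember which node the next line belongs to
        NF > 0 && node != "" { print node ": " NF " patients"; node = "" }
    '
}

printf '%s\n' "$patient_list" | count_patients
```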

survival-analysis

You can run this command to retrieve the data points needed for a survival analysis. The relative time points are computed as the difference between the absolute dates of the start concept and the end concept.

NAME:
   medco-cli-client survival-analysis - Run a survival analysis

USAGE:
   medco-cli-client survival-analysis [command options] -l limit [-g granularity] [-c cohortID] -s startConcept [-x startModifier] -e endConcept [-y endModifier]

DESCRIPTION:
   Returns the points of the survival curve

OPTIONS:
   --limit value, -l value          Max limit of survival analysis. (default: 0)
   --granularity value, -g value    Time resolution, one of [day, week, month, year] (default: "day")
   --cohortID value, -c value       Cohort identifier (default: -1)
   --startConcept value, -s value   Survival start concept
   --startModifier value, -x value  Survival start modifier (default: "@")
   --endConcept value, -e value     Survival end concept
   --endModifier value, -y value    Survival end modifier (default: "@")

The start and end concepts are each identified by the name of the access table concatenated with the full path of the concept. For example, you can run:

docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test srva -l 2000 -g week -c 1 -s /SPHN/SPHNv2020.1/FophDiagnosis/ -e /SPHN/SPHNv2020.1/DeathStatus/ -y 126:1
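
The way the concept arguments are assembled can be sketched as follows. This assumes, based only on the example command, that the access table name is SPHN and that the concept full paths are /SPHNv2020.1/FophDiagnosis/ and /SPHNv2020.1/DeathStatus/; `concept_path` is a hypothetical helper, not part of medco-cli-client.

```shell
# Hypothetical helper: join an access table name and a concept full path
# into the single string passed to -s or -e.
concept_path() {
    table=$1
    full_path=$2
    printf '/%s%s\n' "$table" "$full_path"
}

start_concept=$(concept_path SPHN /SPHNv2020.1/FophDiagnosis/)
end_concept=$(concept_path SPHN /SPHNv2020.1/DeathStatus/)

# Print the resulting arguments for the survival-analysis command.
echo "-s $start_concept -e $end_concept"
```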

For the genomic annotation commands (such as ga-get-variant), matching is case-insensitive and wildcards are not supported. If you request the ID of an annotation that is not available (e.g., "HTR5" instead of "HTR5A" in the ga-get-variant example), you will get an error message. At the moment only three types of genomic annotations are available: variant_name, protein_change and hugo_gene_symbol.
