Data Loading

There are two ways of loading data into MedCo. The first uses the provided loader, which encrypts the data and loads the encrypted data into the MedCo database. The second loads pre-generated data directly into the database without encrypting it.

Load pre-generated data

Pre-generated cleartext synthetic data following the SPO (Swiss Personalized Oncology) ontology is available.

Synthetic SPO Data

This page will guide you through loading example synthetic data that follows the SPO (Swiss Personalized Oncology) ontology.

Pre-Requisite: Download test data

Execute the download script to download the test datasets.

cd ${MEDCO_SETUP_DIR}/test/data
bash download.sh spo_synthetic
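Before running the download, it can help to confirm that MEDCO_SETUP_DIR is set and that the download script is where the commands above expect it. The following is a minimal sketch; medco_test_data_dir is a hypothetical helper name, not part of MedCo.

```shell
#!/bin/sh
# Sketch: resolve the test data directory used by the download step.
# medco_test_data_dir is a hypothetical helper, not part of MedCo.
medco_test_data_dir() (
    dir="${MEDCO_SETUP_DIR:-}"
    if [ -z "$dir" ]; then
        echo "MEDCO_SETUP_DIR is not set" >&2
        exit 1
    fi
    if [ ! -f "${dir}/test/data/download.sh" ]; then
        echo "download.sh not found under ${dir}/test/data" >&2
        exit 1
    fi
    # Print the directory so the caller can cd into it.
    echo "${dir}/test/data"
)

# Possible usage (left commented out, since it needs a MedCo checkout):
# cd "$(medco_test_data_dir)" && bash download.sh spo_synthetic
```

The helper prints the directory only when both checks pass, so a failed check stops the chained download command instead of running it from the wrong location.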

Load the data into MedCo

A script is available to load the data in a simple way. The example shows how to use it with a test-local-3nodes deployment running on your localhost; adapt it to your own use case:

v0 (Genomic Data)

The v0 loader expects an ontology together with mutation and clinical data in the MAF format. For the ontology data, you must use ${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio/clinical_data.csv and ${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio/mutation_data.csv. For the clinical data, you can keep using the same two files or a subset of the data (e.g. 8_clinical_data.csv). More information about how to generate sample data files can be found below. After the loader script is executed, all the data is encrypted and deterministically tagged in compliance with the MedCo data model.
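Since the v0 loader depends on the clinical and mutation files being present, a quick pre-flight check can catch a missing download early. This is a minimal sketch; check_v0_inputs is a hypothetical helper, and the prefix argument is an assumption based on the 8_ naming of the small subset.

```shell
#!/bin/sh
# Sketch: verify that the v0 input files exist before running the loader.
# check_v0_inputs is a hypothetical helper, not part of MedCo.
check_v0_inputs() (
    data_dir="$1"   # e.g. "${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio"
    prefix="$2"     # "" for the full dataset, "8_" for the small subset
    status=0
    for name in clinical_data.csv mutation_data.csv; do
        file="${data_dir}/${prefix}${name}"
        if [ -f "$file" ]; then
            echo "found: $file"
        else
            echo "missing: $file" >&2
            status=1
        fi
    done
    # Non-zero exit status if at least one file is missing.
    exit "$status"
)

# Possible usage (left commented out, since it needs the downloaded data):
# check_v0_inputs "${MEDCO_SETUP_DIR}/test/data/genomic/tcga_cbio" "8_"
```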

How to use

Ensure you have downloaded the test data before proceeding with the loading.

The following examples show you how to load data into a running MedCo deployment. Adapt the commands to your use case.

Examples

Loading the three nodes on the dev-local-3nodes profile

Loading one node on a network-test profile

Explanation of the command's arguments

Test that the loading was successful

To check that it is working you can query for:

-> MedCo Genomic Ontology -> Gene Name -> BRPF3

For the small dataset 8_xxxx you should obtain 3 matching subjects (one at each site).

v1 (I2B2 Demodata)

The v1 loader expects an already existing i2b2 database (in .csv format) that will be converted in a way that is compliant with the MedCo data model. This involves encrypting and deterministically tagging some of the data.

List of input (‘original’) files:

all i2b2metadata files (e.g. i2b2.csv)
dummy_to_patient.csv
patient_dimension.csv
visit_dimension.csv
concept_dimension.csv
modifier_dimension.csv
observation_fact.csv
table_access.csv
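The list above can be turned into a simple presence check to run before the v1 loader. This is a minimal sketch; check_v1_inputs is a hypothetical helper and the input directory is an assumption. The i2b2metadata files are not checked because their names vary.

```shell
#!/bin/sh
# Sketch: check that the fixed-name v1 input files are present in a directory.
# check_v1_inputs is a hypothetical helper, not part of MedCo.
check_v1_inputs() (
    input_dir="$1"
    missing=0
    for name in dummy_to_patient patient_dimension visit_dimension \
                concept_dimension modifier_dimension observation_fact \
                table_access; do
        if [ ! -f "${input_dir}/${name}.csv" ]; then
            echo "missing: ${input_dir}/${name}.csv" >&2
            missing=$((missing + 1))
        fi
    done
    echo "${missing} file(s) missing"
    # Succeed only when every listed file was found.
    [ "$missing" -eq 0 ]
)

# Possible usage (left commented out; the path is a placeholder):
# check_v1_inputs /path/to/original/csv/files
```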

How to use

Ensure you have downloaded the test data before proceeding with the loading.

The following examples show you how to load data into a running MedCo deployment. Adapt the commands to your use case.

Examples

Loading the three nodes on the dev-local-3nodes profile

Loading one node on a network-test profile

Explanation of the command's arguments

Test that the loading was successful

To check that it is working you can query for:

-> Diagnoses -> Neoplasm -> Benign neoplasm -> Benign neoplasm of breast

You should obtain 2 matching subjects.

v0 (Genomic Data)

Loading the three nodes on the dev-local-3nodes profile

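# Point to the MedCo setup directory and select the deployment profile.
export MEDCO_SETUP_DIR=~/medco \
    MEDCO_DEPLOYMENT_PROFILE=dev-local-3nodes
cd "${MEDCO_SETUP_DIR}/deployments/${MEDCO_DEPLOYMENT_PROFILE}"
# Run the v0 loader once per node (srv0, srv1, srv2) with the small 8_* test dataset.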
docker-compose -f docker-compose.tools.yml run medco-loader-srv0 v0 \
    --ont_clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --sen /data/genomic/sensitive.txt \
    --ont_genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --output /data/
docker-compose -f docker-compose.tools.yml run medco-loader-srv1 v0 \
    --ont_clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --sen /data/genomic/sensitive.txt \
    --ont_genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --output /data/
docker-compose -f docker-compose.tools.yml run medco-loader-srv2 v0 \
    --ont_clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --sen /data/genomic/sensitive.txt \
    --ont_genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --clinical /data/genomic/tcga_cbio/8_clinical_data.csv \
    --genomic /data/genomic/tcga_cbio/8_mutation_data.csv \
    --output /data/
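The three commands above differ only in the loader service name, so they can be generated rather than copied. The following is a minimal sketch that prints the commands as a dry run; print_v0_load_cmd is a hypothetical helper, and the medco-loader-srv<N> naming is assumed from the example above.

```shell
#!/bin/sh
# Sketch: print the v0 loader command for each node instead of repeating it.
# print_v0_load_cmd is a hypothetical helper, not part of MedCo.
print_v0_load_cmd() {
    node="$1"
    printf '%s' "docker-compose -f docker-compose.tools.yml run medco-loader-srv${node} v0"
    # printf repeats the ' %s' format once per remaining argument.
    printf ' %s' \
        "--ont_clinical /data/genomic/tcga_cbio/8_clinical_data.csv" \
        "--sen /data/genomic/sensitive.txt" \
        "--ont_genomic /data/genomic/tcga_cbio/8_mutation_data.csv" \
        "--clinical /data/genomic/tcga_cbio/8_clinical_data.csv" \
        "--genomic /data/genomic/tcga_cbio/8_mutation_data.csv" \
        "--output /data/"
    printf '\n'
}

# Dry run: print one command per node of a three-node deployment.
for i in 0 1 2; do
    print_v0_load_cmd "$i"
done
```

Printing first makes it easy to review the commands. To execute one, pipe it to a shell from the deployment directory, for example print_v0_load_cmd 0 | sh.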
