v0 (Genomic Data)
The v0 loader expects an ontology, along with mutation and clinical data in the MAF format. For the ontology data you must use ~/medco-deployment/resources/data/genomic/tcga_cbio/clinical_data.csv and ~/medco-deployment/resources/data/genomic/tcga_cbio/mutation_data.csv. For clinical data you can keep using the same two files or a subset of the data (e.g. 8_clinical_data.csv). More information about how to generate sample data files can be found below. After the following script is executed, all the data is encrypted and ‘deterministically tagged’ in compliance with the MedCo data model.
Example
The following example loads data into a running MedCo development deployment (dev-local-3nodes), on node 0. To load the two other nodes of this profile, adapt the docker-compose service being run accordingly.
Explanation of the arguments:
Data Manipulation
Inside ~/medco-loader/data/scripts/ you can find a small Python application to extract (or replicate) data out of the original tcga_cbio dataset. You can decide which patients you want to include in your ‘new’ dataset, or simply pick a random sample.
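The idea of picking a random subset of patients can be sketched as follows. This is only an illustrative sketch, not the application shipped in ~/medco-loader/data/scripts/. The function name `sample_patients` and the column names `SAMPLE_ID` and `Tumor_Sample_Barcode` are assumptions and may not match the actual files.

```python
import random


def sample_patients(clinical_rows, mutation_rows, n, seed=None,
                    clinical_id="SAMPLE_ID",
                    mutation_id="Tumor_Sample_Barcode"):
    """Pick up to n patients at random and keep only their rows.

    clinical_rows and mutation_rows are lists of dicts (for example as
    produced by csv.DictReader). The ID column names are hypothetical.
    """
    # Collect the distinct patient identifiers, sorted so that a given
    # seed always yields the same selection.
    ids = sorted({row[clinical_id] for row in clinical_rows})
    rng = random.Random(seed)
    chosen = set(rng.sample(ids, min(n, len(ids))))

    # Keep the clinical and mutation rows of the chosen patients only.
    clinical = [r for r in clinical_rows if r[clinical_id] in chosen]
    mutations = [r for r in mutation_rows if r[mutation_id] in chosen]
    return clinical, mutations


if __name__ == "__main__":
    # Tiny in-memory example with made-up identifiers.
    clinical = [{"SAMPLE_ID": f"P{i}"} for i in range(4)]
    mutations = [
        {"Tumor_Sample_Barcode": "P0", "Hugo_Symbol": "BRPF3"},
        {"Tumor_Sample_Barcode": "P3", "Hugo_Symbol": "BRPF3"},
    ]
    subset_clinical, subset_mutations = sample_patients(
        clinical, mutations, n=2, seed=1)
    print(len(subset_clinical), len(subset_mutations))
```

Filtering both files by the same set of identifiers keeps the clinical and mutation data consistent with each other, which matters because the loader consumes them together.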
To check that it is working you can query for:
-> MedCo Genomic Ontology -> Gene Name -> BRPF3
For the small dataset 8_xxxx you should obtain 3 matching subjects (one at each site).