# Description of the default test data

## MedCo Explore

The default data loaded in MedCo is a small artificially generated dataset, appropriate to test a fresh deployment. The same data is replicated on all the nodes. Note that as of now it is encrypted using the test keys, i.e. the ones used for deployment profiles `dev-local-3nodes` and `test-local-3nodes`.

It contains 4 patients: 1 (real), 2 (real), 3 (real), 4 (dummy).

3 encrypted concepts: 1, 2, 3.

1 clear concept folder: /E2ETEST/e2etest/.

4 clear concepts:  /E2ETEST/e2etest/1/,  /E2ETEST/e2etest/2/, /E2ETEST/e2etest/3/.

1 modifier folder: /E2ETEST/modifiers/.

5 modifiers: /E2ETEST/modifiers/1/, E2ETEST/modifiers/2/, E2ETEST/modifiers/3/, E2ETEST/modifiers/2text/, E2ETEST/modifiers/3text/.

The observation fact contains the following entries:

* patient 1, concept 1
* patient 1, concept /E2ETEST/e2etest/1/ (val=10), modifier /E2ETEST/modifiers/1/ (val=10)
* patient 1, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='bcde')
* patient 1, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='ab')
* patient 2, concept 1
* patient 2, concept 2
* patient 2, concept /E2ETEST/e2etest/1/ (val=20), modifier /E2ETEST/modifiers/ (val=20)
* patient 2, concept /E2ETEST/e2etest/2/(val=50), modifier /E2ETEST/modifiers/2/(val=5)
* patient 2, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='abc')
* patient 2, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='def')
* patient 3, concept 2
* patient 3, concept 3
* patient 3, concept /E2ETEST/e2etest/1/ (val=30), modifier /E2ETEST/modifiers/ (val=15), modifier /E2ETEST/modifiers/1/ (val=15)
* patient 3, concept /E2ETEST/e2etest/2/ (val=25), modifier /E2ETEST/modifiers/ (val=30), modifier /E2ETEST/modifiers/2/ (val=15)
* patient 3, concept /E2ETEST/e2etest/3/ (val=77), modifier /E2ETEST/modifiers/ (val=66), modifier /E2ETEST/modifiers/3/ (val=88)
* patient 3, concept /E2ETEST/e2etest/2/, modifier /E2ETEST/modifiers/2text/ (val='de')
* patient 3, concept /E2ETEST/e2etest/3/, modifier /E2ETEST/modifiers/3text/ (val='abcdef')
* patient 4, concept 1
* patient 4, concept 2
* patient 4, concept 3
* patient 4, concept /E2ETEST/e2etest/3/ (val=20), modifier /E2ETEST/modifiers/3/ (val=10)

Example queries and expected results (per node):

* `enc::1 AND enc::2`: 1 (patient 2)
* `enc::2 AND enc::3`: 1 (patient 3)
* `enc::1 AND enc::2 AND enc::3`: 0
* `enc::1 OR enc::2`: 3 (patients 1, 2 and 3)
* `enc::1 OR enc::3 AND enc::2`: 1 (patients 2 and 3)
* `clr::/E2ETEST/e2etest/1/`: 3 (patients 1, 2 and 3)
* `clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/:/e2etest/%`: 3 (patients 1, 2 and 3)
* `clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/1/:/e2etest/1/`: 2 (patients 1 and 3)
* `enc::1 AND clr::/E2ETEST/e2etest/2/`: 1 (patient 2)
* `enc::1 OR clr::/E2ETEST/e2etest/3/`:  3 (patients 1, 2 and 3)
* `clr::/E2ETEST/e2etest/1/::EQ:10`: 1 (patient 1)
* `clr::/E2ETEST/e2etest/1/:/E2ETEST/modifiers/1/:/e2etest/1/::EQ:NUMBER:15`: 1 (patient 3)
* `clr::/E2ETEST/e2etest/1/::BETWEEN:NUMBER:5 and 25`: 2 (patients 1 and 2)
* `enc::1 OR clr::/E2ETEST/e2etest/2/::GE:25 AND clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2/:/e2etest/2/::LT:NUMBER:21`: 2 (2 and 3)
* `clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2text/:/e2etest/2/::IN:TEXT:\'abc\'\,'de\'`: 2 (2 and 3)
* `clr::/E2ETEST/e2etest/3/:/E2ETEST/modifiers/3text/:/e2etest/3/::LIKE[begin]:TEXT:ab`: 2 (1 and 3)
* `clr::/E2ETEST/e2etest/2/:/E2ETEST/modifiers/2text/:/e2etest/2/::LIKE[contains]:TEXT:cd`: 1 (1)
* `clr::/E2ETEST/e2etest/3/:/E2ETEST/modifiers/3text/:/e2etest/3/::LIKE[end]:TEXT:bc`: 0

## MedCo Analysis - survival analysis

Default data for survival analysis consist on 228 fake patients. Each patient has observations relative to:

* Death status (/SPHNv2020.1/DeathStatus/ and /DeathStatus-status/death in metadata)
* Gender (/I2B2/Demographics/Gender/Female/ and /I2B2/Demographics/Gender/Female/ in metadata)
* A diagnosis (/SPHNv2020.1/FophDiagnosis/  in metadata)

All patients have the same diagnosis. It is used as the start event for computing relative times.

Death status have two possible values: death or unknown (respectively 126:0 and 126:1 in modifier\_cd column of observation\_fact). 165 patients are deceased, the remaining 63 have the unknown status. Death status is used as the end event for relative times. As unknown-status death observation is the latest one for those whose death is not recorded, this observation is also useful for end event for right censoring events.

Gender observation are useful for testing the grouping feature. There are 138 female patients and 90 male patients.

Survival analysis query requires cohorts saved by the user. Tables explore\_query\_results and saved\_cohorts are preloaded with the patient\_num of the 228 fake patients. The cohort identifier is -1 It is the default argument of the command.

Survival analysis example:

```
docker-compose -f docker-compose.tools.yml run medco-cli-client --user test --password test srva  srva -l 2000 -g day  -s /SPHN/SPHNv2020.1/FophDiagnosis/ -e /SPHN/SPHNv2020.1/DeathStatus/ -y 126:1
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ldsec.gitbook.io/medco-documentation/v2.0.1/developers/description-of-the-default-test-data.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
