Comment on page
PgAdmin can be accessed through
http(s)://<node domain name>/pgadminwith username
admin(by default). To access the test database just create a server with the name
MedCo, the address
postgres(by default). Note that PgAdmin is not included in the production deployments.
PgAdmin server configuration.
ALTER TABLE i2b2demodata_i2b2.observation_fact
ALTER COLUMN instance_num TYPE bigint,
ALTER COLUMN text_search TYPE bigint;
ALTER SYSTEM SET maintenance_work_mem TO '32GB';
All the following operations are implemented through PL/pgSQL functions present in the MedCo-i2b2 database.
The parameter of the following functions corresponds to the number of times the existing
observation_facttable should be added to itself. For example with 3 , the number of rows of the table will be multiplied by 4.
- Method 1 (double for loop)
- Method 2 (temporary table)
The parameter of the following function corresponds to the number of rows the resulting
observation_facttable will have.
The following command builds only the i2b2 indexes needed by MedCo. I2b2 offers by default more of them to enable features not currently supported by MedCo.
docker-compose.common.ymlof the running profile, add in the
postgresqlservice the following key, pointing to where the database files are:
Depending on the size of the database, you might need to increase several timeouts, notably the following:
- SSH: icsil1-ds-49.epfl.ch ; accounts: root / lca1
- SSH Key of Mickaël and Joao set up (ask one of them for access)
- 20TB storage in
/srv/, publicly accessible read-only through rsync daemon
- 10Gbps link with IC Clusters through interface 10.90.48.141
pg_volume_nodeX_x1212_indexesDatabases of 3 nodes (
X=0,1,2), with approximately 150k patients and 28.5B records total (50k patients and 9.5B records per node). It is a x1212 duplication of the original
tcga_biodataset. It also contains the built indexes. The size of each is around 3TB, which maybe can be reduced by running a PostgreSQL FULL VACUUM.
pg_volume_node0_XB_indexesThose databases are reductions of the
X=4.75,3.17,2.37billion records. Those numbers were calculated to keep a total number of 28.5B rows with respectively 6, 9 and 12 nodes.
icclusters-deployment-backup-11-07-19Contains all the deployment profiles and keys used for the
postgresql-deploymentLocal deployment of postgresql and pgAdmin, in order to explore or modify the databases.
nebulaexp-alldeploymentsContains all the deployment profiles and keys used for the
duplications-nodeXLogs of data duplication (x100, x2, x3) and indexes building for the
reductions-node0Logs of data reduction (to 2.37B records) and indexes building of
Using the rsync daemon allows for easier data copy between machines. It is already enabled on the backup server (read-only), but in some cases it can be useful on other machines.
In the file
/etc/default/rsyncset the variable:
RSYNC_ENABLE=true. Create the file
/etc/rsyncd.confwith the following content, adapted to your case:
uid = root
gid = root
path = /disk/
comment = MedCo data (read-only)
read only = yes
Then start the daemon with
service rsync start.
Example from an IC Cluster machine:
rsync -a -v rsync://10.90.48.141/srv/databases/pg_volume_node0_x1212_indexes /disk/pg_volume