Below is a current listing of the eMERGE phenotyping files that reside at the Coordinating Center. Please click here for a complete listing of the eMERGE I and II phenotypes. A listing of eMERGE studies accessible via dbGaP is located here.
Data File Descriptions
Imputed GWAS and Sequencing Sets
Name | Type | Cohort Size | Variables |
eMERGE I Merged Set | Imputed GWAS | 18,663 participants across 5 sites | Static Phenotype Data Dictionary, Lab Data Dictionary |
eMERGE II Merged Set | Imputed GWAS | 55,029 participants across 9 sites | Static Phenotype Data Dictionary, BMI Data Dictionary, Lab Data Dictionary |
eMERGE PGx Merged Set | PGRNSeq Sequencing | 9,010 participants across 9 sites | Static Phenotype Data Dictionary, Labs Data Dictionary, ECG Data Dictionary |
eMERGE III Merged Set | Imputed GWAS | 83,717 participants across 12 sites | Static Phenotype Data Dictionary, BMI Data Dictionary |
Data Access
Direct Request for eMERGE Members: eMERGE investigators request access to phenotype data files in accordance with the publication policy. A manuscript concept sheet is submitted, and sites express their interest in participating (see the eMERGE publication policy and data use agreement). External investigators may request access to the data via dbGaP.
Tools: The eMERGE Record Counter (eRC) and SPHINX provide eMERGE investigators online access to aggregate views of the data. Both tools allow counts to be stratified by race, sex, and age.
- eRC: Data are available for the eMERGE I and II GWAS cohort and include demographics, ICD and CPT codes, and case/control status for published phenotypes. A download file containing blind data from the eMERGE II merged set is also available.
- SPHINX: Demographics, medications, variants, and ICD and CPT codes for the PGx cohort.
Access for Non-eMERGE Investigators: eMERGE research data is publicly accessible to external researchers via the database of Genotypes and Phenotypes (dbGaP). You can view more information about authorized access to data via dbGaP here. Additionally, individual investigators can propose a collaborative project for Network consideration. That process is outlined in the Guidelines for External Collaborators. Academic, non-profit and government organizations can apply for affiliate membership. You can learn more about that process here.
Notes on the Data
eMERGE I Merged Set: Demographics and case/control status for primary and secondary phenotypes are available for the entire cohort. Lab data were collected only for the Lipids, Hypothyroidism, Diabetic Retinopathy, WBC, RBC, Height, Resistant Hypertension, Cataracts, Dementia, PAD, QRS, and Type II Diabetes studies.
eMERGE II Merged Set: Demographics and case/control status for eMERGE I and II phenotypes are available for the entire cohort. BMI was collected only for the AAA, AMD, Asthma, CRF, Diabetic Hypertensive CKD, Heart Failure, and Zoster studies. Lipid labs were collected only for the AAA, AMD, Asthma, Atopic Dermatitis, CRF, and Diabetic Hypertensive CKD studies. BMI and labs data are accessible via the raw study files by site by phenotype. In eMERGE II, primary sites cleaned study-specific data, and these data were submitted to dbGaP as amendments to the eMERGE II Merged Set submission. In addition to the clean, merged study datasets, the CC also has a copy of the study- and site-specific data files submitted during the phenotyping workflow for all eMERGE II phenotypes (not integrated by site or phenotype; not QC’d). Below is the current submission status for each of the eMERGE II phenotypes.
- Not Completed: CAAD, c-diff, CKD, CRF, Extreme Obesity, Heart Failure, Ocular HTN, VTE, Zoster studies
- Not to be Included: DILI, Remission of Diabetes after Roux-en-Y
- Completed Submission: AAA, ACE-I Cough, ADHD, AMD, Appendicitis, Atopic Dermatitis, Asthma, Autism, Benign Prostatic Hyperplasia (BPH), caMRSA, Childhood Obesity, Colon Polyps, Diverticulitis, Extreme Obesity, GERD, Glaucoma, Statins for MACE studies
eMERGE PGx Merged Set: Demographics, ECG, and Lipid labs data are available for the entire cohort.
eMERGE III Merged Set: Demographics, case/control status for Phase I and II phenotypes, and BMI are available for the entire cohort.