Teaching Datasets

The NHLBI has prepared three datasets suitable for use in undergraduate- or graduate-level biostatistics instruction. These datasets are freely available upon request.

A longitudinal, epidemiology-focused dataset was developed using the Framingham Heart Study as the source for the data. This dataset contains data from three clinic examinations and 20 years of follow-up on a large subset of the original Framingham cohort participants. The documentation for the Framingham dataset contains a variable list and coding help for the data.

A clinical-trial-focused dataset was developed using data from the Digitalis Investigation Group (DIG) trial. This dataset was designed to replicate the results found in the February 1997 NEJM article. The documentation for the DIG dataset contains a variable list and annotated forms.

A dataset focused on longitudinal, repeated measures was developed using the Childhood Asthma Management Program (CAMP). This dataset includes 695 participants from the CAMP trial and an average of 14 spirometry measures per participant. The documentation includes a variable list, summary tables, and selected annotated form elements.

Users are cautioned that teaching datasets are not suitable for publication purposes, because specific statistical measures were used to create the anonymous versions.

Request a teaching dataset.

Public Use Datasets

Public use datasets are anonymized, freely available datasets for research purposes. Since the data is in the public domain, requirements for a research materials agreement or review by a local IRB are waived. Because public funds were invested in collecting and providing the data, contact information and project titles are requested for the purpose of tracking publications.

National Longitudinal Mortality Study (NLMS)