Case Studies for the 2003 Annual Meeting
Blood Pressure
Last modified 2008-06-18
Please check this page regularly for updates, corrections, and answers to frequently asked questions!
Acknowledgments
We thank Dr. Raymond Lam (GlaxoSmithKline, Toronto, Ontario, Canada) for providing this case study.
Introduction
Genes contribute to the development and progression of disease, and they also influence how individuals respond to medicines. At GlaxoSmithKline (GSK), we are conducting genetic and genomic research that will allow the medical community to prescribe the right medicine for the right patient.
Genetics research studies often collect hundreds to thousands of genetic markers together with many clinical measurements. Statistical tools are useful for separating ‘true’ genes from ‘false’ alarms.
Data Description
The data file (a comma-delimited ASCII file) contains 500 observations (subjects) and 501 variables. Of the 500 subjects, 250 had low blood pressure and 250 had high blood pressure (i.e., hypertension). The 501 variables consist of one response variable (systolic blood pressure) and 500 predictors (17 clinical covariates and 483 genetic markers). These variables are described below.
Table 1: Attributes Used in This Study
Variable | Description |
---|---|
Systolic Blood Pressure (SBP) | Continuous response variable |
Gender | Binary variable: M = Male, F = Female |
Marital Status | Binary variable: Y = Married, N = Not Married |
Smoking Status | Binary variable: Y = Smoker, N = Non-Smoker |
Age | Continuous variable (years) |
Weight | Continuous variable (lbs) |
Height | Continuous variable (inches) |
Body Mass Index (BMI) | Continuous variable: (Weight / Height²) × 703, with Weight in lbs and Height in inches |
Overweight | Categorical variable: 1 = Normal, 2 = Overweight, 3 = Obese |
Race | Categorical variable taking values 1, 2, 3, or 4 |
Exercise level | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Alcohol Use | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Stress Level | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Salt (NaCl) Intake Level | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Childbearing Potential | Categorical variable: 1 = Male, 2 = Able Female, 3 = Unable Female |
Income Level | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Education Level | Categorical variable: 1 = Low, 2 = Medium, 3 = High |
Treatment (for hypertension) | Binary variable: Y = Treated, N = Untreated |
483 Genetic Markers | Categorical variables, each taking the values 0_0, 0_1, or 1_1 |
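Two of the definitions in Table 1 can be illustrated in code. The following is a minimal sketch in Python, not part of the original case study. The function names are hypothetical, and the numeric coding of the marker values (counting the copies of the "1" allele) is an assumption made for illustration; the data description only lists the three labels.

```python
def bmi(weight_lbs, height_in):
    """Body mass index from Table 1: (Weight / Height^2) x 703.

    Weight is in pounds and height is in inches, as in the data file.
    """
    if height_in <= 0:
        raise ValueError("height must be positive")
    return weight_lbs / height_in ** 2 * 703


# ASSUMPTION: an additive coding that counts copies of the "1" allele.
# The case study does not prescribe any numeric coding for the markers.
GENOTYPE_CODES = {"0_0": 0, "0_1": 1, "1_1": 2}


def encode_genotype(label):
    """Map a marker label such as '0_1' to its assumed numeric code."""
    return GENOTYPE_CODES[label.strip()]


if __name__ == "__main__":
    # A 150 lb, 65 inch subject.
    print(round(bmi(150, 65), 2))
    print([encode_genotype(g) for g in ["0_0", "0_1", "1_1"]])
```

Treating each marker as a three-level categorical variable instead of a count is an equally reasonable choice; which coding fits best depends on the genetic model one is willing to assume.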
Objectives
For this case study, a genetic data set was generated from a complex genetic model that we developed at GSK. There are 500 predictors (483 genetic markers and 17 clinical covariates). The goal is to identify the ‘true’ predictors among the 500 variables and, at the same time, control the false discovery rate. Therefore, the objectives are:
- Identify the ‘true’ genes and clinical covariates
- Control the false discovery rate (the number of true X’s versus the number of false X’s identified)
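The case study does not prescribe a method for controlling false discoveries. As one possible illustration, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of per-predictor p-values and then computes the proportion of false discoveries among the selected predictors. It assumes that a p-value has already been obtained for each predictor by some test of association, and that the set of ‘true’ predictors is known (as it would be for simulated data). The function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices rejected by the Benjamini-Hochberg procedure at level q."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value is at or below its threshold rank * q / m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])


def false_discovery_proportion(selected, true_predictors):
    """Fraction of the selected predictors that are not true predictors."""
    selected = set(selected)
    if not selected:
        return 0.0
    false_hits = selected - set(true_predictors)
    return len(false_hits) / len(selected)


if __name__ == "__main__":
    # Hypothetical p-values for four predictors (indices 0 to 3).
    p = [0.2, 0.001, 0.04, 0.008]
    chosen = benjamini_hochberg(p, q=0.05)
    print(chosen)
    # Suppose only predictor 1 is truly associated with the response.
    print(false_discovery_proportion(chosen, true_predictors={1}))
```

The Benjamini-Hochberg guarantee holds for independent (or positively dependent) tests. Genetic markers are often correlated, so other procedures may be more appropriate for this data set.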
References
- Scottish Intercollegiate Guidelines Network (SIGN) (January 2001). Hypertension in Older People.
- National Institutes of Health, ‘National Heart, Lung, and Blood Institute’ (nhlbi.nih.gov). Lowering Blood Pressure.
- Hyman, D.J., and Pavlik, V.N. (2001). Characteristics of Patients with Uncontrolled Hypertension in the United States. The New England Journal of Medicine, Volume 345, No. 7, pp. 479-486.