2008 SSC Annual Meeting
2008 Annual Meeting of the SSC in Ottawa
BIOSTATISTICS WORKSHOP
Regression Modelling Strategies
May 25, 2008
Frank Harrell (Vanderbilt University)
The first part of the workshop presents the following elements of multivariable predictive modeling for a single response variable: using regression splines to relax linearity assumptions, the perils of variable selection and overfitting, where to spend degrees of freedom, shrinkage, imputation of missing data, data reduction, and interaction surfaces. A default overall modeling strategy is then described, followed by methods for understanding models graphically (e.g., with nomograms) and for using resampling to estimate a model’s likely performance on new data. The workshop then gives an overview of the freely available R Design library, which facilitates most steps of the modeling process. Two of the following three case studies will be presented: an interactive exploration of the survival status of Titanic passengers, an interactive case study in developing a survival time model for critically ill patients, and a case study in Cox regression.
Participants may wish to read the following references in advance.
- F. E. Harrell, K. L. Lee, and D. B. Mark. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15:361–387, 1996.
- F. E. Harrell, P. A. Margolis, S. Gove, K. E. Mason, E. K. Mulholland, D. Lehmann, L. Muhe, S. Gatchalian, and H. F. Eichenwald. Development of a clinical prediction model for an ordinal outcome: The World Health Organization ARI Multicentre Study of clinical signs and etiologic agents of pneumonia, sepsis, and meningitis in young infants. Statistics in Medicine, 17:909–944, 1998.
- A. Spanos, F. E. Harrell, and D. T. Durack. Differential diagnosis of acute meningitis: An analysis of the predictive value of initial observations. Journal of the American Medical Association, 262:2700–2707, 1989.
About the Leader:
Dr. Harrell is Professor and Chair of the Department of Biostatistics, Vanderbilt University. His primary interest is the study of patient outcomes, in particular the development of accurate prognostic and diagnostic models and of models for many other patient responses. His book Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis (2001, Springer-Verlag) contains theory, examples, and detailed case studies demonstrating the use of many modern statistical modeling tools.