How a Misclassified Binary Outcome Y in Training Data Affects Model Prediction Performance: a Simulation Study
Statistical models are used to explain and/or predict an outcome of interest. For explanation, the focus is on estimating parameters that describe the effect of an independent variable. In this setting, the effect of a misclassified outcome variable, and how to correct for it, has been studied extensively; one popular correction method is MCSIMEX. However, a relevant question that has yet to be addressed is how outcome misclassification affects predictive performance. We investigate this question through extensive simulation studies. Motivated by a real-world example, we generated a binary event status Y that is subject to misclassification. We fit a logistic regression model using the misclassified outcome Y* and assessed model performance on a test dataset simulated from the same underlying model without misclassification. We show that predictive performance on the test data is similar regardless of whether the misclassification in Y* was corrected, and is always better than performance on the training data.
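The simulation design described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: a single standard-normal covariate, the coefficient values, the sensitivity and specificity of the misclassification, the sample sizes, and the Brier score as the performance metric. The MCSIMEX correction is not implemented; instead, the model fitted to the misclassified Y* is compared with a reference model fitted to the true Y.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, beta0, beta1, rng):
    """Draw x ~ N(0, 1) and y ~ Bernoulli(sigmoid(beta0 + beta1 * x))."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(beta0 + beta1 * x) else 0 for x in xs]
    return xs, ys


def misclassify(ys, sensitivity, specificity, rng):
    """Keep a true 1 with prob. sensitivity and a true 0 with prob. specificity."""
    out = []
    for y in ys:
        keep = sensitivity if y == 1 else specificity
        out.append(y if rng.random() < keep else 1 - y)
    return out


def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of a logistic regression with one covariate."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def brier(b0, b1, xs, ys):
    """Mean squared difference between predicted probability and outcome."""
    return sum((sigmoid(b0 + b1 * x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


rng = random.Random(2024)
BETA0, BETA1 = 0.0, 1.0   # assumed true coefficients (illustrative)
SENS, SPEC = 0.8, 0.9     # assumed misclassification rates (illustrative)

x_train, y_train = simulate(5000, BETA0, BETA1, rng)
y_star = misclassify(y_train, SENS, SPEC, rng)       # observed, error-prone Y*
x_test, y_test = simulate(5000, BETA0, BETA1, rng)   # test data, true Y

b0_naive, b1_naive = fit_logistic(x_train, y_star)   # trained on Y*
b0_clean, b1_clean = fit_logistic(x_train, y_train)  # reference: trained on true Y

brier_naive = brier(b0_naive, b1_naive, x_test, y_test)
brier_clean = brier(b0_clean, b1_clean, x_test, y_test)
print(f"slope   naive={b1_naive:.3f}  reference={b1_clean:.3f}")
print(f"Brier   naive={brier_naive:.4f}  reference={brier_clean:.4f}")
```

A full study would repeat this over many replicates and misclassification settings, and would add a corrected fit alongside the naive one; this sketch shows a single replicate only.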
Date and Time
-
Co-authors (not including yourself)
Yutong Han
University of Alberta
Language of the oral presentation
English
Language of the visual aids
English

Speaker

Name Primary Affiliation
Yan Yuan University of Alberta