Gene Expression-based Classification of Cancer Tumours via Penalized Probabilistic Principal Components Analysis
Probabilistic Principal Component Analysis is frequently used to pre-process noisy data. Though the principal components are useful for capturing sample dependence, cluster assignments based on the principal axes do not always perform well, as noise can weaken the degree of cluster separation. We use a previously proposed penalized profile log-likelihood criterion to select the effective dimension of the data automatically, and then train classification models in the resulting projection space. We illustrate via simulations that this approach requires less training data and leads to faster computation. The proposed method was used to classify nine tumor classes in the NCI-60 cell-line dataset. Using 30% and 50% of the samples for training, we recorded 83% and 93% accuracy, respectively, on the remaining test cases. In contrast, classification using the original data yielded 52% and 79% accuracy. Our approach is able to leverage the molecular variations of tens of thousands of genes simultaneously to produce accurate tumor classifications.
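As a rough illustration of dimension selection by a penalized profile log-likelihood, the sketch below evaluates the standard maximized PPCA log-likelihood for each candidate dimension k and picks the k with the best penalized value. The abstract does not state the form of the penalty, so a BIC penalty is used here purely as a stand-in; it is not the authors' criterion. The function names and the choice to pass pre-computed covariance eigenvalues are assumptions made for this example.

```python
import math


def ppca_profile_loglik(eigvals, n, k):
    """Maximized PPCA log-likelihood when k components are retained.

    eigvals: eigenvalues of the sample covariance, sorted in decreasing order.
    n: number of samples.
    """
    d = len(eigvals)
    if not 0 <= k < d:
        raise ValueError("k must satisfy 0 <= k < len(eigvals)")
    # ML estimate of the isotropic noise variance: mean of the discarded eigenvalues.
    sigma2 = sum(eigvals[k:]) / (d - k)
    log_det = sum(math.log(v) for v in eigvals[:k]) + (d - k) * math.log(sigma2)
    return -0.5 * n * (d * math.log(2 * math.pi) + log_det + d)


def n_free_params(d, k):
    # Loadings (identified up to rotation) plus the noise variance.
    # The mean is omitted because its contribution is the same for every k.
    return d * k - k * (k - 1) // 2 + 1


def select_dimension(eigvals, n):
    """Pick k by minimizing BIC (a stand-in penalty, not the authors' criterion)."""
    d = len(eigvals)

    def bic(k):
        return -2.0 * ppca_profile_loglik(eigvals, n, k) + n_free_params(d, k) * math.log(n)

    return min(range(1, d), key=bic)


# Hypothetical spectrum: two strong directions followed by flat noise.
eigvals = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]
chosen_k = select_dimension(eigvals, n=100)
```

In a pipeline like the one described, the data would then be projected onto the leading `chosen_k` principal axes before a classifier is trained on the projected samples.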
Additional Authors and Speakers (not including you)
Radu Craiu
University of Toronto
Language of Oral Presentation
English
Language of Visual Aids
English

Speaker

Wei Deng
McMaster