Data Source
The Canadian Open Parkinson Network (C-OPN, https://copn-rpco.ca/)
Organizer
Dr. Juan Li and Dr. Michael Schlossmacher; Ottawa Hospital Research Institute


(Le français suit)

Background

Parkinson disease (PD) and atypical parkinsonism

PD is a progressive neurodegenerative disorder that results from the loss of specific dopamine-producing cells in the brain. This loss leads to the cardinal motor signs, including muscle stiffness, slowness of movement, rest tremor, and poor balance and coordination, which prompt the working diagnosis of PD; a positive response to L-dopa treatment and progression of symptoms over time confirm the diagnosis. The disease, however, is not restricted to motor impairments; non-motor symptoms include loss of the sense of smell, constipation, sleep disturbances, depression, and anxiety, some of which are present years before motor signs appear.1 PD is the second most common neurodegenerative disease, after Alzheimer’s disease. The worldwide pooled prevalence of PD is estimated at 1.51 cases per 1,000 overall and 9.34 cases per 1,000 among individuals older than 60 years, and prevalence is higher in males than in females.2 In Canada, PD is currently estimated to affect over 110,000 Canadians directly,3,4 and to cost the country $3.3 billion annually; these figures are expected to grow by 40,000 individuals and an additional $1.1 billion in annual costs by 2034.3

Atypical parkinsonism (also known as PD-plus syndrome) represents a group of disorders that share some signs with PD, such as tremor, slowness and stiffness (i.e. parkinsonism), but patients have additional symptoms, may progress faster, and benefit less (or not at all) from L-dopa medication.1,5 These disorders include multiple-system atrophy (MSA), progressive supranuclear palsy (PSP), corticobasal degeneration (CBD) and dementia with Lewy bodies (DLB). These conditions are often difficult to differentiate from PD and from each other.1,5 An accurate and easy-to-use diagnostic model to help with this differentiation is an unmet need.

The Canadian Open Parkinson Network (C-OPN)

C-OPN is “a pan-Canada initiative bridging people, data, and resources to accelerate Parkinson’s disease (PD) discoveries”,6,7 which now includes 11 major Movement Disorders centres in Canada across 5 provinces and has recruited 2,318 participants (PD: 84%; PD-plus: 4.6%; neurologically healthy controls (HC): 11.4%). C-OPN is an invaluable resource for PD research projects: it provides a comprehensive de-identified database containing demographic, epidemiological, and clinical information, as well as data from various PD-related assessment tools and instruments.

Implementing statistical/machine learning models in healthcare

Since its advent in the 1950s, machine learning (ML) has been used in various fields, including healthcare.8,9 In healthcare settings, ML has been used to develop prognostic/diagnostic tools and personalized treatment plans.9,10 However, despite the rapidly growing number of new developments and publications, there is still a large implementation gap for machine learning in healthcare: very few of these algorithms ever make it to the clinic.11 Seneviratne et al. identified some key factors that limit the broader adoption of ML models in healthcare: actionability (of the output), safety, and utility. Another factor to consider is the actionability of the input: which variables are included in the model, and what they imply for future clinical implementation. The current case study challenge is designed to make participants aware of these elements, to have them consider these elements when developing a new algorithm, and to help them understand the importance of collaborating with domain experts at every stage of the project.
 
*******************

Contexte

Maladie de Parkinson (MP) et parkinsonisme atypique

La MP est une maladie neurodégénérative progressive qui résulte de la perte de cellules spécifiques productrices de dopamine dans le cerveau. Cela entraîne des signes moteurs caractéristiques, notamment une raideur musculaire, une lenteur des mouvements, des tremblements au repos, ainsi qu’un mauvais équilibre et une mauvaise coordination, ce qui conduit au diagnostic provisoire de MP ; une réponse positive au traitement à la L-dopa et la progression des symptômes au fil du temps confirmeront le diagnostic. Cependant, la maladie ne se limite pas à des troubles moteurs ; les symptômes non moteurs comprennent une perte de l’odorat, de la constipation, des troubles du sommeil, ainsi que de la dépression et de l’anxiété, dont certains sont présents plusieurs années avant l’apparition des signes moteurs.1 La MP est la deuxième maladie neurodégénérative la plus courante, après la maladie d’Alzheimer. La prévalence mondiale de la maladie de Parkinson est estimée à 1,51 cas pour 1 000 personnes, 9,34 cas pour 1 000 personnes âgées de plus de 60 ans, et la prévalence est plus élevée chez les hommes que chez les femmes.2 Au Canada, on estime actuellement que la maladie de Parkinson touche directement plus de 110 000 Canadiens,3,4 et coûte au pays 3,3 milliards de dollars par an ; ces chiffres devraient augmenter de 40 000 personnes et de 1,1 milliard de dollars supplémentaires en coûts annuels d’ici 2034.3

Le parkinsonisme atypique (également appelé syndrome Parkinson-plus) désigne un groupe de troubles qui partagent certains signes avec la maladie de Parkinson, tels que les tremblements, la lenteur et la raideur (c.-à-d. le parkinsonisme), mais les patients présentent des symptômes supplémentaires qui peuvent progresser plus rapidement et bénéficier moins (voire pas du tout) du traitement à la L-dopa.1,5 Ces troubles comprennent l’atrophie des systèmes multiples (ASM), la paralysie supranucléaire progressive (PSP), la dégénérescence cortico-basale (DCB) et la maladie à corps de Lewy (MCL). Ces affections sont souvent difficiles à différencier de la MP et les unes des autres.1,5 Il existe un besoin non satisfait pour un modèle diagnostique précis et facile à utiliser afin d’aider à la différenciation. 

Le Réseau Parkinson Canadien Ouvert (RPCO)

Le RPCO est « une initiative rassemblant des personnes, des données et des ressources pour accélérer les découvertes sur la maladie de Parkinson »,6,7 qui comprend désormais 11 grands centres spécialisés dans les troubles du mouvement au Canada, répartis dans 5 provinces, et qui a recruté 2 318 participants (MP : 84 % ; Parkinson-plus : 4,6 % ; témoins neurologiquement sains (TS) : 11,4 %). Le RPCO est une ressource inestimable pour soutenir divers projets de recherche sur la MP en fournissant une base de données complète et anonymisée contenant des informations démographiques, épidémiologiques et cliniques, ainsi que divers outils et instruments d’évaluation liés à la MP.

Mise en œuvre de modèles statistiques/d’apprentissage automatique dans le domaine des soins de santé

Depuis son apparition dans les années 1950, l’apprentissage automatique (AA) est utilisé dans divers domaines, notamment les soins de santé.8,9 Dans le domaine des soins de santé, l’AA a été utilisé pour développer des outils de pronostic/diagnostic et des plans de traitement personnalisés.9,10 Cependant, malgré l’explosion du nombre de nouveaux développements et de publications, il existe encore un énorme fossé dans la mise en œuvre de l’apprentissage automatique dans le domaine des soins de santé : très peu de ces algorithmes parviennent jusqu’aux cliniques.11 Seneviratne et al. ont évoqué certains facteurs clés qui limitent l’adoption à plus grande échelle des modèles d’apprentissage automatique dans le domaine des soins de santé : l’applicabilité (des résultats), la sécurité et l’utilité. Un autre facteur à prendre en compte est l’exploitabilité des données d’entrée : quelles variables inclure dans le modèle et quelles sont leurs implications pour la future mise en œuvre clinique. Le défi de l’étude de cas actuelle est conçu pour que les participants à l'étude prennent conscience de ces éléments et les prennent en compte lors du développement d’un nouvel algorithme, et pour qu’ils comprennent l’importance de la collaboration avec des experts du domaine à chaque étape du projet.
 

Research Question


Challenge Question

  1. Using the Canadian Open Parkinson Network (C-OPN) data, can you develop and validate a classification model to distinguish patients with Parkinson’s disease (PD) from those with atypical parkinsonism (PD-plus) as well as from healthy controls? 
    Notes:
    1. You are free to explore standard statistical and/or machine learning models. Simpler models have the benefit of explainability and easier implementation, but they are not required.
    2. The main goal is to have a classification model for PD vs PD-plus. However, it is preferable to explore model(s) that achieve multi-class classification of PD vs PD-plus vs neurologically healthy controls.
    3. One caveat is that PD-plus is an umbrella term for various neurological diseases. It is a bonus if the model can help differentiate among these diseases as well. (This might be too ambitious, especially considering the relatively small sample size of the PD-plus group.)
    4. Regarding validation, you are expected to assess how well the model generalizes to unseen data, to guard against issues like overfitting. Methods for internal validation, e.g., train-test split, cross-validation, or bootstrapping, are recommended. Further, since C-OPN is a pan-Canada, multi-centric study, you may also select data from one or more specific sites as the hold-out dataset to approximate external validation.
    5. You are strongly encouraged to consider model implementation: among models with similar performance, the ones that are easier to implement will receive higher grades. The level of implementation difficulty of each variable is defined in the data dictionary:
      Level 1: the variable is based on a question that can be self-reported
      Level 2: the variable is a total score or sub-score of a self-report questionnaire; using this variable requires completion of the entire questionnaire, or of several of its questions
      Level 3: using this variable requires the assistance of professionals but this is already routinely done in clinic
      Level 4: using this variable requires extra resources from professionals to implement
  2. We are also open to other challenges that you identify and wish to pursue (for example, sex/gender-specific clinical features among diagnostic groups, or environmental exposure history across groups). If you want to work on a different research question after reviewing the C-OPN data and are unsure whether your research question(s) would meet the competition focus, please consult with Dr. J. Li or Dr. M. Schlossmacher.
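
The site-based hold-out idea in note 4 can be sketched as follows. This is a minimal illustration in plain Python, not a prescribed method: the site codes and diagnoses below are toy values, and the helper names `leave_one_site_out` and `balanced_accuracy` are assumptions made for this example. Balanced accuracy (the mean of per-class recall) is shown because PD-plus makes up only a small share of the cohort, so plain accuracy can look good for a model that never predicts PD-plus.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) once per distinct site."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s in sorted(by_site) if s != held_out
                     for i in by_site[s]]
        yield held_out, train_idx, test_idx


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy data: site codes and diagnoses are invented for illustration only.
sites = ["A", "A", "B", "B", "C", "C"]
labels = ["PD", "PD-plus", "PD", "HC", "PD", "PD"]

splits = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)

# A classifier that always answers "PD" is right on 4 of 6 rows,
# yet its per-class recall is 1, 0, 0, so balanced accuracy is 1/3.
always_pd = ["PD"] * len(labels)
score = balanced_accuracy(labels, always_pd)
print(round(score, 3))
```

In practice the model would be refit on `train_idx` and scored on `test_idx` for each held-out site; whether a single site contains enough PD-plus cases to give a stable estimate should be checked against the actual data.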

Award information
We are pleased to announce that the winning team will receive an award of $3,000. The award is sponsored by the Ottawa Parkinson Research Consortium, with strong support from and input by people living with Parkinson’s. In addition to the financial award, there may be potential for research opportunities / collaborations for the successful team members.

Variables

Data source and access

  1. Dataset: Single csv file (tabular format). All C-OPN data are de-identified, with Research Ethics Board (REB) approval at all participating sites. All participants provided written and informed consent, either in-person or electronically.
  2. Study cohort: C-OPN is a multi-centric national study that includes 2,318 participants (PD: 84%; PD-plus: 4.6%, see Note 1; HC: 11.4%; see Note 2).6,7 Diagnoses were made by movement disorder specialists in Canada according to the Movement Disorder Society (MDS) criteria15 or previously published criteria such as the UK Brain Bank criteria. The PD-plus group includes patients with a diagnosis of MSA, PSP, CBS, DLB, frontotemporal dementia (FTD), essential tremor (ET), or REM sleep behavior disorder (RBD). Compared with other highly curated cohorts in PD research, the C-OPN cohort offers more diversity in geography, ethnicity and phenotype, and thus represents a more realistic group of individuals typically seen in neurologists’ offices. One caveat is the relatively small sample size of the HC group; in addition, because of how neurologically healthy controls were recruited, some recruitment bias may be present, especially for variables like family history of PD.
    Notes:
    1. In the C-OPN data, the PD-plus group includes some other, rare conditions beyond the four disorders described in the Background (MSA, PSP, CBD and DLB); for more details, see the data dictionary. Most of them also represent neurodegenerative conditions.
    2. Numbers are based on data collected before July 2025. The C-OPN database is constantly updated, so recruitment numbers may differ in the dataset used for this case study.
  3. Outcome of interest: Group classification of PD versus PD-plus. Multi-class classification of PD vs PD-plus vs neurologically healthy controls (HC), and/or classification between PD-plus conditions is a bonus.
  4. Features: The complete de-identified C-OPN database will be provided; see the data dictionary. You are free to explore all available information in the database, but feature selection is expected, so that only the most relevant and informative variables are included.
  5. Data dictionary: Included in the training package, see below.
  6. Data access: A Data Use Agreement (included in the training package) will be signed by all participating teams and sent to C-OPN (Anna Bendas, anna.bendas@hec.ca), followed by granting access to the dataset hosted on the REDCap platform.
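
One way to combine feature selection with the implementation levels in the data dictionary is to build nested candidate feature sets, one per maximum level, and compare model performance across them. The sketch below assumes a simplified dictionary layout; the variable names, the `level` key, and the helper functions are hypothetical and do not come from the real C-OPN data dictionary.

```python
# Hypothetical data-dictionary rows: names and levels are invented for
# illustration and are not taken from the real C-OPN data dictionary.
data_dictionary = [
    {"variable": "age_at_visit", "level": 1},
    {"variable": "smell_loss_self_report", "level": 1},
    {"variable": "questionnaire_total", "level": 2},
    {"variable": "motor_exam_score", "level": 3},
    {"variable": "specialized_test", "level": 4},
]

LEVELS = (1, 2, 3, 4)


def features_up_to_level(dictionary, max_level):
    """Return variable names whose implementation level is <= max_level."""
    if max_level not in LEVELS:
        raise ValueError("max_level must be 1, 2, 3, or 4")
    return [row["variable"] for row in dictionary
            if row["level"] <= max_level]


def nested_feature_sets(dictionary):
    """Map each level to the features usable at that level or below."""
    return {level: features_up_to_level(dictionary, level)
            for level in LEVELS}


feature_sets = nested_feature_sets(data_dictionary)
for level, names in feature_sets.items():
    print(level, names)
```

Fitting the same model on each nested set shows how much performance is gained by moving from self-reported items (Level 1) to variables that need professional resources (Levels 3 and 4), which is the trade-off the grading note on implementation asks teams to weigh.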

The study package
Please see the Data Files link below for all materials included in the study package:

  1. Study description
  2. C-OPN data dictionary with annotation
  3. Data Use Agreement
  4. A README file to help understand data and data dictionary
  5. Protocol template for ethics application
  6. The standardized instruments (questionnaires and assessments) completed at each C-OPN site, which are kindly provided by the C-OPN team.

Organizer contact information
Juan Li, PhD 
Senior Clinical Research Associate, Neuroscience Program, Ottawa Hospital Research Institute; Ottawa, ON, Canada. 
Email: juli@ohri.ca

Michael Schlossmacher, MD, DABPN, FRCPC
Senior Scientist, Neuroscience Program, Ottawa Hospital Research Institute; Ottawa, ON, Canada.
Email: mschlossmacher@toh.ca