The Status of Statistics Education in Canada

Introduction

The purpose of this report is twofold:
      a) to present summative data on post-secondary statistics education,and
      b) to examine the structure of undergraduate statistics programs in Canada.
This information is important for planning and designing statistical curricula, and is of particular interest to educators, administrators and other stakeholders. In the following sections, we describe our data in more detail and present results along our two lines of inquiry. We conclude with a discussion of our findings and their implications for directing future efforts in statistics education. An earlier version of this report was presented in a contributed session at the 2017 SSC Annual Meeting.

Data Sources

STUDENT DATA
Student data come from Statistics Canada’s (StatCan) Post-secondary Student Information System (PSIS)1, a national survey of Canadian public post-secondary institutions. PSIS collects data on enrolments and graduates, classified by degree, program of study, and other geographic and demographic information. PSIS data are publicly available2  across broad areas of study, and we looked at data across all programs in Mathematics, Computer and Information Sciences (MCIS) for comparison. For statistics programs we extracted PSIS data down at the program level through StatCan’s Real-Time Remote Access3 (RTRA) system, available through our institution’s data library. Specifically, we looked at three types of programs: a) Biostatistics, b) General Statistics, and c) Mathematical Statistics and Probability, corresponding to the program classification4 used by StatCan. Our student data consist of enrolment and graduate numbers spanning the academic years 2010/11 to 2015/16, organized by the following variables:

  • YEAR: Academic year, from 2010 to 2015 (2010 corresponds to 2010/11 academic year, etc.)
  • PROGRAM: Type of program, one of Biostatistics, General Statistics, Mathematical Statistics and Probability
  • LEVEL: Degree level, one of BSc, MSc, PhD
  • REGION: Region/province of Canada, one of Atlantic (NB/NL/NS/PE), QC, ON, Prairies (AB/MB/SK), BC
  • SEX: Student’s sex, one of Female, Male
     

PROGRAM DATA


Data on statistics curricula come from the academic calendars of Canadian universities for the 2016/17 academic year. We collected information on undergraduate statistics programs satisfying two criteria: a) the program focuses on Statistics proper (i.e. excludes Biostatistics, Mathematical or Business Statistics, and Actuarial Science), and b) it is a 4-year stand-alone program, i.e a typical Honours BSc program that does not need to be combined with another program to lead to a degree (as in a double major). In total, 24 programs satisfied our conditions for inclusion in the study. Programs are expressed as collections of course requirements, and our basic data units are individual course requirements. For each course requirement, we recorded the following information:

  • DISCIPLINE: Discipline that offers course requirement, one or more of Computer Science, Mathematics, Statistics, Other
  • LEVEL: Year of study that course requirement is typically taken, one or more of 1, 2, 3, 4
  • CREDIT: Number of course credits, 1 credit = 1 semester course
  • TYPE: Type of course requirement, one of Core, Elective
  • TOPIC CATEGORY: Categorization of content, one or more of Computation, Mathematics, Probability, Statistical Methodology, Statistical Practice, Statistical Theory, Other

The first four variables are objective, whereas the final categorization is subjective, based on our interpretation of the calendar description for the course. Certain requirements were expressed in terms of sets of courses (e.g. 3 credits from 3rd or 4th year mathematics or statistics courses). For the analysis, we distributed the credits from such requirements evenly among the possible values. The data and R scripts used for creating this report are available on GitHub5.

 

Results

STUDENT DATA
We now present “vital statistics” on statistics education, i.e. information on the number of students who enter and complete statistics programs in Canada. Below are time series plots of total enrolments and graduates6 in statistics programs, broken down by degree level. Obviously, there is rapid growth in enrolments at all levels, measuring around 20% in the last two available years. This growth is also reflected in the number of awarded BSc degrees, although with some lag. Relative differences between enrolments and graduates are due, among other things, to the different lengths of the degrees: MSc programs are typically much shorter than BSc or PhD programs.

 

 

 

 

 

 

 

 

 

We compared Statistics with its parent group of programs in Mathematics, Computer & Information Sciences (MCIS). The corresponding time series plots are presented below. Statistics is included in the MCIS grouping, and constitutes from 5 to 10% of awarded degrees at different levels. Nevertheless, Statistics BSc enrolments seem to grow faster than MCIS, the latter increasing by around 10% in recent years. If Statistics enrolments continue to grow at twice this rate, the discipline will soon make up a bigger part of this group.

 

 

 

 

 

 

 

 

 

We also compared the number of Statistics graduates in Canada to those from the United States (US). US data come from the Integrated Post-secondary Education Data System (IPEDS) that is conducted by the Department of Education’s National Center for Education Statistics (NCES)7. The ratio of statistics degrees in Canada vs the US is comparable to the ratio of their populations (14% vs 11%, respectively). The most important difference is qualitative: the majority of degrees in the US are at the MSc level, and almost twice as many as at the BSc level. In Canada the situation is reversed: the majority of statistics graduates come from the BSc level, at an approximate ratio of 5-to-3 compared to the MSc level. This places particular emphasis on undergraduate statistics education in Canada, as the majority of statisticians enter the profession with a BSc degree.

 

 

Next, we look at the breakdown of awarded degrees in terms of programs. The following barplots present the proportion of graduates in Biostatistics, General Statistics, and Mathematical Statistics & Probability, at different levels in Canada and the US. At least 85% of undergraduate degrees are in General Statistics, but this proportion drops to around 65% for graduate degrees. In Canada, around 30% of graduate degrees are in Mathematical Statistics and Probability, whereas in the US this is replaced by Biostatistics.

 

 

 

 

 

 

 

 

Lastly, we look at the composition of undergraduate enrolments for Statistics and its parent MCIS group by region and sex. The barplots below show that Ontario has disproportionately more Statistics students, with 65% of the total in Canada. This is at the expense of Quebec and the Atlantic and Prairie provinces, which have a smaller share of students relative to their MCIS numbers. The great news is that Statistics enrolments have gender parity, whereas for MCIS there are three males for every female.

 

 

 

 

 

 

 

 

PROGRAM DATA
In this section we examine the structure of Statistics curricula in Canada. The 24 programs in the study are listed in the plot below, labeled by offering institution and grouped by region. The horizontal bars represent the number of required credits (equivalent to semester courses) broken down by requirement type (core or elective). The average program requires 25 courses, of which around 18 are core and 7 are electives.

 

 

 

 

 

 

 

 

 

 

Next, we examine the distribution of course requirements by discipline. The following barplot on the left presents the average number of credits by discipline and type. Out of the 25 courses that are required on average, roughly 13 are offered by Statistics, 9.5 by Mathematics, 1.5 by Computer Science and 1 by other disciplines. The plot on the right shows the typical sequence of course requirements over the length of a program. The horizontal axis represents the level (year of study) at which a requirement is offered, and the vertical axis represents the average number of credits across programs. It seems that most programs are heavy on mathematics courses in the first two years, after which they are dominated by statistics courses. Computer science courses are almost exclusively taken in the first year, and the same is true for courses from other disciplines. It is interesting to note that there is fewer than a single statistics course offered, on average, in the first year of the statistics programs.

 

 

 

 

 

 

 

 

 

 

The offering discipline provides a relatively rough idea of what is covered in a course, as certain material can be offered under various course codes. For this reason, we created a new variable to describe the contents of a course. We used six descriptive topic categories and an extra one for everything else, and we tagged each course requirement with one or more topic categories, based on its calendar description. The six topic categories we used are presented below, together with the five most frequent terms in their calendar descriptions. Our assignment is necessarily subjective, but the plot gives a better sense of what each category represents.

 

 

 

 

 

 

 

 

 

 

 

 

 

We now present the results of our course categorization. The barplot on the left shows the breakdown of course credits by topic category and discipline. All topics are almost exclusively offered by a single discipline and the most interesting one is Probability. Although Probability Theory is a branch of (Applied) Mathematics, it is primarily taught under a Statistics course code. This brings the average number of mathematics-related credits on par with statistics, at around 11 credits. The barplot on the right presents the number of credits per topic category, broken down by type. Mathematics and Probability Theory consist of mostly core requirements, whereas Statistical Practice is the least developed of the Statistics topics.

 

 

 

 

 

 

 

 

Summary and Conclusions

Statistics is experiencing a period of rapid growth, with university enrolments increasing at double-digit rates. The majority of statisticians in Canada come from BSc programs, which has important implications for how and what we teach our undergrads. Ontario has a higher proportion of statistics students compared to other provinces, but the gender ratio in the discipline is balanced. In terms of undergraduate programs, most curricula have extensive Mathematical components and a relatively theoretical focus. There are typically few courses in Statistics during the first two years of study, and after that there is more emphasis on statistical methodology and theory.

Given our results, and the role that undergraduate education plays in preparing students for the statistical profession, we make three broad recommendations for undergraduate curricula:

  1. Offer more statistics courses. Statistical topics, rather than mathematical ones, should constitute the bulk of the curriculum.
  2. Offer statistics courses early on. Students should be introduced to the main concepts and uses of Statistics from the start, and they should continue to refine them throughout their studies.
  3. Place more emphasis on Statistical practice. Students should be given multiple opportunities to apply Statistics in real settings, and develop relevant communication, collaboration and computing skills.

 

Acknowledgements

We would like to thank Olivia Rennie for her excellent research support in collecting program data.

  1. http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=5017

  2. For enrolments: https://open.canada.ca/data/en/dataset/f4ee8c35-ac77-4b12-ace8-b6d8e0908eb1, and graduates: https://open.canada.ca/data/en/dataset/e7150d2d-63ee-42af-ae00-9914cf69a496

  3. https://www.statcan.gc.ca/eng/rtra/rtra

  4. http://www.statcan.gc.ca/eng/subjects/standard/cip/2011/index

  5. https://github.com/damouras/SoSE_liaison

  6. Values for 2015 are rounded to a hundred, due to RTRA’s new controlled rounding system.

  7. https://ncsesdata.nsf.gov/webcaspar/index.jsp?subHeader=WebCASPARHome

 

Sotirios Damouras & Sohee Kang

Monday, October 1, 2018

Liaison Newsletter: