Cervical Cancer


Data Source: 

Toronto Sunnybrook Health Science Center (TSRCC)


Dr. Al Covens, Department of Obstetrics and Gynecology, University of Toronto, Ontario, Canada, has provided this case study in cooperation with Edmee Franssen, BSc, MSc, Biostatistician, and Milena Kurtinecz, BSc, Statistical Database Coordinator.


Radical hysterectomy (an operation in which the uterus is removed) is the customary treatment for early-stage cervical cancer. Although surgery and radiation therapy produce equivalent cure rates, surgery is often selected for younger, healthier patients because it offers a shorter treatment course, an opportunity for ovarian preservation, and better post-treatment vaginal function. The cure rate associated with radical surgery (approximately 80%) has not changed appreciably over the last three decades. It has therefore been of interest to identify clinical and pathological factors that predict an increased risk of recurrence following surgery. These include tumor size, cell type, grade, depth of invasion, and lymph node status.

Data Description

The study, conducted in Toronto, uses a prospective data collection design. The doctor's interest is in determining which attributes predict survival (i.e., no relapse of disease). If a relapse occurs, it is expected to occur within the first two years following surgery. The overall relapse rate is approximately 20%; for stage I cancer, the relapse rate is less than 5%.

The data (Excel file; space-delimited data file) document the cases of 905 cervical cancer patients, of whom only the 871 patients with a recorded last follow-up are considered. A patient enters the study on her surgery date, which is also taken as her diagnosis date, and is observed for an unspecified period of time or until her first relapse. The recorded observations range roughly from 1984 to the present; the exact time range can be obtained from the data itself.

In addition to determining which attributes predict survival, there is a need to classify patients by likelihood of relapse. The classification can use 3 or 4 groups, as follows:

Classification 1: "Low relapse", "Moderate relapse", "High relapse"
Classification 2: "No relapse", "Low relapse", "Moderate relapse", "High relapse"
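
As a rough illustration of how either scheme could be applied once a per-patient relapse risk has been estimated, the sketch below maps a numeric risk to one of the group labels above. The function name `assign_group` and the cut-points in the usage example are hypothetical; the case study does not specify how the groups should be defined, and choosing the cut-points is part of the analysis.

```python
from bisect import bisect_right

# Group labels taken from the two classification schemes above.
CLASSIFICATION_1 = ("Low relapse", "Moderate relapse", "High relapse")
CLASSIFICATION_2 = ("No relapse", "Low relapse", "Moderate relapse", "High relapse")


def assign_group(risk, cutpoints, labels):
    """Map an estimated relapse risk to a group label.

    `cutpoints` must be sorted in increasing order and contain exactly
    one fewer value than `labels`. A risk equal to a cut-point is placed
    in the higher group.
    """
    if len(cutpoints) != len(labels) - 1:
        raise ValueError("need exactly len(labels) - 1 cut-points")
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cut-points must be sorted in increasing order")
    return labels[bisect_right(cutpoints, risk)]


# Hypothetical cut-points, chosen only for illustration.
example_cutpoints = [0.05, 0.20]
print(assign_group(0.03, example_cutpoints, CLASSIFICATION_1))
print(assign_group(0.35, example_cutpoints, CLASSIFICATION_1))
```

The same function handles the four-group scheme when it is given three cut-points and `CLASSIFICATION_2`.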


Research Question: 

The following are two goals we hope to achieve with these data:

  1. Determining which of the attributes, listed in the table, predict the event of no relapse of the disease;
  2. Classifying patients according to individual risk of relapse.
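
One common starting point for both goals is to estimate relapse-free survival and compare it across levels of an attribute. The sketch below is a minimal Kaplan-Meier estimator in plain Python. It is offered as one possible approach, not as the method prescribed by the case study, and the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the relapse-free survival function.

    times  -- follow-up time for each patient (e.g. days since surgery)
    events -- 1 if the patient relapsed at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct relapse time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        relapses = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            relapses += pairs[i][1]
            leaving += 1
            i += 1
        if relapses > 0:
            survival *= 1 - relapses / n_at_risk
            curve.append((t, survival))
        # Relapsed and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy data for illustration only; these are not values from the study.
toy_times = [5, 8, 8, 12, 15]
toy_events = [1, 0, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Running the estimator separately within each level of a categorical attribute (for example `PELLYMPH` negative versus positive) gives a first visual check of whether that attribute separates patients by relapse risk.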



The attributes (variables) in this study are:



MRNO Patient number.
SURGDAT Surgery date (diagnosis date).
ADJ_RAD Categorical variable: adjuvant radiation therapy (given only when the doctor deems the parameters severe enough)
  0 – patient did NOT receive radiation therapy;
  1 – patient received radiation therapy.
AGE_1 Age of patient at time of diagnosis.
CLS_1 Categorical variable: capillary lymphatic space involvement (prognostic factor)
  0 – negative;
  1, 2 – positive.
DIS_STA Categorical variable: disease status
  0 – no evidence of disease;
  1 – alive with disease;
  2 – dead of disease;
  3 – dead of complications (disease present);
  4 – dead of complications (disease absent);
  5 – dead of unrelated causes.
GRAD_1 Categorical variable: cell differentiation
  1 – better;
  2 – moderate;
  3 – worst;
  0 – missing value.
HISTOLOG Categorical variable with values ranging from 0 to 6.
MARGINS Categorical variable: disease left after surgery
  0 – clear;
  1 – para-vaginal area;
  2 – vaginal area;
  3 – both.
MAXDEPT Continuous variable: depth of tumor (mm); a value of 0 indicates that the depth was too small for the device to measure.
PELLYMPH Categorical variable:
  0 – negative;
  1 – positive.
RECURRN1 Date of recurrence of disease (left blank if there was no recurrence).
SIZE_1 Size of tumor (mm) upon diagnosis.
FU_DATE Last follow-up date.
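
The date fields and coded values above can be turned into an analysis-ready outcome. The sketch below derives a time-to-event pair from `SURGDAT`, `RECURRN1` and `FU_DATE`, and recodes `GRAD_1 = 0` as missing. It assumes each record is a dictionary keyed by the variable names and that dates are ISO strings (`YYYY-MM-DD`); the actual date format in the data files is not stated here and may differ. The helper names and the `MAXDEPT_UNMEASURABLE` flag are hypothetical.

```python
from datetime import date


def parse_date(text):
    """Parse an ISO date string; blank or missing values become None."""
    text = (text or "").strip()
    return date.fromisoformat(text) if text else None


def derive_outcome(record):
    """Return (days_from_surgery, relapsed) or None if the record is unusable.

    Patients without a last follow-up date are excluded, as in the study.
    Relapsed patients are timed to their first relapse; all others are
    censored at the last follow-up date.
    """
    surgery = parse_date(record.get("SURGDAT"))
    relapse = parse_date(record.get("RECURRN1"))
    last_follow_up = parse_date(record.get("FU_DATE"))
    if surgery is None or last_follow_up is None:
        return None
    if relapse is not None:
        return (relapse - surgery).days, 1
    return (last_follow_up - surgery).days, 0


def recode_coded_zeros(record):
    """Return a copy with the special zero codes made explicit."""
    cleaned = dict(record)
    if float(record["GRAD_1"]) == 0:
        cleaned["GRAD_1"] = None  # 0 is the code for a missing grade
    # A depth of 0 means "too small to measure", not "missing".
    cleaned["MAXDEPT_UNMEASURABLE"] = float(record["MAXDEPT"]) == 0
    return cleaned


# Made-up record for illustration only.
example = {"SURGDAT": "1990-01-10", "RECURRN1": "", "FU_DATE": "1990-01-20",
           "GRAD_1": 0, "MAXDEPT": 0}
print(derive_outcome(example))
print(recode_coded_zeros(example)["GRAD_1"])
```

Keeping the unmeasurable-depth flag separate from the grade recode preserves the distinction the codebook draws between a missing grade and a depth below the measurement limit.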



Student Presentations
  1. McMaster University: Christine Calzonetti, Simo Goshev, Rongfang Gu, Shahidul Mohammad Islam, Amanda Lafontaine, Marcus Loreti, Maria Porco, William Volterman, Qihao Xie.
  2. University of Toronto: Eshetu Atenafu, Sandra Gardner, So-hee Kang, Anjela Tzontcheva.
  3. University of Calgary: Alberto Nettel Aguirre, Luz Palacious.
  4. University of Guelph: Baktiar Hasan, Mark Kane, Melanie Laframboise, Michael Maschio, Andy Quigley.
  5. York University: Sophia Lee, Noa Rozenblit, Sumanth Sharatchandran, Shirin Yazdanian.