In a typical classification problem, human coders/annotators manually code a subset of the available observations, and these coded observations are used to train a classifier so that the remaining, uncoded observations can be classified automatically. Most classification methods assume that the human-assigned labels are reliable, which does not always hold in practice. This paper explores how double-coding can improve automatic classification in the presence of coding error. Four double-coding strategies are proposed and compared with single-coding. Our study shows that double-coding is preferable when coding error is non-negligible. We also find that disagreements between the two annotators are better resolved by a more expensive expert coder, or simply removed, than adjudicated by a third regular annotator.
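As an illustration only (not code from the paper), the sketch below simulates one double-coding strategy, dropping observations on which two error-prone annotators disagree, and compares it with single-coding under the same annotation budget. All names, error rates, and model choices here are assumptions for the sake of the example.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulate features and true labels (illustrative data, not from the study).
n = 5000
X = rng.normal(size=(n, 5))
y_true = (X @ np.array([1.0, -0.5, 0.8, 0.0, 0.3])
          + rng.normal(scale=0.5, size=n) > 0).astype(int)

# A subset is available for manual coding; the rest is classified automatically.
X_coded, X_uncoded, y_coded, y_uncoded = train_test_split(
    X, y_true, train_size=1000, random_state=0)

def noisy_code(y, error_rate, rng):
    """Flip each label with the given probability, mimicking annotator error."""
    flips = rng.random(len(y)) < error_rate
    return np.where(flips, 1 - y, y)

error_rate = 0.2  # assumed annotator error rate

# Single-coding: one annotator codes all 1000 observations.
y_single = noisy_code(y_coded, error_rate, rng)
clf_single = LogisticRegression().fit(X_coded, y_single)

# Double-coding with the same budget: two annotators each code the same 500
# observations; observations on which they disagree are removed.
X_half, y_half = X_coded[:500], y_coded[:500]
coder1 = noisy_code(y_half, error_rate, rng)
coder2 = noisy_code(y_half, error_rate, rng)
agree = coder1 == coder2
clf_double = LogisticRegression().fit(X_half[agree], coder1[agree])

print("single-coding accuracy:", clf_single.score(X_uncoded, y_uncoded))
print("double-coding (drop disagreements) accuracy:",
      clf_double.score(X_uncoded, y_uncoded))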
Additional Authors and Speakers
Matthias Schonlau
University of Waterloo
Language of Oral Presentation
English
Language of Visual Aids
English

Speaker

Name: Zhoushanyue He
Primary Affiliation: University of Waterloo