Unsupervised Bot Detection in a Likert-Type Personality Questionnaire

Administering Likert-type questionnaires to online crowdsourced samples is common in personality research, but doing so risks contaminating the data with responses from bots, i.e., malicious programs that generate random responses. For unsupervised bot detection, the current literature offers nonresponsivity indices (NRIs), summary statistics of how strongly each respondent violates the correlational structure expected of human responses. However, cut-off values for NRIs are not available, and no attempt has been made to algorithmically classify respondents as human vs. bot. In this work, we propose an EM algorithm in NRI space to detect bots. Based on the assumption that bot responses are exchangeable random vectors, the algorithm generates its own bot examples to facilitate classification of respondents. Our work emphasizes the use of visualizations to make the algorithm's unsupervised solution credible to users. A simulation study shows that the algorithm is promising in terms of classification accuracy.
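
To make the general idea concrete, the following is a minimal sketch of what an EM classifier in NRI space might look like. It is not the algorithm presented in this work. Everything in it is an assumption chosen for illustration: a single NRI (the person-total correlation), a two-component Gaussian mixture, a bot component estimated from synthetic uniform-random responses and then held fixed, a toy data-generating model, and the hypothetical function names `person_total_corr` and `detect_bots`.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def person_total_corr(responses, item_means):
    """Pearson correlation of each respondent's row with the item means.
    Used here as a single illustrative NRI (low values suggest random responding)."""
    rc = responses - responses.mean(axis=1, keepdims=True)
    mc = item_means - item_means.mean()
    return (rc @ mc) / (np.linalg.norm(rc, axis=1) * np.linalg.norm(mc))

def detect_bots(responses, n_categories=5, n_synthetic=2000, n_iter=100, seed=0):
    """Hypothetical two-component EM in a one-dimensional NRI space."""
    rng = np.random.default_rng(seed)
    item_means = responses.mean(axis=0)
    nri = person_total_corr(responses, item_means)

    # Assumption: bots answer uniformly at random, so bot examples can be simulated.
    synthetic = rng.integers(1, n_categories + 1,
                             size=(n_synthetic, responses.shape[1]))
    bot_nri = person_total_corr(synthetic, item_means)
    bot_mean, bot_sd = bot_nri.mean(), bot_nri.std()  # held fixed during EM

    # Crude starting values for the free parameters.
    p_bot = 0.5
    human_mean, human_sd = np.quantile(nri, 0.75), nri.std()

    for _ in range(n_iter):
        # E-step: posterior probability that each respondent is a bot.
        bot_like = p_bot * normal_pdf(nri, bot_mean, bot_sd)
        human_like = (1 - p_bot) * normal_pdf(nri, human_mean, human_sd)
        post_bot = bot_like / (bot_like + human_like)
        # M-step: update the mixing proportion and the human component only.
        w = 1 - post_bot
        p_bot = post_bot.mean()
        human_mean = np.sum(w * nri) / np.sum(w)
        human_sd = np.sqrt(np.sum(w * (nri - human_mean) ** 2) / np.sum(w))
    return post_bot, nri

# Toy simulation (assumed model): humans follow item locations plus noise,
# bots pick one of five categories uniformly at random.
rng = np.random.default_rng(1)
n_humans, n_bots, n_items = 400, 100, 30
item_locs = np.linspace(1.5, 4.5, n_items)
person_shift = rng.normal(0, 0.3, size=(n_humans, 1))
noise = rng.normal(0, 0.7, size=(n_humans, n_items))
humans = np.clip(np.rint(item_locs + person_shift + noise), 1, 5)
bots = rng.integers(1, 6, size=(n_bots, n_items))
responses = np.vstack([humans, bots])
is_bot = np.r_[np.zeros(n_humans, dtype=bool), np.ones(n_bots, dtype=bool)]

post_bot, nri = detect_bots(responses)
accuracy = np.mean((post_bot > 0.5) == is_bot)
print(f"estimated classification accuracy: {accuracy:.3f}")
```

Holding the bot component fixed is one simple way to exploit simulated bot examples: only the mixing proportion and the human component remain to be estimated, which keeps the mixture identifiable even without labels. A histogram of `nri` shaded by `post_bot` would be one way to inspect the solution visually.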

Session

Big Data

Date and Time