Security Questions: A Large Crowdsourced Dataset and Analysis of Bias


Jennifer Golbeck and Simon Li, University of Maryland


Security questions are a common fallback authentication mechanism across the internet, from banking to e-commerce to social media. Previous studies have investigated issues with those questions, with a marked focus on their susceptibility to attack. Indeed, some questions have a small set of possible answers, and others are easily guessed by people who know, or know about, the account owner; both weaknesses make them less secure. However, there are many other important avenues of study regarding security questions: Is there bias inherent in some of these questions? Which are practically easier or more difficult to remember? How does this vary based on demographics? Supporting this type of usable security research requires a dataset of security questions. We have created a public, actively updated dataset of questions with their sources. We also demonstrate how such a dataset can be useful by including an initial analysis of bias, which shows that a large percentage of questions are significantly more difficult for people in lower income groups to answer. This highlights the need for future work in collecting, analyzing, and designing security questions if they are to be used for fallback authentication.

    author = {Golbeck, Jennifer and Li, Simon},
    title = {{Security Questions: A Large Crowdsourced Dataset and Analysis of Bias}},
    booktitle = {Who Are You?! Adventures in Authentication Workshop},
    year = {2020},
    series = {WAY~'20},
    pages = {1--5},
    address = {Virtual Conference},
    month = aug,
    publisher = {}
} % No publisher