Spam or Ham? Characterizing and Detecting Fraudulent "Not Spam" Reports in Web Mail Systems
Web mail providers rely on users to “vote” to quickly and collaboratively identify spam messages. Unfortunately, spammers have begun to use large collections of compromised accounts not only to send spam, but also to vote “not spam” on many spam emails in an attempt to thwart collaborative filtering. We call this practice a vote gaming attack. This attack confuses spam filters, since it causes spam messages to be mislabeled as legitimate; thus, spammer IP addresses can continue sending spam for longer. In this paper, we introduce the vote gaming attack and study the extent of these attacks in practice, using four months of email voting data from a large Web mail provider. We develop a model for vote gaming attacks, explain why existing detection mechanisms cannot detect them, and develop new, efficient detection methods. Our empirical analysis reveals that the bots that perform fraudulent voting differ from those that send spam. We use this insight to develop a clustering technique that identifies bots that engage in vote gaming attacks. Our method detects tens of thousands of previously undetected fraudulent voters with only a 0.17% false positive rate, significantly outperforming existing clustering methods used to detect bots that send spam from compromised Web mail accounts.