De-anonymizing social networks and mobility traces
Abstract
Users of social applications and services face potentially serious privacy threats. In this work, we present a novel, robust, and effective de-anonymization attack on mobility trace data and social network data. First, we design a Unified Similarity (US) measurement that takes into account the local and global structural characteristics of the data, information obtained from auxiliary data, and knowledge inherited from ongoing de-anonymization results. By analyzing this measurement on real datasets, we find that some datasets can potentially be de-anonymized accurately, while others can be de-anonymized at a coarse granularity. Utilizing this property, we present a US-based De-Anonymization (DA) framework, which iteratively de-anonymizes data with an accuracy guarantee. Then, to de-anonymize large-scale data without knowledge of the overlap size between the anonymized data and the auxiliary data, we generalize DA to an Adaptive De-Anonymization (ADA) framework. By strategically working on two core matching subgraphs, ADA achieves high de-anonymization accuracy and reduces computational overhead. Finally, we evaluate the presented de-anonymization attack on three well-known mobility traces (St. Andrews, Infocom06, and Smallblue) and three social network datasets (ArnetMiner, Google+, and Facebook). The experimental results demonstrate that the presented de-anonymization framework is very effective and robust to noise.
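To make the idea of similarity-driven, iterative de-anonymization concrete, the following is a minimal sketch and not the US measurement or DA framework defined in this work. It assumes a toy score that averages a local term (degree ratio) and a global term (agreement of hop distances to already-matched node pairs), and it greedily adds the best-scoring pair to the mapping so that later scores inherit earlier results. All function names, the equal weights, and the toy graphs are hypothetical choices made for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ratio(a, b):
    """Similarity of two non-negative numbers in [0, 1]; 1.0 when equal."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def toy_similarity(anon, aux, mapping, u, v, w_local=0.5, w_global=0.5):
    """Illustrative score only: weighted sum of a local and a global term.

    `mapping` holds anonymized-node -> auxiliary-node pairs matched so far
    (initial seeds plus earlier matches). The weights are assumptions.
    """
    # Local term: how closely the two degrees agree.
    local = ratio(len(anon[u]), len(aux[v]))
    # Global term: agreement of hop distances to each matched pair.
    du, dv = bfs_distances(anon, u), bfs_distances(aux, v)
    scores = [
        ratio(du[s], dv[t]) if s in du and t in dv else 0.0
        for s, t in mapping.items()
    ]
    global_term = sum(scores) / len(scores) if scores else 0.0
    return w_local * local + w_global * global_term


def greedy_deanonymize(anon, aux, seeds, threshold=0.0):
    """Repeatedly match the best-scoring unmatched pair until none qualifies."""
    mapping = dict(seeds)
    while True:
        used = set(mapping.values())
        best = None
        for u in anon:
            if u in mapping:
                continue
            for v in aux:
                if v in used:
                    continue
                score = toy_similarity(anon, aux, mapping, u, v)
                if best is None or score > best[0]:
                    best = (score, u, v)
        if best is None or best[0] < threshold:
            return mapping
        mapping[best[1]] = best[2]


# Toy data: the same 4-node path graph under two labelings, one known seed.
anon_graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
aux_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(greedy_deanonymize(anon_graph, aux_graph, {1: "a"}))
# prints {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
```

The sketch rescans all unmatched pairs and reruns breadth-first search for every score, which is only workable on toy graphs. That cost is one reason a framework for large-scale data would restrict attention to smaller candidate subgraphs, as ADA is described as doing.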