Social computing for personalization and credible information mining using probabilistic graphical models
In this dissertation, we address challenging social computing problems in personalized recommender systems and social media information mining. We use probabilistic graphical models, both directed and undirected, to model a large number of observed and unobserved variables together with the dependency relationships among them, and we develop efficient algorithms that exploit the graph structure to solve these problems. In recommender systems, we propose probabilistic graphical models for Collaborative Filtering (CF) algorithms in various problem settings, and solve them using Belief Propagation (BP) algorithms that allow scalable and distributed implementations. First, user similarities are computed in factor graphs. Then, unknown ratings are predicted in Pairwise Markov Random Fields (PMRFs). Further, when online social networks of users are available, a Bayesian Network (BN) recommendation system is constructed from user relations to improve recommendations for cold-start users or users who do not have sufficient ratings. To preserve user privacy, a semi-distributed item-based CF system is developed, which employs semi-distributed BP for item similarity computation in factor graphs without disclosing ratings to the server or to other peer users. Finally, to protect CF recommender systems from shilling attacks, a factor graph is proposed to jointly detect colluding spammers, which significantly improves detection accuracy over classification algorithms based on a single user's rating patterns. In social media information mining, to detect false information and keep track of information credibility, we propose a generative probabilistic model that predicts the credibility of events in Twitter-like social media from streaming tweets. The proposed algorithm predicts credibility much faster than existing offline algorithms and updates its predictions online as new tweets are observed.
Further, to identify suspicious users who perform malevolent activities such as spamming and phishing, we propose a PMRF model for predicting the trustworthiness of social media users. By taking user relationships into account, the PMRF model improves prediction accuracy over existing algorithms that assess each user individually.
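
As a rough illustration of the general technique named above, the following minimal sketch runs sum-product loopy belief propagation on a small pairwise Markov random field with two states per user (trustworthy, suspicious). It is not the dissertation's model: the function name `loopy_bp`, the three-user graph, the prior values, and the compatibility matrix are all assumptions chosen only to show how evidence about one user can propagate to related users.

```python
import numpy as np

def loopy_bp(priors, edges, compat, n_iters=50, tol=1e-6):
    """Sum-product loopy BP on a pairwise MRF (hypothetical helper).

    priors: dict mapping node -> 1-D array of node potentials
    edges:  list of undirected (u, v) pairs
    compat: K x K symmetric edge potential shared by all edges
    Returns a dict mapping node -> normalized belief vector.
    """
    compat = np.asarray(compat, dtype=float)
    k = compat.shape[0]
    neighbors = {n: [] for n in priors}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    # messages[(u, v)] is the message sent from u to v, initially uniform.
    messages = {}
    for u, v in edges:
        messages[(u, v)] = np.full(k, 1.0 / k)
        messages[(v, u)] = np.full(k, 1.0 / k)
    for _ in range(n_iters):
        new_messages = {}
        max_change = 0.0
        for (u, v), old in messages.items():
            # Combine u's prior with all incoming messages except the one from v.
            prod = np.asarray(priors[u], dtype=float).copy()
            for w in neighbors[u]:
                if w != v:
                    prod = prod * messages[(w, u)]
            msg = compat.T @ prod  # marginalize over the states of u
            msg = msg / msg.sum()
            new_messages[(u, v)] = msg
            max_change = max(max_change, float(np.abs(msg - old).max()))
        messages = new_messages
        if max_change < tol:
            break
    beliefs = {}
    for n in priors:
        b = np.asarray(priors[n], dtype=float).copy()
        for w in neighbors[n]:
            b = b * messages[(w, n)]
        beliefs[n] = b / b.sum()
    return beliefs

# Assumed toy data: state 0 = trustworthy, state 1 = suspicious.
priors = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.5, 0.5]}
edges = [("a", "b"), ("b", "c")]
compat = [[0.8, 0.2], [0.2, 0.8]]  # assumed homophily: linked users tend to agree
beliefs = loopy_bp(priors, edges, compat)
```

In this toy chain, only user `a` has an informative prior, yet the beliefs for `b` and `c` shift toward the trustworthy state, with the effect weakening as the distance from `a` grows. Because the graph is a tree, the result is the exact marginal; on graphs with cycles, loopy BP is only an approximation.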