Influence modeling in behavioral data
MetadataShow full item record
Understanding influence in behavioral data has become increasingly important in analyzing the cause and effect of human behaviors under various scenarios. Influence modeling enables us to learn not only how human behaviors drive the diffusion of memes spread in different kinds of networks, but also the chain reactions evolve in the sequential behaviors of people. In this thesis, I propose to investigate into appropriate probabilistic models for efficiently and effectively modeling influence, and the applications and extensions of the proposed models to analyze behavioral data in computational sustainability and information search. One fundamental problem in influence modeling is the learning of the degree of influence between individuals, which we called social infectivity. In the first part of this work, we study how to efficient and effective learn social infectivity in diffusion phenomenon in social networks and other applications. We replace the pairwise infectivity in the multidimensional Hawkes processes with linear combinations of those time-varying features, and optimize the associated coefficients with lasso regularization on coefficients. In the second part of this work, we investigate the modeling of influence between marked events in the application of energy consumption, which tracks the diffusion of mixed daily routines of household members. Specifically, we leverage temporal and energy consumption information recorded by smart meters in households for influence modeling, through a novel probabilistic model that combines marked point processes with topic models. The learned influence is supposed to reveal the sequential appliance usage pattern of household members, and thereby helps address the problem of energy disaggregation. In the third part of this work, we investigate a complex influence modeling scenario which requires simultaneous learning of both infectivity and influence existence. Specifically, we study the modeling of influence in search behaviors, where the influence tracks the diffusion of mixed search intents of search engine users in information search. We leverage temporal and textual information in query logs for influence modeling, through a novel probabilistic model that combines point processes with topic models. The learned influence is supposed to link queries that serve for the same formation need, and thereby helps address the problem of search task identification. The modeling of influence with the Markov property also help us to understand the chain reaction in the interaction of search engine users with query auto-completion (QAC) engine within each query session. The fourth part of this work studies how a user's present interaction with a QAC engine influences his/her interaction in the next step. We propose a novel probabilistic model based on Markov processes, which leverage such influence in the prediction of users' click choices of suggested queries of QAC engines, and accordingly improve the suggestions to better satisfy users' search intents. In the fifth part of this work, we study the mutual influence between users' behaviors on query auto-completion (QAC) logs and normal click logs across different query sessions. We propose a probabilistic model to explore the correlation between user' behavior patterns on QAC and click logs, and expect to capture the mutual influence between users' behaviors in QAC and click sessions.