Efficient inference algorithms for network activities
Tran, Long Quoc
MetadataShow full item record
The real social network and associated communities are often hidden under the declared friend or group lists in social networks. We usually observe the manifestation of these hidden networks and communities in the form of recurrent and time-stamped individuals' activities in the social network. The inference of relationship between users/nodes or groups of users/nodes could be further complicated when activities are interval-censored, that is, when one only observed the number of activities that occurred in certain time windows. The same phenomenon happens in the online advertisement world where the advertisers often offer a set of advertisement impressions and observe a set of conversions (i.e. product/service adoption). In this case, the advertisers desire to know which advertisements best appeal to the customers and most importantly, their rate of conversions. Inspired by these challenges, we investigated inference algorithms that efficiently recover user relationships in both cases: time-stamped data and interval-censored data. In case of time-stamped data, we proposed a novel algorithm called NetCodec, which relies on a Hawkes process that models the intertwine relationship between group participation and between-user influence. Using Bayesian variational principle and optimization techniques, NetCodec could infer both group participation and user influence simultaneously with iteration complexity being O((N+I)G), where N is the number of events, I is the number of users, and G is the number of groups. In case of interval-censored data, we proposed a Monte-Carlo EM inference algorithm where we iteratively impute the time-stamped events using a Poisson process that has intensity function approximates the underlying intensity function. We show that that proposed simulated approach delivers better inference performance than baseline methods. In the advertisement problem, we propose a Click-to-Conversion delay model that uses Hawkes processes to model the advertisement impressions and thinned Poisson processes to model the Click-to-Conversion mechanism. We then derive an efficient Maximum Likelihood Estimator which utilizes the Minorization-Maximization framework. We verify the model against real life online advertisement logs in comparison with recent conversion rate estimation methods. To facilitate reproducible research, we also developed an open-source software package that focuses on various Hawkes processes proposed in the above mentioned works and prior works. We provided efficient parallel (multi-core) implementations of the inference algorithms using the Bayesian variational inference framework. To further speed up these inference algorithms, we also explored distributed optimization techniques for convex optimization under the distributed data situation. We formulate this problem as a consensus-constrained optimization problem and solve it with the alternating direction method for multipliers (ADMM). It turns out that using bipartite graph as communication topology exhibits the fastest convergence.