Modeling, Learning, and Inference of High-Dimensional Asynchronous Event Data
The increasing availability of temporal-spatial events produced by natural and social systems provides new opportunities and challenges for effectively modeling the latent dynamics that govern these seemingly "random" data. In our work, we propose a unified probabilistic framework based on multivariate point processes to better predict "who will do what, by when, and where?" in the future. This framework comprises a systematic paradigm for modeling, learning, and inference on large-scale, asynchronous, high-dimensional event data. Within this common framework, we make contributions in the following three areas. Accurate Modeling: we first propose non-parametric and topic-modulated multivariate terminating point processes to capture continuous-time heterogeneous information diffusion. We then develop the low-rank Hawkes process to describe the recurrent temporal interactions among different types of entities. We also build a link between the recurrent neural network and the temporal point process to learn a general representation of the influence of the past event history. Finally, we establish a previously unexplored connection between Bayesian nonparametrics and temporal point processes to jointly model the temporal data and other types of additional information. Efficient Learning: we develop a robust structure learning algorithm via group lasso, which efficiently uncovers sparse, heterogeneous interdependencies among the dimensions, specified via vectorized parameters. We also propose an efficient nonnegative matrix rank minimization algorithm, which inherits the advantages of both proximal methods and conditional gradient methods to solve the matrix rank minimization problem under different constraints.
Finally, in the data streaming setting, we develop a Bayesian inference algorithm for inferring latent variables and updating the respective model parameters based on both temporal and textual information, which achieves almost constant processing time per data sample. Scalable Inference: another important aspect of our research is making future predictions by exploiting the learned models. Specifically, based on the terminating processes, we develop the first scalable influence estimation algorithm for continuous-time diffusion networks with provable performance guarantees. Based on the low-rank Hawkes processes, we develop the first time-sensitive recommendation algorithm, which can not only recommend the most relevant item at a given moment, but also predict a user's next return time to a designated service. Finally, based on the recurrent point processes, we derive an analytic solution for shaping the overall network activities of users. We show that our method can provide fine-grained control over user activities in a time-sensitive fashion.
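For readers unfamiliar with the Hawkes processes mentioned in the abstract, the following is a minimal illustrative sketch of the conditional intensity of a plain univariate Hawkes process with an exponential kernel. It is not the low-rank multivariate model developed in this work; the function name `hawkes_intensity`, the parameter values, and the event times are assumptions chosen purely for illustration.

```python
import math

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Intensity lambda(t) = mu + alpha * sum over past events of exp(-beta * (t - t_i)).

    mu is the base rate, alpha the jump added by each past event, and beta the
    decay rate of that excitation. All three defaults are illustrative assumptions.
    """
    return mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in history if ti < t)

# Hypothetical event times; each past event temporarily raises the rate of future events.
events = [1.0, 1.5, 4.0]
for t in (0.5, 1.6, 3.0, 4.1):
    print(f"t={t:.1f}  intensity={hawkes_intensity(t, events):.3f}")
```

Before any event has occurred, the intensity equals the base rate `mu`; shortly after a burst of events it is elevated and then decays back toward `mu`, which is the self-exciting, recurrent behavior that the abstract refers to.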