Large text and social network data sets are increasingly of interest. I will discuss recent work on probabilistic latent variable models for such data.
We discuss the Elias-Fano encoding for index compression, and a new partitioning technique to improve its space usage without affecting performance.
Giovanni Stilo presents SAX, an efficient technique for mining events of social interest, tested and evaluated on one year of Twitter's stream.
We explore to what extent Airbnb stays serve as substitutes for hotel stays, and what the impact is on the bottom line of affected hotels.
I will present a large-scale quantitative study of information overload and evaluate its impact on information dissemination throughout Twitter.