Large-scale 2014 World Cup outcome analysis based on Tumblr posts

Aug 24, 2014

With the 2014 FIFA World Cup kicking off on June 12th, billions of fans across the world have turned their attention toward host country Brazil to root for their teams. Soccer (or football, if you prefer) fans are loud; you need only remember the last World Cup's infamous vuvuzelas for a demonstration. But fans are not only loud in stadiums. They also make their voices heard across social media. And though you may assume these fans are just blowing their vuvuzelas into the social abyss, if you listen closely, you will discover a treasure trove of data, including an answer to the most important question of all: "Who will win?" In this paper we use Tumblr posts collected during the 4 months prior to the start of the World Cup to predict the outcome of every game. We describe the prediction algorithm and analyze its performance, including a comparison of the predictions of several competing methods.

  • Workshop on Large-Scale Sports Analytics at ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SportsKDD), New York City, USA, 2014.
  • Conference/Workshop Paper