ACL-08: HLT - Columbus, OH June 15-20, 2008
 
ACL-08: HLT - Annual Meeting of the Association for Computational Linguistics
ACL-08: HLT TUTORIAL
INTRODUCTION TO COMPUTATIONAL ADVERTISING



OVERVIEW

Web advertising is the primary driving force behind many Web activities, including Internet search and the publishing of online content by third-party providers. A new discipline, Computational Advertising, has recently emerged to study the process of advertising on the Internet from a variety of angles. A successful advertising campaign should be relevant to the user's immediate information need as well as, more generally, to the user's background and personalized interest profile; it should be economically worthwhile to the advertiser and the intermediaries (e.g., the search engine); and it should be aesthetically pleasing and not detrimental to the user experience.

The tutorial does not assume any prior knowledge of Web advertising, and will begin with a comprehensive background survey of the topic. We then focus on one important aspect of online advertising, namely, contextual relevance. In most cases the context of user actions is defined by a body of text, hence the ad matching problem lends itself to many NLP methods.

To a first approximation, the process of obtaining relevant ads can be reduced to conventional information retrieval, where one constructs a query that describes the user's context, and then executes this query against a large inverted index of ads. We show how to augment the standard information retrieval approach using query expansion and text classification techniques. We demonstrate how to apply a relevance feedback assumption to the Web search results retrieved by the query, which allows one to use the Web as a repository of relevant query-specific knowledge. We also go beyond conventional bag-of-words indexing, and construct additional features using a large external taxonomy and a lexicon of named entities obtained by analyzing the entire Web as a corpus.

Computational advertising poses numerous challenges and open research problems in text summarization, natural language generation, named entity extraction, human-computer interaction, and other areas. The last part of the tutorial will be devoted to recent research results as well as open problems, such as automatically identifying cases when no ads should be shown, handling geographic names, and context modeling for vertical portals.
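
The reduction of ad matching to information retrieval can be illustrated with a small Python sketch. It is not the system presented in the tutorial; it is a toy example under stated assumptions. The function names (build_index, retrieve, expand_query), the TF-IDF style scoring, the stopword list, the expansion weight of 0.5, and the sample ads and search snippets are all hypothetical and chosen only for illustration. The sketch builds an inverted index of ads, scores ads against a weighted query, and then expands the query with frequent terms from search result snippets, treating those results as relevant in the spirit of relevance feedback.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "to", "with", "your", "of", "in", "for"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(ads):
    """Build an inverted index: term -> {ad_id: term frequency}."""
    index = defaultdict(dict)
    for ad_id, text in ads.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][ad_id] = tf
    return index

def retrieve(index, n_ads, query_weights, k=3):
    """Rank ads by a simple TF-IDF score (an assumed scoring scheme)."""
    scores = defaultdict(float)
    for term, weight in query_weights.items():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any ad
        idf = math.log(1 + n_ads / len(postings))
        for ad_id, tf in postings.items():
            scores[ad_id] += weight * tf * idf
    # Sort by descending score; break ties by ad id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

def expand_query(query, snippets, n_terms=3, expansion_weight=0.5):
    """Add the most frequent new terms from search snippets to the query."""
    weights = {t: float(c) for t, c in Counter(tokenize(query)).items()}
    pool = Counter()
    for snippet in snippets:
        pool.update(tokenize(snippet))
    candidates = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))
    added = 0
    for term, _count in candidates:
        if added == n_terms:
            break
        if term in weights or term in STOPWORDS:
            continue
        weights[term] = expansion_weight  # expansion terms count less
        added += 1
    return weights

# Toy data (hypothetical): three ads and two Web search result snippets.
ADS = {
    "ad1": "cheap flights to paris book airfare today",
    "ad2": "running shoes sale marathon sneakers",
    "ad3": "paris hotels discount rooms near eiffel tower",
}
SNIPPETS = [
    "plan your paris trip: flights, hotels and eiffel tower tickets",
    "paris trip guide with cheap flights and hotels",
]

index = build_index(ADS)
baseline = retrieve(index, len(ADS), {"paris": 1.0, "trip": 1.0})
expanded = retrieve(index, len(ADS), expand_query("paris trip", SNIPPETS))
print("baseline:", baseline)
print("expanded:", expanded)
```

With the original two-term query, the two Paris ads receive identical scores because only "paris" occurs in the index. After expansion, the terms borrowed from the snippets also match the ads, so the ranking can separate them. A real system would replace each of these components with far more robust ones.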


Topics

  • Introduction
  • Advertising on the Web
    • The evolution of Web advertising
    • Advertese (introduction of terminology)
    • Main scenarios of online advertising
      • Sponsored search
      • Content match
      • Exact match vs. broad match
    • The economics of Web advertising
  • Main technical challenges for NLP and IR
  • Bibliography survey
  • IR modeling
    • Ads as information supply and reduction to search
    • A unified approach to Web advertising
    • Using search results as external knowledge
    • Text classification
    • Named entities
  • The research frontier
    • Text summarization / just-in-time advertising
    • When not to advertise / ad spam
    • Location awareness / geo-targeting
    • Context modeling
  • Discussion

Instructors

Affiliation: Yahoo! Research, Computational Advertising and Search Technology Group


Short Bios

    Evgeniy Gabrilovich is a Senior Research Scientist at Yahoo! Research. His research interests include information retrieval, machine learning, and computational linguistics. He serves on the program committees of ACL-08:HLT, AAAI '08, JCDL '08, CIKM '08 and WWW '08. In the past he served on the program committees of AAAI, EMNLP-CoNLL, and COLING-ACL, served as a mentor at SIGIR '07, and reviewed papers for ACM TOIT, IP&M, JNLE, CACM, AAAI, AAMAS, WWW and CIKM. Evgeniy earned his MSc and PhD degrees in Computer Science from the Technion - Israel Institute of Technology.

    Vanja Josifovski is a Principal Research Scientist at Yahoo! Research, where he works on search and advertisement technologies for the Internet. He is currently exploring designs for next-generation ad placement platforms for contextual and search advertising. Previously, Vanja was a Research Staff Member at the IBM Almaden Research Center, where he worked on several projects in database runtime and optimization, federated databases, and enterprise. He earned his MSc degree from the University of Florida at Gainesville, and his PhD from Linköping University in Sweden. Vanja has published over thirty peer-reviewed papers, authored around 20 patent applications, and served on the program committees of WWW, SIGIR, ICDE, VLDB and other major conferences in the database, information retrieval, and search areas.

    Bo Pang is a Research Scientist at Yahoo! Research. Her primary research interests are in natural language processing, machine learning, and information retrieval. She obtained her PhD in Computer Science from Cornell University, where she worked on automatic analysis of sentiment in text and paraphrase extraction and generation in the context of machine translation. She has served on the program committees of ACL, HLT-NAACL, EMNLP, and AAAI, and reviewed for journals including ACM TOIS, JMLR, JAIR, Computer Speech and Language, and Computational Linguistics.