Robust Tree-based Causal Inference for Complex Ad Effectiveness Analysis

Feb 3, 2015

As the online advertising industry has evolved into an age of diverse ad formats and delivery channels, users are exposed to complex ad treatments involving various ad characteristics. The diversity and generality of ad treatments call for accurate, causal measurement of ad effectiveness, i.e., how the ad treatment causes changes in outcomes, free of the confounding effect of user characteristics. Various causal inference approaches have been proposed to measure the causal effect of ad treatments. However, most existing causal inference methods focus on univariate, binary treatments and are not well suited to complex ad treatments. Moreover, to be practical in the data-rich online environment, the measurement needs to be highly general and efficient, a requirement that conventional causal inference approaches do not address. In this paper we propose a novel causal inference framework for assessing the impact of general advertising treatments. Our framework enables analysis of uni- or multi-dimensional ad treatments, where each dimension (ad treatment factor) may be discrete or continuous. We prove that our approach provides an unbiased estimate of ad effectiveness by controlling for the confounding effect of user characteristics. The framework is computationally efficient because it employs a tree structure that specifies the relationship between user characteristics and the corresponding ad treatment. This tree-based framework is robust to model misspecification and highly flexible, requiring minimal manual tuning. To demonstrate the efficacy of our approach, we apply it to two advertising campaigns. In the first campaign we evaluate the impact of different ad frequencies, and in the second we consider the synthetic ad effectiveness across TV and online platforms. Our framework successfully estimates the causal impact of ads with different frequencies in both campaigns. Moreover, it shows that ad frequency usually has a treatment effect cap, which naive estimation tends to over-estimate.

  • WSDM 2015
  • Conference/Workshop Paper