The Science of Measuring Advertising Effectiveness

Feb 3, 2015

By Pengyuan Wang and Dawei Yin

In the online advertising market, it is crucial to provide advertisers with a reliable measurement of ad effectiveness to optimize ad campaign budgeting and planning. At Yahoo, one of the largest advertising technology providers in the industry, with capabilities across desktop, mobile, and smart TV and formats including video, native, and search ads, we invest a great deal of effort in measuring the effectiveness of our advertisers’ campaigns. But what is true ad effectiveness? And more importantly, how do we measure it?

The industry standard for measuring consumer “conversions” or “success actions” (as defined by advertisers for each campaign, e.g. a consumer obtaining an auto insurance quote or making a purchase) is to calculate the ratio of users who performed these success actions after viewing an ad to all users who viewed the ad. However, what marketers often leave out of the equation is that a major segment of the audience may take these actions regardless of exposure to ads, so this analysis may overestimate a campaign’s ad effectiveness. To address this issue, we developed a more nuanced approach that directly measures the causal effect of ads: how do the ads themselves cause more purchases or actions compared to no ad exposure?

Measuring such causal effects requires in-depth analysis of user characteristics (e.g., age, gender, interests). For example, assume that in an auto campaign most of the people who see the ad are female and most of those who don’t are male, as in Figure 1 below. If females generally have a higher success rate than males, the effectiveness of the campaign could be overestimated because of the confounding effect of user characteristics (in this case, gender): it might simply be that females are both more likely to be exposed to the campaign’s ads, for whatever reason, and more likely to perform a success action.
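This bias is easy to reproduce with toy numbers. The sketch below (synthetic figures, not campaign data) builds a world in which the ad has no causal effect at all, yet the industry-standard ratio still reports a positive lift, purely because the exposed group skews toward the higher-converting gender:

```python
# Toy illustration of how a confounded comparison inflates measured
# effectiveness; all numbers are synthetic, not campaign data.

# Per-gender success rates, identical with or without ad exposure:
# in this toy world the ad has no causal effect at all.
rate = {"F": 0.10, "M": 0.02}

# Exposed users skew female; non-exposed users skew male (as in Figure 1).
exposed = {"F": 800, "M": 200}
non_exposed = {"F": 200, "M": 800}

def success_rate(group):
    """Industry-standard ratio: converters divided by all viewers in the group."""
    conversions = sum(count * rate[g] for g, count in group.items())
    return conversions / sum(group.values())

naive_lift = success_rate(exposed) - success_rate(non_exposed)
print(f"naive lift: {naive_lift:.3f}")  # 0.048, despite a true ad effect of zero
```

The exposed group's rate (0.084) exceeds the non-exposed group's (0.036) only because of the gender imbalance, not because the ad worked.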
Therefore, the relationship between ad exposure and conversion is not causal unless such biases introduced by user characteristics are eliminated.

In a research paper published in the proceedings of last year’s ACM International Conference on Web Search and Data Mining (WSDM 2014), we presented an analysis that tackles this problem and measures the true ad effectiveness of a single ad campaign. The key idea is to re-weight the users such that, after re-weighting, the characteristics of users who were not exposed to ads match those of users who were. For example, in the auto campaign scenario above, there are fewer females than males in the non-exposed group. We therefore weight the females in the non-exposed group more heavily than the males, and as a result reach the balanced user characteristics shown in Figure 2 below. We then estimate the success rates of the weighted groups and calculate the lift (i.e., the difference between the success rates of the two groups), obtaining an unbiased estimate of the campaign’s causal effectiveness without the confounding effect of user characteristics.

Figure 1: Gender Distribution

Figure 2: Gender Distribution after Weighting

The problem of measuring the causal effect of ads can be further broken down to consider more complex ad treatments (i.e., whatever a given user sees of an ad campaign): for example, measuring the effectiveness of different ad frequencies (do more ad impressions cause more conversions?) or of treatments across ad channels (does exposure to both TV and desktop ads cause more conversions than TV or desktop alone?). In a follow-on research paper published at this week’s WSDM 2015 conference, we present an approach that can measure both kinds of treatments. The main idea is to group users with similar characteristics together, and then measure the ad effectiveness of the given treatment within each group.
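Before turning to those more complex treatments, here is a minimal sketch of the re-weighting idea on the toy gender example, assuming simple per-gender weights (real campaigns balance thousands of characteristics, and all numbers here are synthetic):

```python
# Synthetic illustration of re-weighting the non-exposed group so its
# gender mix matches the exposed group (Figure 2); numbers are made up,
# and the ad again has no true causal effect.

rate = {"F": 0.10, "M": 0.02}      # per-gender success rates (both groups)
exposed = {"F": 800, "M": 200}     # user counts by gender
non_exposed = {"F": 200, "M": 800}

n_exp = sum(exposed.values())
n_non = sum(non_exposed.values())

# Weight each non-exposed user by the ratio of gender shares, so the
# weighted non-exposed group mirrors the exposed group's distribution.
weight = {g: (exposed[g] / n_exp) / (non_exposed[g] / n_non) for g in rate}
# weight == {"F": 4.0, "M": 0.25}: non-exposed females count 4x, males 0.25x

def weighted_success_rate(group, w):
    converters = sum(count * w[g] * rate[g] for g, count in group.items())
    weighted_users = sum(count * w[g] for g, count in group.items())
    return converters / weighted_users

unit = {g: 1.0 for g in rate}  # exposed group keeps unit weights
lift = (weighted_success_rate(exposed, unit)
        - weighted_success_rate(non_exposed, weight))
print(f"re-weighted lift: {lift:.3f}")  # 0.000: the gender bias is removed
```

After re-weighting, both groups have the same effective gender mix, the estimated lift is zero, and the spurious 0.048 lift from the naive comparison disappears.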
We do this grouping using a tree structure (Figure 3) that is both flexible and computationally efficient. Within each leaf of the tree, the users are homogeneous in terms of the ad treatments they might potentially receive, given their user characteristics. We then compute the conversion rate within each leaf node, and the final success rate estimate, or conversion rate, is a weighted average over all the leaf nodes. In Figure 3, the tree splits the user base into two leaf nodes with homogeneous user characteristics (gender in this case). In practice, there are thousands of user characteristics, and the tree splits are controlled by cross-validation to prevent overfitting.

Figure 3: Illustration of the Tree-Based Grouping Approach

The two branches of methodology for measuring ad effectiveness described here, re-weighting users and grouping them according to the user characteristics relevant to a given ad treatment, are actively used to evaluate the ad campaign performance of Yahoo’s clients. Their application gives Yahoo’s advertisers a more complete assessment of the effectiveness of their campaigns. Perhaps even more crucially, these methods allow Yahoo’s clients to make better judgments about how to spend their money.

The approach of re-weighting users given their characteristics serves as the anchor of Yahoo’s proprietary approach to multi-touch attribution, a valuable tool at our clients’ disposal. Developed in a partnership between Yahoo Labs and Yahoo’s B2B Insights team, the analysis applies the principle of measuring the causal effect of advertising to the industry-wide problem of attributing credit across a wide range of media tactics (e.g., ad channels). For example, if a person sees multiple ads from an advertiser on his or her mobile phone and desktop, we can tell our clients with accuracy how much each view contributes, by percentage, to conversion.
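The leaf-level estimation behind the tree-based approach can be sketched with a two-leaf toy example, using gender as the only characteristic (a real tree has thousands of characteristics and cross-validated splits; all counts below are synthetic):

```python
# Two leaves (as in Figure 3), each holding (conversions, users) for
# treated (ad-exposed) and control users; all counts are synthetic,
# chosen so the ad has no true effect within either leaf.
leaves = {
    "F": {"treated": (90, 900), "control": (10, 100)},
    "M": {"treated": (2, 100), "control": (18, 900)},
}

def leaf_rate(leaf, arm):
    conversions, users = leaves[leaf][arm]
    return conversions / users

# Weight each leaf by its share of the total user base, then average the
# per-leaf lifts: the overall estimate is a weighted average over leaves.
leaf_size = {l: sum(leaves[l][arm][1] for arm in ("treated", "control"))
             for l in leaves}
total = sum(leaf_size.values())
stratified_lift = sum(
    (leaf_rate(l, "treated") - leaf_rate(l, "control")) * leaf_size[l] / total
    for l in leaves
)

# Pooled (unstratified) comparison for contrast:
def pooled_rate(arm):
    conversions = sum(leaves[l][arm][0] for l in leaves)
    users = sum(leaves[l][arm][1] for l in leaves)
    return conversions / users

naive_lift = pooled_rate("treated") - pooled_rate("control")
print(f"stratified lift: {stratified_lift:.3f}")  # 0.000
print(f"naive lift:      {naive_lift:.3f}")       # 0.064 (confounded)
```

Within each leaf the treated and control conversion rates are identical, so the leaf-weighted lift is zero, while pooling across leaves again manufactures a positive lift from the gender imbalance.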
Furthermore, our research measuring the incremental impact (lift) of an advertising campaign on user behavior has been implemented by Yahoo’s Ad Product Measurement team in the beta version of the Yahoo Brand Index. The Yahoo Brand Index is a scorecard of advertiser health; it measures the impact of our clients’ campaigns on various success factors to help them allocate ad spend for optimal impact across Yahoo’s various channels.

Figure 4: Yahoo Brand Index Trend

Figure 5: Campaign Lift Info

The measurements of ad effectiveness developed by the Advertising Sciences team at Yahoo Labs can be expanded upon and tailored to a client’s specific request; for example, our research has also considered the treatment of ad designs (do larger ads cause more conversions?). As we continue to expand and refine our research on true ad effectiveness, we make sure to translate our analyses into products that help advertisers determine how best to optimize their budgets to improve effectiveness and efficiency. Stay tuned for more advertising research insights coming soon.

Blog contributions by: Jeremy Kanterman, Director, Ad Effectiveness and Analytics; and Bhaskar Krishnan, Senior Product Manager, Yahoo Display Advertising Analytics. Research co-authors: Jimmy Yang and Yi Chang