Publication

Efficient and Accurate Lp-norm Multiple Kernel Learning

Source:

NIPS (2009)

Abstract:

Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability. Unfortunately, L1-norm MKL rarely outperforms trivial baselines in practical applications. To allow for robust kernel mixtures, we generalize MKL to arbitrary Lp-norms. We provide new insights into the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary p>1. Empirically, we demonstrate that the interleaved optimization strategies are much faster than the traditionally used wrapper approaches. Finally, we apply Lp-norm MKL to real-world problems from computational biology, showing that non-sparse MKL achieves accuracies that go beyond the state-of-the-art.
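As a rough illustration of the idea in this abstract, the sketch below forms a linear combination of kernel matrices whose weight vector is scaled to unit Lp-norm. It assumes the Lp-norm applies to the vector of non-negative kernel weights; the function names `lp_normalize` and `combine_kernels` are hypothetical, and the code does not reproduce the paper's interleaved optimization strategies.

```python
def lp_normalize(weights, p):
    """Scale non-negative kernel weights so that their Lp-norm equals 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    norm = sum(w ** p for w in weights) ** (1.0 / p)
    if norm == 0:
        raise ValueError("at least one weight must be positive")
    return [w / norm for w in weights]


def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square kernel matrices.

    Each kernel is a list of rows (a list of lists of floats).
    """
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Two toy 2x2 kernels (illustrative values only).
identity_kernel = [[1.0, 0.0], [0.0, 1.0]]
constant_kernel = [[1.0, 1.0], [1.0, 1.0]]

theta = lp_normalize([3.0, 4.0], p=2)
combined = combine_kernels([identity_kernel, constant_kernel], theta)
```

With p close to 1 the unit-norm constraint favors sparse weight vectors, while larger p spreads weight across kernels, which is the non-sparse regime the abstract refers to.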

Publication

Bidding for Representative Allocations for Display Advertising

Source:

WINE (2009)

Abstract:

Display advertising has traditionally been sold via guaranteed contracts: a guaranteed contract is a deal between a publisher and an advertiser to allocate a certain number of impressions over a certain period, for a pre-specified price per impression. However, as spot markets for display ads, such as the RightMedia Exchange, have grown in prominence, the advertisements to show on a given page are increasingly being chosen based on price, using an auction. As the number of participants in the exchange grows, the price of an impression becomes a signal of its value. This correlation between price and value means that a seller implementing the contract through bidding should offer the contract buyer a range of prices, and not just the cheapest impressions necessary to fulfill its demand. Implementing a contract using a range of prices is akin to creating a mutual fund of advertising impressions, and requires randomized bidding. We characterize what allocations can be implemented with randomized bidding, namely those where the desired share obtained at each price is a non-increasing function of price. In addition, we provide a full characterization of when a set of campaigns are compatible and how to implement them with randomized bidding strategies.
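The characterization stated in this abstract can be expressed as a simple check. The sketch below takes the abstract's condition at face value: an allocation is treated as implementable when the desired share at each price does not increase as the price rises. The function name `is_non_increasing_allocation` and the dictionary layout are assumptions made for illustration, not the paper's notation.

```python
def is_non_increasing_allocation(share_by_price):
    """Check whether the desired share never increases as price increases.

    share_by_price maps a price to the desired share of impressions
    at that price (hypothetical representation).
    """
    shares = [share for _, share in sorted(share_by_price.items())]
    return all(earlier >= later for earlier, later in zip(shares, shares[1:]))


# Illustrative values only: a larger share of cheap impressions,
# a smaller share of expensive ones.
balanced = {1.0: 0.5, 2.0: 0.3, 3.0: 0.3}

# A larger share at the higher price violates the condition.
top_heavy = {1.0: 0.2, 2.0: 0.4}

print(is_non_increasing_allocation(balanced))
print(is_non_increasing_allocation(top_heavy))
```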

Publication

Query-Sets: Using Implicit Feedback and Query Patterns to Organize Web Documents

Source:

17th International Conference on World Wide Web, ACM Press, Beijing, China (2008)

Publication

Dr. Searcher and Mr. Browser: a unified hyperlink-click graph

Source:

ACM 17th Conference on Information and Knowledge Management, ACM Press, Napa Valley, California (2008)

Publication

Website Privacy Preservation for Query Log Publishing

Source:

Proceedings of the First SIGKDD International Workshop on Privacy, Security, and Trust in KDD (PinKDD'07), Springer, Volume 4890 (2008)

Abstract:

In this paper we study privacy preservation for the publication of search engine query logs. We introduce a new privacy concern, "website privacy", as a special case of "business privacy". We define the possible adversaries who could be interested in disclosing website information and the vulnerabilities in the query log that they could exploit. We elaborate on anonymization techniques to protect website information, discuss different types of attacks that an adversary could use, and propose an anonymization strategy for one of these attacks. We then present a graph-based heuristic to validate the effectiveness of our anonymization method and perform an experimental evaluation of this approach. Our experimental results show that the query log can be appropriately anonymized against the specific attack, while retaining a significant volume of useful data.

Publication

Issues with Privacy Preservation in Query Log Mining

Source:

Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques, Chapman and Hall/CRC Press (2009)

Abstract:

In this chapter we present and analyze the current state of the art in query log privacy preservation. We focus on two complementary issues: the privacy of users that submit queries, and the privacy of websites featured in search results. We study vulnerabilities that arise in query log publishing, specifically in Web search engine logs, and discuss the effects that these have on the parties involved. Our analysis gives an overview of anonymization techniques that have been attempted and their weaknesses at preventing attacks on query log data. Furthermore, our research studies the implications for public data produced by query log data mining applications, and how it poses a risk of involuntary private data disclosure.

Publication

Origins of Homophily in an Evolving Social Network

Source:

American Journal of Sociology, Volume 115, Issue 2, p.405-450 (2009)

News

Yahoo! at ACM International Conference on Multimedia



Yahoo! Labs had a prominent presence at the 2009 ACM International Conference on Multimedia, held on October 19-23, 2009 in Beijing, China.

Publication

Dynamics in Network Interaction Games

Source:

23rd Intl. Symposium on Distributed Computing (DISC) (2009)
