Learning from the past: answering new questions with past answers
Source:
WWW (2012)
URL:
http://www2012.org/proceedings/proceedings/p759.pdf
Abstract:
Community-based Question Answering sites, such as Yahoo!
Answers or Baidu Zhidao, allow users to get answers to complex,
detailed and personal questions from other users. However,
since answering a question depends on the ability and
willingness of users to address the asker’s needs, a significant
fraction of the questions remain unanswered. We measured
that in Yahoo! Answers, this fraction represents 15% of all
incoming English questions. At the same time, we discovered
that around 25% of questions in certain categories are
recurrent, at least at the question-title level, over a period
of one year.
We attempt to reduce the rate of unanswered questions in
Yahoo! Answers by reusing the large repository of past resolved
questions, openly available on the site. More specifically,
we estimate the probability that certain new questions
can be satisfactorily answered by a best answer from
the past, using a statistical model specifically trained for
this task. We leverage concepts and methods from query-performance
prediction and natural language processing in
order to extract a wide range of features for our model. The
key challenge here is to achieve a level of quality similar to
the one provided by the best human answerers.
We evaluated our algorithm on offline data extracted from
Yahoo! Answers, but more interestingly, also on online data
by using three “live” answering robots that automatically
provide past answers to new questions when a certain degree
of confidence is reached. We report the success rate of these
robots in three active Yahoo! Answers categories in terms of
accuracy, coverage and askers’ satisfaction. This work
presents a first attempt, to the best of our knowledge, at
automatically answering questions of a social nature
by reusing high-quality past answers.
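
The confidence-gated reuse of past answers described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: a simple Jaccard overlap between question titles stands in for the trained statistical model, and the names `answer_from_past`, `past_qa` and the `threshold` value are hypothetical, chosen only for illustration.

```python
import re

def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-set overlap in [0, 1]; an assumed stand-in for a learned score."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def answer_from_past(new_title, past_qa, threshold=0.6):
    """Return (past_best_answer, score), or (None, score) if below threshold.

    past_qa is a list of (question_title, best_answer) pairs.
    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    best_score, best_answer = 0.0, None
    for past_title, past_answer in past_qa:
        score = jaccard(new_title, past_title)
        if score > best_score:
            best_score, best_answer = score, past_answer
    if best_score >= threshold:
        return best_answer, best_score
    # Not confident enough: stay silent rather than risk a poor answer.
    return None, best_score

# Toy repository of resolved questions (made-up data for illustration).
past_qa = [
    ("How do I remove a coffee stain from a carpet?",
     "Blot it, then use diluted vinegar."),
    ("What is the best way to learn guitar?",
     "Practice daily and start with simple chords."),
]

print(answer_from_past("How do I remove a coffee stain from carpet?", past_qa))
print(answer_from_past("Why is the sky blue?", past_qa))
```

Raising the threshold trades coverage (how many new questions receive an answer) for accuracy (how often the reused answer is appropriate), which is the trade-off the abstract reports on.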