For a few dollars less: Identifying review pages sans human labels
Source:
NAACL (2009)
Abstract:
We address the problem of large-scale automatic detection of online
reviews without using any human labels. We propose an
efficient method that combines two basic ideas: building
a classifier from a large number of noisy examples and using the
structure of the website to enhance the performance of this
classifier. Experiments suggest that our method
is competitive against supervised learning methods that mandate
expensive human effort.
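
The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the noisy examples are obtained or how website structure is used. The sketch assumes, purely for illustration, that a page is noisily labeled as a review when its URL contains the word "review", that the classifier is a small naive Bayes model over page tokens, and that "structure" means averaging each page's score with the mean score of its host. The names `noisy_label`, `train_nb`, `smooth_by_site`, and the parameter `alpha` are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlparse


def noisy_label(url):
    # ASSUMPTION: a URL keyword heuristic stands in for the noisy label source.
    return 1 if "review" in url.lower() else 0


def train_nb(pages):
    """Train a naive Bayes scorer on (url, text) pairs using noisy labels only."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for url, text in pages:
        y = noisy_label(url)
        docs[y] += 1
        counts[y].update(text.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    totals = {y: sum(counts[y].values()) for y in (0, 1)}
    n_docs = sum(docs.values())

    def score(text):
        # Returns an estimate of P(review | text) with add-one smoothing.
        logp = {}
        for y in (0, 1):
            lp = math.log((docs[y] + 1) / (n_docs + 2))
            for tok in text.lower().split():
                if tok in vocab:  # tokens unseen in training are ignored
                    lp += math.log((counts[y][tok] + 1) / (totals[y] + len(vocab)))
            logp[y] = lp
        m = max(logp.values())
        e0, e1 = math.exp(logp[0] - m), math.exp(logp[1] - m)
        return e1 / (e0 + e1)

    return score


def smooth_by_site(pages, score, alpha=0.5):
    # ASSUMPTION: "website structure" is reduced to grouping pages by host
    # and blending each page's score with its host's mean score.
    raw = {url: score(text) for url, text in pages}
    by_host = defaultdict(list)
    for url, s in raw.items():
        by_host[urlparse(url).netloc].append(s)
    return {
        url: alpha * s
        + (1 - alpha) * sum(by_host[urlparse(url).netloc]) / len(by_host[urlparse(url).netloc])
        for url, s in raw.items()
    }


# Made-up pages on reserved example hosts, for illustration only.
pages = [
    ("http://a.example/review/1", "great camera battery rating stars"),
    ("http://a.example/review/2", "poor battery rating stars recommend"),
    ("http://a.example/item/3", "rating stars great lens"),
    ("http://b.example/about", "company contact address phone"),
    ("http://b.example/jobs", "company careers contact apply"),
]
score = train_nb(pages)
raw = {url: score(text) for url, text in pages}
smoothed = smooth_by_site(pages, score)
```

In this toy data, the page `http://a.example/item/3` lacks the URL keyword but sits on a host whose other pages score highly, so the host-level blend raises its score. That is the kind of correction site structure might provide over a classifier trained on noisy labels alone.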