
DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging

Publication

Aug 3, 2017

Abstract

In this work, we refer to the task of tagging news articles or blog posts with relevant tags from a predefined collection as document tagging. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. We propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec, two popular models for learning distributed representations of words and documents. In DocTag2Vec, we simultaneously learn the representations of words, documents, and tags in a joint vector space during training, and employ a simple k-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, DocTag2Vec deals directly with raw text instead of a provided feature vector, and in addition offers advantages such as learning tag representations and the ability to handle newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.
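The prediction step described in the abstract, a k-nearest neighbor search over tag vectors in the joint space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `predict_tags`, the toy three-dimensional vectors, and the choice of cosine similarity as the nearness measure are all assumptions made for illustration, and the document vector is assumed to have been inferred already.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_tags(doc_vec, tag_vecs, k=2):
    """Return the k tags whose vectors are most similar to doc_vec.

    tag_vecs maps a tag name to its embedding. Both the document and the
    tag embeddings are assumed to live in the same vector space.
    """
    ranked = sorted(
        tag_vecs.items(),
        key=lambda item: cosine(doc_vec, item[1]),
        reverse=True,
    )
    return [tag for tag, _ in ranked[:k]]


# Hypothetical toy embeddings; real ones would come from training.
tag_vecs = {
    "sports": [0.9, 0.1, 0.0],
    "finance": [0.0, 1.0, 0.1],
    "politics": [0.1, 0.2, 0.9],
}
doc_vec = [0.8, 0.3, 0.1]  # assumed embedding of an unseen document

print(predict_tags(doc_vec, tag_vecs, k=2))  # ['sports', 'finance']
```

Because tags are ordinary vectors in the same space, a newly created tag would only need an entry added to `tag_vecs` in this sketch; how its vector is obtained is outside the scope of the example.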


Venue:

The 2nd Workshop on Representation Learning for NLP 2017 (ACL Rep4NLP 2017)

Type:

Conference/Workshop Paper

Authors:

Sheng Chen
Akshay Soni
Yashar Mehdad

BibTeX

@inproceedings{
  author    = {Sheng Chen and Akshay Soni and Yashar Mehdad},
  title     = {DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging},
  booktitle = {Proceedings of The 2nd Workshop on Representation Learning for NLP 2017},
  year      = {2017}
}

