Relation Extraction using Multi-Encoder LSTM Network on Noisy Alignments

Feb 1, 2018

Relation extraction techniques are used to find potential relational facts in textual content. Relation extraction systems require a large amount of training data to model the semantics of sentences and identify relations. Distant supervision, often used to construct training data for relation extraction, produces noisy alignments that can hurt the performance of relation extraction systems. To address this, we propose a simple yet effective technique to automatically compute confidence levels of alignments. We compute the confidence values of automatically labeled content using co-occurrence statistics of relations and dependency patterns of aligned sentences. We then propose a novel multi-encoder bidirectional Long Short-Term Memory (LSTM) model to identify relations in a given sentence. We use a different feature (words, part-of-speech (POS) tags, or dependency paths) in each encoder and concatenate the hidden states of all the encoders to predict the relations. Our experiments show that a multi-encoder network can handle different features together to predict relations more accurately (~9% improvement over a single-encoder model). We also conduct visualization experiments to show that our model learns intermediate representations effectively to identify relations in sentences.
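
The abstract does not give the confidence formula, so the following is only a minimal sketch of one plausible reading: the confidence of an aligned (relation, dependency pattern) pair is estimated as the fraction of sentences with that dependency pattern that were labeled with that relation. The function name `alignment_confidence`, the relation labels, and the pattern strings are all hypothetical and chosen for illustration; the paper's actual statistic may differ.

```python
from collections import Counter


def alignment_confidence(alignments):
    """Estimate a confidence score for each (relation, pattern) pair.

    `alignments` is a list of (relation, dependency_pattern) tuples, one per
    distantly supervised sentence. The score is the co-occurrence count of the
    pair divided by the total count of the pattern. This is an assumed
    formula, not necessarily the one used in the paper.
    """
    pair_counts = Counter(alignments)
    pattern_counts = Counter(pattern for _, pattern in alignments)
    return {
        (relation, pattern): count / pattern_counts[pattern]
        for (relation, pattern), count in pair_counts.items()
    }


# Hypothetical toy alignments: one pattern is mostly labeled "born_in",
# with a single noisy "lives_in" label.
BORN_PATTERN = "nsubjpass<-born->nmod:in"
LIVES_PATTERN = "nsubj<-lives->nmod:in"
toy_alignments = [
    ("born_in", BORN_PATTERN),
    ("born_in", BORN_PATTERN),
    ("born_in", BORN_PATTERN),
    ("lives_in", BORN_PATTERN),
    ("lives_in", LIVES_PATTERN),
]

confidence = alignment_confidence(toy_alignments)
for (relation, pattern), score in sorted(confidence.items()):
    print(f"{relation:10s} {pattern:28s} {score:.2f}")
```

Under this assumed formula, the noisy pairing of `lives_in` with the "born" pattern receives a lower score than the dominant `born_in` pairing, so it could be down-weighted or filtered before training.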

  • IEEE International Conference on Semantic Computing (IEEE ICSC 2018)
  • Conference/Workshop Paper