Introduction
This blog post explains the basics of the Relation Extraction (RE) task. If this is the first time you are engaging with this kind of task, this blog post is exactly for you!
What is Relation Extraction?
The Relation Extraction (RE) task aims to derive relationships between entities in unstructured text. Learning to extract semantic relations between entity pairs from text plays a vital role in many NLP tasks, such as question answering and information extraction. For example, let’s take a look at the sentence below:
“Donald Trump is the president of the United States”
In an RE task, we predict the semantic relationship between two tagged entities in a sentence and derive relation triples from the text, such as:
Entity 1: United States, Entity 2: Donald Trump, Relation: President
There are a few strategies for finding relations in a sentence, so let’s go over each of them!
Relation Extraction Strategies
Hand-built patterns
The first and simplest strategy is hand-built patterns, in which RE patterns are defined manually. This strategy uses a pool of preset templates to classify the relations in each sentence.
For example, we can use the pattern “X is the president of Y” to find new instances of the “president” relation, as in the sketch below. If we have a large corpus, we will probably find the relational triple (President, United States, Donald Trump) from the example above.
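To make this concrete, here is a minimal sketch of the hand-built pattern approach in Python. The regex and the `extract_president_relation` helper are made up for illustration, not taken from any library:

```python
import re

# One hand-built template for the "president" relation:
# "<Person> is the president of <Country>".
PATTERN = re.compile(
    r"(?P<person>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
    r" is the president of (?:the )?"
    r"(?P<country>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
)

def extract_president_relation(sentence):
    """Return a (Relation, Entity 1, Entity 2) triple if the pattern matches."""
    match = PATTERN.search(sentence)
    if match:
        return ("President", match.group("country"), match.group("person"))
    return None

print(extract_president_relation("Donald Trump is the president of the United States"))
# ('President', 'United States', 'Donald Trump')
```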
This approach can work for accumulating knowledge about new relations, given that we live in the Big Data era.
So, Why not?
The problem with this idea is that manual patterns do not generalize well. Because of the great diversity of linguistic expression, the same meaning can be expressed by many different sentences. If a sentence is written a bit differently from the pattern we defined, it will be hard to capture the relation.
For instance, the pattern above will not be able to capture the relation in the sentence below:
“The United States is led by Donald Trump”
Supervised learning
Another strategy is the supervised learning approach. The relations between entities are labeled, and a learning model is trained to classify them.
Given labels, the model can deal with all kinds of sentences, such as the positive examples below:
- “The presidency of Donald Trump began on January 20, 2017.”
- “The 45th and current president of the United States is Donald Trump. He was sworn in on January 20, 2017.”
It can also deal with negative examples: sentences which include “Donald Trump” and “United States” but do not express the “president” relation, such as:
- “Donald Trump recently came back to the United States.”
A fully supervised model will be able to learn linguistic patterns and classify the relations well.
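As an illustration, here is a minimal sketch of a supervised relation classifier built with scikit-learn on a toy dataset. The bag-of-words features are deliberately simplistic; real systems also encode the entity positions and richer linguistic features:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: 1 = the sentence expresses the "president" relation, 0 = it does not.
sentences = [
    "The presidency of Donald Trump began on January 20, 2017.",
    "The 45th and current president of the United States is Donald Trump.",
    "Donald Trump recently came back to the United States.",
    "Donald Trump gave a speech in the United States yesterday.",
]
labels = [1, 1, 0, 0]

# Bag-of-words features + logistic regression, trained on the labeled sentences.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(sentences, labels)

print(model.predict(["Donald Trump was sworn in as president of the United States."]))
```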
So, Why not?
Well, in this case we achieve better generalization than with the previous strategy. The problem is that the annotation step is expensive, and in most cases we will not be able to obtain a large amount of labeled data.
Distant supervision
The idea of this strategy is to base the learning step on prior knowledge that has already been labeled, in order to assume relations in new, unlabeled data from a large corpus. The advantage over the previous strategy is that you do not need many labeled examples, and you can extend your training data significantly.
For example, if we already have the relational triple (President, United States, Donald Trump) in our knowledge base (KB), we can search our corpus for sentences in which the entities “United States” and “Donald Trump” co-occur, assume that all of those sentences express the “President” relation, and then use them as training data for the learning model to identify new instances of the “President” relation, all without doing any manual labeling.
If we can accumulate a large KB of relational triples, we can use it to power question answering and other applications.
A classifier cannot be trained on positive instances alone. In order to apply the distant supervision paradigm, we will also need some negative instances — that is, entity pairs which do not belong to any known relation.
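Here is a minimal sketch of how distant supervision can label a corpus automatically. The toy KB, toy corpus, and the `distant_label` helper are hypothetical; note how the second sentence already shows a noisy positive slipping in:

```python
# Toy knowledge base: known entity pairs and their relation.
knowledge_base = {
    ("Donald Trump", "United States"): "President",
}

corpus = [
    "Donald Trump is the president of the United States.",
    "Donald Trump recently came back to the United States.",  # noisy positive!
    "Emmanuel Macron visited Berlin last week.",
]

def distant_label(sentence, kb):
    """Assume any sentence containing a known entity pair expresses its KB relation."""
    for (e1, e2), relation in kb.items():
        if e1 in sentence and e2 in sentence:
            return (sentence, e1, e2, relation)   # assumed positive instance
    return (sentence, None, None, "NO_RELATION")  # assumed negative instance

training_data = [distant_label(s, knowledge_base) for s in corpus]
for example in training_data:
    print(example)
```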
So, Why not?
Firstly, who said that your assumptions are reliable? Not all sentences containing related entities truly express the relation we have defined.
For example, not all of the sentences in which “United States” and “Donald Trump” co-occur express the “President” relation. This unreliable assumption adds a lot of noisy samples to our training data, which we will need to compensate for by adding many examples to make the noise negligible.
Secondly, we will still need to gather a substantial set of knowledge base samples for training.
End Notes
If you wish to continue reading about relation extraction models, check out my next blog post, Latest Relation Extraction Models.