Introduction
In this blog post we will go over the coreference resolution task. It is one of the NLP tasks that belongs to discourse analysis, the area of research that studies how sentences combine into one long, meaningful text. So, let’s start!
What is a Coreference Resolution Task?
Coreference resolution is a clustering task in NLP that identifies all the expressions in a sentence, document, or corpus that refer to the same entity or event. An entity is typically a person, a location, or an organization.
In many natural language applications, such as question answering, automatic document summarization, or machine translation, the first step is to preprocess the text and identify references to entities before starting work on the main task.
To understand the coreference task better, let’s see an example:
“Emma said that she thinks that Nelson really likes to dance, because he goes to the dancing studio 4 times a week.“
Emma and she refer to the same entity, and Nelson and he refer to another; each entity forms its own cluster.
Mentions that have no other reference in the corpus are called singletons (a cluster that contains just one mention).
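To make this concrete, here is one simple way the output of a coreference system for the example sentence could be represented (the structure below is a toy illustration, not the format of any particular library):

```python
# Toy representation of the clusters for the example sentence above.
# Each cluster is a list of mention strings that refer to the same entity;
# a singleton is a cluster with exactly one mention.
clusters = [
    ["Emma", "she"],           # entity 1: Emma
    ["Nelson", "he"],          # entity 2: Nelson
    ["the dancing studio"],    # a singleton: mentioned only once
]

singletons = [c for c in clusters if len(c) == 1]
print(len(clusters), len(singletons))  # 3 1
```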
The coreference task is separated into two sub-tasks:
- Mention Detection
- Mention Clustering
1. Mention Detection
In this sub-task the main goal is to find all the candidate spans referring to entities. For example, in the sentence below, the mention detection step would color all the candidate spans in blue:
There are three kinds of mentions that will be detected in this step:
1. Pronouns:
A pronoun is a word that substitutes for a noun phrase and usually involves anaphora, where the meaning of the pronoun depends on an antecedent. For instance, ‘she’ is the pronoun in the sentence “Noa gave an amazing lecture when she was at the conference in Madrid last year.”
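Because pronouns form a small closed class, a first rough pass can simply look tokens up in a word list (a real system would use a part-of-speech tagger; the lexicon below is deliberately tiny and only illustrative):

```python
# A minimal pronoun detector using a closed-class lexicon.
# Real systems use a POS tagger; this word list is far from complete.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "him", "her", "them", "his", "hers", "its", "theirs"}

def find_pronouns(sentence):
    # Strip basic punctuation and lowercase before the lexicon lookup.
    tokens = [w.strip(".,!?;:").lower() for w in sentence.split()]
    return [t for t in tokens if t in PRONOUNS]

print(find_pronouns(
    "Noa gave an amazing lecture when she was at the conference "
    "in Madrid last year."))  # ['she']
```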
2. Named-entity recognition (NER):
An NER model locates entities in unstructured text and classifies them into pre-defined categories (such as person names, organizations, locations, products, etc.).
For example:
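As a minimal sketch of what NER produces, here is a toy gazetteer-based tagger. Real NER models are statistical; the tiny entity lists here are made up purely for the example:

```python
# A toy gazetteer-based NER: look up known names in fixed lists.
# Real NER models are statistical; these lists are illustrative only.
GAZETTEER = {
    "PERSON": {"Emma", "Nelson", "Noa"},
    "LOC": {"Madrid"},
    "ORG": {"Google"},
}

def tag_entities(sentence):
    entities = []
    for raw in sentence.split():
        word = raw.strip(".,!?;:")
        for label, names in GAZETTEER.items():
            if word in names:
                entities.append((word, label))
    return entities

print(tag_entities("Noa gave an amazing lecture at Madrid."))
# [('Noa', 'PERSON'), ('Madrid', 'LOC')]
```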
3. Noun-phrases:
A noun phrase is a group of words headed by a noun, together with its modifiers (e.g., ‘the,’ ‘a,’ ‘of them,’ ‘with her’).
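A simple way to extract noun phrases is to scan part-of-speech tags for a pattern like “optional determiner, adjectives, then nouns.” The sketch below works over hand-supplied tags; a real pipeline would get them from a POS tagger:

```python
# A minimal noun-phrase chunker over pre-tagged tokens: it collects an
# optional determiner and adjectives followed by one or more nouns.
def noun_phrases(tagged):
    phrases, current, has_noun = [], [], False
    for word, tag in tagged:
        if tag == "NOUN":
            current.append(word)
            has_noun = True
        elif tag in ("DET", "ADJ") and not has_noun:
            current.append(word)
        else:
            if has_noun:                      # close the finished phrase
                phrases.append(" ".join(current))
            # a DET/ADJ after a noun starts a fresh phrase
            current, has_noun = ([word] if tag in ("DET", "ADJ") else []), False
    if has_noun:
        phrases.append(" ".join(current))
    return phrases

tagged = [("The", "DET"), ("big", "ADJ"), ("fluffy", "ADJ"), ("cat", "NOUN"),
          ("sat", "VERB"), ("on", "ADP"), ("the", "DET"), ("mat", "NOUN")]
print(noun_phrases(tagged))  # ['The big fluffy cat', 'the mat']
```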
2. Mention Clustering
Once we have the mentions, the goal of the second sub-task is to identify which of them refer to the same entity, and then merge the mentions into clusters corresponding to the entities presented in the text.
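A very naive clustering strategy, sketched below, starts a cluster for every named mention and links each pronoun to the nearest preceding mention with a compatible gender. The gender lexicon is hypothetical and only for illustration; real clustering models are learned:

```python
# A naive mention-clustering sketch: named mentions start clusters, and
# each pronoun is linked to the nearest preceding gender-compatible
# mention. The gender lexicon below is hand-made for illustration.
GENDER = {"emma": "f", "she": "f", "her": "f",
          "nelson": "m", "he": "m", "him": "m"}
PRONOUNS = {"she", "her", "he", "him"}

def cluster_mentions(mentions):
    clusters = []                       # each cluster is a list of mentions
    for i, mention in enumerate(mentions):
        key = mention.lower()
        if key in PRONOUNS:
            # Link to the nearest preceding mention with the same gender.
            for j in range(i - 1, -1, -1):
                if GENDER.get(mentions[j].lower()) == GENDER[key]:
                    for cluster in clusters:
                        if mentions[j] in cluster:
                            cluster.append(mention)
                    break
        else:
            clusters.append([mention])
    return clusters

print(cluster_mentions(["Emma", "Nelson", "she", "he"]))
# [['Emma', 'she'], ['Nelson', 'he']]
```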
Basic Terms
Let’s go over some terms in this domain. We will start with an example that describes them in a simplistic way:
“Gal came back late from the party, because she really enjoyed it there.”
Antecedent — An expression in the text that supplies the meaning of the pronouns that refer back to it. For instance, in the example above, the pronoun she takes its meaning from Gal, so Gal is the antecedent of she.
Anaphora — An expression whose interpretation depends on another expression in the context (its antecedent). For example,
“If you want a cake, there is some in the kitchen.”
Cataphora — A type of anaphora in which the pronoun appears earlier than the noun it refers to. For example,
“If you want some, there is a cake in the kitchen.”
In the sentence above the pronoun some appears before the noun cake.
Common Features
There are some common features that can help us create coreference links between mentions using the structural rules of the language alone, so let’s take a look at some examples.
Hand-Crafted Features
Recency:
More recently mentioned entities are more likely to be referred to. For instance, in “Donna went to the dancing class and Ann joined her because she likes to dance,” she refers to Ann because Ann was mentioned more recently; it could refer to Donna, but intuitively it points to the most recent candidate.
Grammatical role:
Entities in subject position are more likely to be referred to than entities in object position. For instance, in “Donna went to dancing class and Ann joined her. She likes the dancing lessons,” She refers to Donna because Ann is in the object position and Donna is in the subject position.
Parallelism:
“Donna went to dancing class with Ann, and Maria went with her to watch a movie.” Her refers to Ann: the repeated verb went creates a parallel structure that gives us a clue about the antecedent of the pronoun her.
Verb semantics:
The semantics of the sentence can also give us a clue about how to link the mentions in it.
“The animal didn’t cross the street because it was too tired.”
“The animal didn’t cross the street because it was too wide.”
In the first sentence the word it refers to the animal, and in the second sentence it refers to the street. We know this because of the relation of it to the words tired and wide.
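Hand-crafted features like these are typically combined into a score for each candidate antecedent. The sketch below combines recency and grammatical role with invented weights; real systems learn such weights from annotated data, and the token indices and roles here are assigned by hand:

```python
# A toy scoring function combining two hand-crafted features: recency
# (closer antecedents score higher) and a grammatical-role bonus.
# Weights are made up purely for illustration.
def score_antecedent(candidate, pronoun_index):
    score = 1.0 / (pronoun_index - candidate["index"])  # recency
    if candidate["role"] == "subject":                   # grammatical role
        score += 0.5
    return score

# "Donna went to the dancing class and Ann joined her because she ..."
candidates = [
    {"text": "Donna", "index": 0, "role": "subject"},
    {"text": "Ann",   "index": 7, "role": "subject"},
]
best = max(candidates, key=lambda c: score_antecedent(c, pronoun_index=10))
print(best["text"])  # Ann: with equal roles, the more recent mention wins
```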
Additional Features
There are a few more features that can help build a supervised resolution classifier, known as PNG constraints:
- Person (1st person, 2nd person, 3rd person)
- Number (singular or plural)
- Gender (male or female)
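The PNG constraints act as a compatibility filter: a pronoun can only corefer with a mention whose person, number, and gender agree. A minimal sketch, with attribute values assigned by hand for illustration:

```python
# PNG (person/number/gender) agreement check: two mentions are
# compatible only if every feature they both specify matches.
def png_compatible(a, b):
    for feature in ("person", "number", "gender"):
        if a.get(feature) and b.get(feature) and a[feature] != b[feature]:
            return False
    return True

emma = {"person": 3, "number": "singular", "gender": "female"}
she  = {"person": 3, "number": "singular", "gender": "female"}
they = {"person": 3, "number": "plural"}

print(png_compatible(emma, she))   # True
print(png_compatible(emma, they))  # False: singular vs. plural
```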
So, how does a coreference system work in practice?
The traditional way was to run a pipeline of separate models: first a part-of-speech tagger to find all the pronouns in the text, then a named-entity recognizer to detect all the entities, followed by a parser, a mention detector, and a coreference clustering system, five steps in total. This was the standard approach until approximately 2016, when the field shifted toward end-to-end coreference models.
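The multi-stage pipeline described above can be sketched as a chain of functions, each feeding the next. The stage bodies below are trivial stubs just to show the data flow; a real system plugs a trained model into each step:

```python
# A sketch of the traditional pipeline: each stage is a separate
# component whose output feeds the next. All stage bodies are stubs.
def pos_tag(text):                     # 1. part-of-speech tagging (stub)
    return [(w, "PRON" if w.lower() in {"she", "he"} else "X")
            for w in text.replace(",", "").replace(".", "").split()]

def ner(tagged):                       # 2. named-entity recognition (stub)
    return [w for w, _ in tagged if w.istitle()]

def mention_detection(tagged, entities):  # 3-4. parsing + mention detection
    return entities + [w for w, t in tagged if t == "PRON"]

def coref_cluster(mentions):           # 5. coreference clustering (stub)
    return [mentions]                  # trivially one cluster, to show the flow

def pipeline(text):
    tagged = pos_tag(text)
    entities = ner(tagged)
    mentions = mention_detection(tagged, entities)
    return coref_cluster(mentions)

print(pipeline("Emma said she likes dancing."))  # [['Emma', 'she']]
```

End-to-end models replace this chain with a single model trained directly on coreference annotations, which avoids errors cascading from one stage to the next.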
End Notes
This blog post summarized the basic terms of the coreference task. If you wish to continue reading about how the models in this domain have developed over the last few years, check out my next blog post, Coreference Resolution Models, which gives a short review.