Model-based annotation of coreference

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Rahul Aralikatte
Anders Søgaard

Humans do not make inferences over texts, but over models of what texts are about. Asking annotators to mark coreferent spans of text is therefore a somewhat unnatural task. This paper presents an alternative in which we preprocess documents, linking entities to a knowledge base, and turn the coreference annotation task - in our case limited to pronouns - into a task where annotators are asked to assign pronouns to entities. Model-based annotation is shown to lead to faster annotation and higher inter-annotator agreement, and we argue that it also opens up an alternative approach to coreference resolution. We present two new coreference benchmark datasets, for English Wikipedia and English teacher-student dialogues, and evaluate state-of-the-art coreference resolvers on them.
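
As a rough illustration of the annotation format described in the abstract, the sketch below represents each annotator's output as a mapping from pronoun mentions to knowledge-base entities and computes agreement between two annotators. The pronoun and entity identifiers are invented for illustration, and Cohen's kappa is an assumption: the abstract does not say which agreement measure the paper reports.

```python
from collections import Counter

# Hypothetical toy data: each annotator maps a pronoun mention id to a
# knowledge-base entity id. All ids are invented for illustration.
annotator_a = {"p1": "Q_alice", "p2": "Q_bob", "p3": "Q_alice", "p4": "Q_paris"}
annotator_b = {"p1": "Q_alice", "p2": "Q_bob", "p3": "Q_bob", "p4": "Q_paris"}


def cohen_kappa(a, b):
    """Cohen's kappa over the pronouns that both annotators labeled.

    This is one common agreement measure, assumed here; it is not
    necessarily the one used in the paper.
    """
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no pronouns labeled by both annotators")
    n = len(shared)
    # Fraction of shared pronouns assigned to the same entity.
    observed = sum(a[p] == b[p] for p in shared) / n
    # Chance agreement from each annotator's entity distribution.
    counts_a = Counter(a[p] for p in shared)
    counts_b = Counter(b[p] for p in shared)
    expected = sum(counts_a[e] * counts_b[e] for e in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical entity throughout
    return (observed - expected) / (1 - expected)


kappa = cohen_kappa(annotator_a, annotator_b)
```

Because every label is an entity identifier rather than a text span, two annotators agree on a pronoun exactly when they pick the same entity, so no span-boundary matching is needed.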

Original language	English
Title of host publication	LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Editors	Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher	European Language Resources Association (ELRA)
Publication date	2020
Pages	74-79
ISBN (Electronic)	9791095546344
Publication status	Published - 2020
Event	12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France; Duration: 11 May 2020 → 16 May 2020

Conference

Conference	12th International Conference on Language Resources and Evaluation, LREC 2020
Country	France
City	Marseille
Period	11/05/2020 → 16/05/2020
Sponsor	Amazon AWS, Bertin, Lenovo, Ontotext, Vecsys, Vocapia

Research areas

Coreference resolution, Linguistic mental models

ID: 258332299