The Role of Syntactic Planning in Compositional Image Captioning

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

The Role of Syntactic Planning in Compositional Image Captioning. / Bugliarello, Emanuele; Elliott, Desmond.

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online : Association for Computational Linguistics, 2021. p. 593–607.

Harvard

Bugliarello, E & Elliott, D 2021, The Role of Syntactic Planning in Compositional Image Captioning. in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, Online, pp. 593–607, The 16th Conference of the European Chapter of the Association for Computational Linguistics, 21/04/2021. https://doi.org/10.18653/v1/2021.eacl-main.48

APA

Bugliarello, E., & Elliott, D. (2021). The Role of Syntactic Planning in Compositional Image Captioning. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (pp. 593–607). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.eacl-main.48

Vancouver

Bugliarello E, Elliott D. The Role of Syntactic Planning in Compositional Image Captioning. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online: Association for Computational Linguistics. 2021. p. 593–607. https://doi.org/10.18653/v1/2021.eacl-main.48

Author

Bugliarello, Emanuele ; Elliott, Desmond. / The Role of Syntactic Planning in Compositional Image Captioning. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online : Association for Computational Linguistics, 2021. pp. 593–607

Bibtex

@inproceedings{c73f928e721c435ba5199064ca7035c6,
title = "The Role of Syntactic Planning in Compositional Image Captioning",
abstract = "Image captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images. Recently, Nikolaus et al. (2019) introduced a dataset to assess compositional generalization in image captioning, where models are evaluated on their ability to describe images with unseen adjective–noun and noun–verb compositions. In this work, we investigate different methods to improve compositional generalization by planning the syntactic structure of a caption. Our experiments show that jointly modeling tokens and syntactic tags enhances generalization in both RNN- and Transformer-based models, while also improving performance on standard metrics.",
author = "Emanuele Bugliarello and Desmond Elliott",
year = "2021",
month = apr,
doi = "10.18653/v1/2021.eacl-main.48",
language = "English",
pages = "593--607",
booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
publisher = "Association for Computational Linguistics",
note = "The 16th Conference of the European Chapter of the Association for Computational Linguistics : EACL 2021 ; Conference date: 21-04-2021 Through 23-04-2021",
url = "https://2021.eacl.org/",

}

RIS

TY - GEN

T1 - The Role of Syntactic Planning in Compositional Image Captioning

AU - Bugliarello, Emanuele

AU - Elliott, Desmond

N1 - Conference code: 16

PY - 2021/4

Y1 - 2021/4

AB - Image captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images. Recently, Nikolaus et al. (2019) introduced a dataset to assess compositional generalization in image captioning, where models are evaluated on their ability to describe images with unseen adjective–noun and noun–verb compositions. In this work, we investigate different methods to improve compositional generalization by planning the syntactic structure of a caption. Our experiments show that jointly modeling tokens and syntactic tags enhances generalization in both RNN- and Transformer-based models, while also improving performance on standard metrics.

U2 - 10.18653/v1/2021.eacl-main.48

DO - 10.18653/v1/2021.eacl-main.48

M3 - Article in proceedings

SP - 593

EP - 607

BT - Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

PB - Association for Computational Linguistics

CY - Online

T2 - The 16th Conference of the European Chapter of the Association for Computational Linguistics

Y2 - 21 April 2021 through 23 April 2021

ER -
