CompGuessWhat?! A Multi-task Evaluation Framework for Grounded Language Learning
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
CompGuessWhat?! A Multi-task Evaluation Framework for Grounded Language Learning. / Suglia, Alessandro; Konstas, Ioannis; Vanzo, Andrea; Bastianelli, Emanuele; Elliott, Desmond; Frank, Stella; Lemon, Oliver.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2020. p. 7625–7641.
RIS
TY - GEN
T1 - CompGuessWhat?! A Multi-task Evaluation Framework for Grounded Language Learning
T2 - 58th Annual Meeting of the Association for Computational Linguistics
AU - Suglia, Alessandro
AU - Konstas, Ioannis
AU - Vanzo, Andrea
AU - Bastianelli, Emanuele
AU - Elliott, Desmond
AU - Frank, Stella
AU - Lemon, Oliver
PY - 2020
Y1 - 2020
N2 - Approaches to Grounded Language Learning typically focus on a single task-based final performance measure that may not depend on desirable properties of the learned hidden representations, such as their ability to predict salient attributes or to generalise to unseen situations. To remedy this, we present GROLLA, an evaluation framework for Grounded Language Learning with Attributes with three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation. We also propose a new dataset CompGuessWhat?! as an instance of this framework for evaluating the quality of learned neural representations, in particular concerning attribute grounding. To this end, we extend the original GuessWhat?! dataset by including a semantic layer on top of the perceptual one. Specifically, we enrich the VisualGenome scene graphs associated with the GuessWhat?! images with abstract and situated attributes. By using diagnostic classifiers, we show that current models learn representations that are not expressive enough to encode object attributes (average F1 of 44.27). In addition, they do not learn strategies nor representations that are robust enough to perform well when novel scenes or objects are involved in gameplay (zero-shot best accuracy 50.06%).
KW - cs.CL
KW - cs.AI
KW - cs.LG
U2 - 10.18653/v1/2020.acl-main.682
DO - 10.18653/v1/2020.acl-main.682
M3 - Article in proceedings
SP - 7625
EP - 7641
BT - Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
PB - Association for Computational Linguistics
Y2 - 5 July 2020 through 10 July 2020
ER -