Principled Multi-Aspect Evaluation Measures of Rankings
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Principled Multi-Aspect Evaluation Measures of Rankings. / Maistro, Maria; Lima, Lucas Chaves; Simonsen, Jakob Grue; Lioma, Christina.
CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc, 2021. p. 1232-1242.
RIS
TY - GEN
T1 - Principled Multi-Aspect Evaluation Measures of Rankings
AU - Maistro, Maria
AU - Lima, Lucas Chaves
AU - Simonsen, Jakob Grue
AU - Lioma, Christina
N1 - Publisher Copyright: © 2021 Owner/Author.
PY - 2021
Y1 - 2021
AB - Information Retrieval evaluation has traditionally focused on defining principled ways of assessing the relevance of a ranked list of documents with respect to a query. Several methods extend this type of evaluation beyond relevance, making it possible to evaluate different aspects of a document ranking (e.g., relevance, usefulness, or credibility) using a single measure (multi-aspect evaluation). However, these methods either (i) are tailor-made for specific aspects and do not extend to other types or numbers of aspects, or (ii) have theoretical anomalies, e.g., they assign the maximum score to a ranking where all documents are labelled with the lowest grade with respect to all aspects (e.g., not relevant, not credible, etc.). We present a theoretically principled multi-aspect evaluation method that can be used for any number, and any type, of aspects. A thorough empirical evaluation using up to 5 aspects and a total of 425 runs officially submitted to 10 TREC tracks shows that our method is more discriminative than the state of the art and overcomes its theoretical limitations.
KW - evaluation
KW - multiple aspects
KW - partial order
KW - ranking
UR - http://www.scopus.com/inward/record.url?scp=85119194348&partnerID=8YFLogxK
U2 - 10.1145/3459637.3482287
DO - 10.1145/3459637.3482287
M3 - Article in proceedings
AN - SCOPUS:85119194348
SP - 1232
EP - 1242
BT - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery, Inc
T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Y2 - 1 November 2021 through 5 November 2021
ER -