Principled Multi-Aspect Evaluation Measures of Rankings
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Principled Multi-Aspect Evaluation Measures of Rankings. / Maistro, Maria; Lima, Lucas Chaves; Simonsen, Jakob Grue; Lioma, Christina.
CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc, 2021. p. 1232-1242.
RIS
TY - GEN
T1 - Principled Multi-Aspect Evaluation Measures of Rankings
AU - Maistro, Maria
AU - Lima, Lucas Chaves
AU - Simonsen, Jakob Grue
AU - Lioma, Christina
N1 - Publisher Copyright: © 2021 Owner/Author.
PY - 2021
Y1 - 2021
AB - Information Retrieval evaluation has traditionally focused on defining principled ways of assessing the relevance of a ranked list of documents with respect to a query. Several methods extend this type of evaluation beyond relevance, making it possible to evaluate different aspects of a document ranking (e.g., relevance, usefulness, or credibility) using a single measure (multi-aspect evaluation). However, these methods either (i) are tailor-made for specific aspects and do not extend to other types or numbers of aspects, or (ii) have theoretical anomalies, e.g., they assign the maximum score to a ranking where all documents are labelled with the lowest grade with respect to all aspects (e.g., not relevant, not credible, etc.). We present a theoretically principled multi-aspect evaluation method that can be used for any number, and any type, of aspects. A thorough empirical evaluation using up to 5 aspects and a total of 425 runs officially submitted to 10 TREC tracks shows that our method is more discriminative than the state of the art and overcomes its theoretical limitations.
KW - evaluation
KW - multiple aspects
KW - partial order
KW - ranking
UR - http://www.scopus.com/inward/record.url?scp=85119194348&partnerID=8YFLogxK
U2 - 10.1145/3459637.3482287
DO - 10.1145/3459637.3482287
M3 - Article in proceedings
AN - SCOPUS:85119194348
SP - 1232
EP - 1242
BT - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery, Inc
T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Y2 - 1 November 2021 through 5 November 2021
ER -