Benchmarking representation learning for natural world image collections

Research output: Contribution to journal › Conference article › Research › peer-review

Standard

Benchmarking representation learning for natural world image collections. / Van Horn, Grant; Cole, Elijah; Beery, Sara; Wilber, Kimberly; Belongie, Serge; Mac Aodha, Oisin.

In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2021, p. 12879-12888.

Harvard

Van Horn, G, Cole, E, Beery, S, Wilber, K, Belongie, S & Mac Aodha, O 2021, 'Benchmarking representation learning for natural world image collections', Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 12879-12888. https://doi.org/10.1109/CVPR46437.2021.01269

APA

Van Horn, G., Cole, E., Beery, S., Wilber, K., Belongie, S., & Mac Aodha, O. (2021). Benchmarking representation learning for natural world image collections. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 12879-12888. https://doi.org/10.1109/CVPR46437.2021.01269

Vancouver

Van Horn G, Cole E, Beery S, Wilber K, Belongie S, Mac Aodha O. Benchmarking representation learning for natural world image collections. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2021;12879-12888. https://doi.org/10.1109/CVPR46437.2021.01269

Author

Van Horn, Grant ; Cole, Elijah ; Beery, Sara ; Wilber, Kimberly ; Belongie, Serge ; Mac Aodha, Oisin. / Benchmarking representation learning for natural world image collections. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2021 ; pp. 12879-12888.

Bibtex

@inproceedings{31ff82daef37490bb0da6aabce1e9f33,

title = "Benchmarking representation learning for natural world image collections",

abstract = "Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.",

author = "{Van Horn}, Grant and Elijah Cole and Sara Beery and Kimberly Wilber and Serge Belongie and {Mac Aodha}, Oisin",

note = "Funding Information: Thanks to the iNaturalist team and community for providing access to data, Eliot Miller and Mitch Barry for helping to curate NeWT, and to Pietro Perona for valuable feedback. Publisher Copyright: {\textcopyright} 2021 IEEE; 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 ; Conference date: 19-06-2021 Through 25-06-2021",

year = "2021",

doi = "10.1109/CVPR46437.2021.01269",

language = "English",

pages = "12879--12888",

booktitle = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

issn = "1063-6919",

publisher = "Institute of Electrical and Electronics Engineers",

}

RIS

TY - CONF

T1 - Benchmarking representation learning for natural world image collections

AU - Van Horn, Grant

AU - Cole, Elijah

AU - Beery, Sara

AU - Wilber, Kimberly

AU - Belongie, Serge

AU - Mac Aodha, Oisin

N1 - Funding Information: Thanks to the iNaturalist team and community for providing access to data, Eliot Miller and Mitch Barry for helping to curate NeWT, and to Pietro Perona for valuable feedback. Publisher Copyright: © 2021 IEEE

PY - 2021

Y1 - 2021

AB - Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.

UR - http://www.scopus.com/inward/record.url?scp=85118006673&partnerID=8YFLogxK

U2 - 10.1109/CVPR46437.2021.01269

DO - 10.1109/CVPR46437.2021.01269

M3 - Conference article

AN - SCOPUS:85118006673

SP - 12879

EP - 12888

JO - I E E E Conference on Computer Vision and Pattern Recognition. Proceedings

JF - I E E E Conference on Computer Vision and Pattern Recognition. Proceedings

SN - 1063-6919

T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021

Y2 - 19 June 2021 through 25 June 2021

ER -
