Visual recognition with humans in the loop

Research output: Contribution to journal › Conference article › Research › peer-review

Standard

Visual recognition with humans in the loop. / Branson, Steve; Wah, Catherine; Schroff, Florian; Babenko, Boris; Welinder, Peter; Perona, Pietro; Belongie, Serge.

In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), No. PART 4, 2010, p. 438-451.

Harvard

Branson, S, Wah, C, Schroff, F, Babenko, B, Welinder, P, Perona, P & Belongie, S 2010, 'Visual recognition with humans in the loop', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 4, pp. 438-451. https://doi.org/10.1007/978-3-642-15561-1_32

APA

Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., & Belongie, S. (2010). Visual recognition with humans in the loop. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), (PART 4), 438-451. https://doi.org/10.1007/978-3-642-15561-1_32

Vancouver

Branson S, Wah C, Schroff F, Babenko B, Welinder P, Perona P et al. Visual recognition with humans in the loop. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2010;(PART 4):438-451. https://doi.org/10.1007/978-3-642-15561-1_32

Author

Branson, Steve ; Wah, Catherine ; Schroff, Florian ; Babenko, Boris ; Welinder, Peter ; Perona, Pietro ; Belongie, Serge. / Visual recognition with humans in the loop. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2010 ; No. PART 4. pp. 438-451.

Bibtex

@inproceedings{7b514b53c7144cddbddd386de404ef01,

title = "Visual recognition with humans in the loop",

abstract = "We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.",

author = "Steve Branson and Catherine Wah and Florian Schroff and Boris Babenko and Peter Welinder and Pietro Perona and Serge Belongie",

year = "2010",

doi = "10.1007/978-3-642-15561-1_32",

language = "English",

pages = "438--451",

series = "Lecture Notes in Computer Science",

issn = "0302-9743",

publisher = "Springer Verlag",

number = "PART 4",

note = "11th European Conference on Computer Vision, ECCV 2010 ; Conference date: 10-09-2010 Through 11-09-2010",

}

RIS

TY - GEN

T1 - Visual recognition with humans in the loop

AU - Branson, Steve

AU - Wah, Catherine

AU - Schroff, Florian

AU - Babenko, Boris

AU - Welinder, Peter

AU - Perona, Pietro

AU - Belongie, Serge

PY - 2010

Y1 - 2010

AB - We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.

UR - http://www.scopus.com/inward/record.url?scp=78149300909&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-15561-1_32

DO - 10.1007/978-3-642-15561-1_32

M3 - Conference article

AN - SCOPUS:78149300909

SP - 438

EP - 451

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

IS - PART 4

T2 - 11th European Conference on Computer Vision, ECCV 2010

Y2 - 10 September 2010 through 11 September 2010

ER -
