Category systems for real-world scenes

Categorization performance is a popular metric of scene recognition and understanding in behavioral and computational research. However, categorical constructs and their labels can be somewhat arbitrary. Derived from exhaustive vocabularies of place names (e.g., Deng et al., 2009), or the judgements of small groups of researchers (e.g., Fei-Fei, Iyer, Koch, & Perona, 2007), these categories may not correspond with human-preferred taxonomies. Here, we propose clustering by increasing the Rand index via coordinate ascent (CIRCA): an unsupervised, data-driven clustering method for deriving ground-truth scene categories. In Experiment 1, human participants organized 80 stereoscopic images of outdoor scenes from the Southampton-York Natural Scenes (SYNS) dataset (Adams et al., 2016) into discrete categories. In separate tasks, images were grouped according to i) semantic content, ii) three-dimensional spatial structure, or iii) two-dimensional image appearance. Participants provided text labels for each group. Using the CIRCA method, we determined the most representative category structure and then derived category labels for each task/dimension. In Experiment 2, we found that these categories generalized well to a larger set of SYNS images and to new observers. In Experiment 3, we tested the relationship between our category systems and the spatial envelope model (Oliva & Torralba, 2001). Finally, in Experiment 4, we validated CIRCA on a larger, independent dataset of same-different category judgements. The derived category systems outperformed the SUN taxonomy (Xiao, Hays, Ehinger, Oliva, & Torralba, 2010) and an alternative clustering method (Greene, 2019). In summary, we believe this novel categorization method can be applied to a wide range of datasets to derive optimal categorical groupings and labels from psychophysical judgements of stimulus similarity.
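The abstract names the method but does not spell out its steps, so the following is only a minimal sketch of what "increasing the Rand index via coordinate ascent" could look like, not the authors' implementation. It assumes each participant's grouping is given as a list of integer group labels (one per image), that the objective is the mean Rand index between a candidate grouping and all participants' groupings, and that one image is reassigned at a time. The function names `rand_index`, `mean_rand`, and `coordinate_ascent` are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs that two partitions treat the same way
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def mean_rand(candidate, partitions):
    """Assumed objective: average Rand index against every participant."""
    return sum(rand_index(candidate, p) for p in partitions) / len(partitions)


def coordinate_ascent(partitions, max_sweeps=100):
    """Greedy sketch: move one item at a time to the label that most
    increases the mean Rand index, until a full sweep brings no gain."""
    n = len(partitions[0])
    # Assumed starting point: the participant grouping that agrees best
    # with all the others.
    best = list(max(partitions, key=lambda p: mean_rand(p, partitions)))
    score = mean_rand(best, partitions)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            keep = best[i]
            # Candidate labels: every existing group plus one new group.
            for label in set(best) | {max(best) + 1}:
                if label == keep:
                    continue
                best[i] = label
                s = mean_rand(best, partitions)
                if s > score + 1e-12:
                    score, keep, improved = s, label, True
            best[i] = keep  # settle on the best label found for item i
        if not improved:
            break
    return best, score
```

Because the Rand index compares only whether pairs of items share a group, the label values themselves are arbitrary: `[0, 0, 1, 1]` and `[1, 1, 0, 0]` count as the same grouping. A greedy search of this kind can stop at a local optimum, so the result may depend on the starting point.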

Categorization, Category system, Clustering, Natural scenes, Real-world scene perception, Taxonomy

10.1167/jov.21.2.8

1534-7362

1-31

Anderson, Matt D.

Graf, Erich

Elder, James H.

Ehinger, Krista

Adams, Wendy

3 February 2021

Anderson, Matt D., Graf, Erich, Elder, James H., Ehinger, Krista and Adams, Wendy (2021) Category systems for real-world scenes. Journal of Vision, 21 (2), 1-31, [8]. (doi:10.1167/jov.21.2.8).

Record type: Article

Abstract

Text

Category Systems For Real-World Scenes (Manuscript) - Accepted Manuscript

Available under License Creative Commons Attribution.

Text

Categorysystemsforreal-worldscenes - Version of Record

Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 6 January 2021

Published date: 3 February 2021

Additional Information: Funding Information: Supported by EPSRC grant EP/K005952/1, EPSRC grant EP/S016368/1, and a York University VISTA Visiting Trainee Award. Publisher Copyright: © 2021. All rights reserved.
