Category and depth discrimination in real-world scenes
University of Southampton
Anderson, Matt D.
53946cbf-a70a-4782-ab28-12f3b9f34aa6
2022
Graf, Erich
1a5123e2-8f05-4084-a6e6-837dcfc66209
Adams, Wendy
25685aaa-fc54-4d25-8d65-f35f4c5ab688
Anderson, Matt D. (2022) Category and depth discrimination in real-world scenes. University of Southampton, Doctoral Thesis, 245pp.
Record type: Thesis (Doctoral)
Abstract
Visual understanding of real-world scenes is near-instantaneous. Humans can extract a wealth of information, including spatial structure, semantic category, and the identity of embedded objects, from images viewed for less than 100 msecs. Visual processing has capacity limits, and, as a result, the computational processes that underlie this behaviour must be highly efficient. Computational theories of real-world scene perception model early image processing in various ways. In Chapter 1, I review these theories, and in Chapter 2, I review the role of depth cues in rapid visual processing. This discussion reveals three problems: (i) tests of the agreement between model predictions and human responses may be biased by the arbitrary choice of category system, (ii) current models posit that scene semantics is estimated from spatial structure properties, but empirical support for this position is inconsistent, and (iii) the time-course of depth estimation in real-world scenes is poorly understood. To address these problems, three empirical papers are presented in Chapters 3, 4, and 5. In Chapter 3, I propose and validate a novel clustering algorithm that can be applied to image databases to derive category systems for visual experiments. In Chapters 3 and 4, I examine the relationship between spatial structure and semantic information, and find little support for the position that spatial structure properties inform semantic discrimination. In Chapters 4 and 5, I characterize the time-course of depth processing for images presented for <267 msecs, and conclude that binocular disparity and elevation cues contribute to real-world perception shortly after image onset (<50 msecs). These findings are discussed together in Chapter 6. This thesis contributes to the evaluation of modern models of real-world scene perception, and helps to characterize how visual understanding unfolds over time.
Text: Matt Anderson PhD Thesis - final copy unsigned - Version of Record
Text: Matt Anderson Permission to deposit thesis (Restricted to Repository staff only)
More information
Published date: 2022
Identifiers
Local EPrints ID: 468559
URI: http://eprints.soton.ac.uk/id/eprint/468559
PURE UUID: 2ee0dc81-d72a-4103-a1b1-3bab89a83cbf
Catalogue record
Date deposited: 18 Aug 2022 16:30
Last modified: 17 Mar 2024 02:59