The University of Southampton
University of Southampton Institutional Repository

Category and depth discrimination in real-world scenes

Anderson, Matt D. (2022) Category and depth discrimination in real-world scenes. University of Southampton, Doctoral Thesis, 245pp.

Record type: Thesis (Doctoral)

Abstract

Visual understanding of real-world scenes is near-instantaneous. Humans can extract a wealth of information, including spatial structure, semantic category, and the identity of embedded objects, from images viewed for less than 100 msecs. Visual processing has capacity limits, and, as a result, the computational processes that underlie this behaviour must be highly efficient. Computational theories of real-world scene perception model early image processing in various ways. In Chapter 1, I review these theories, and in Chapter 2, I review the role of depth cues in rapid visual processing. This discussion reveals three problems: (i) tests of the agreement between model predictions and human responses may be biased by the arbitrary choice of category system, (ii) current models posit that scene semantics is estimated from spatial structure properties, but empirical support for this position is inconsistent, and (iii) the time-course of depth estimation in real-world scenes is poorly understood. To address these problems, three empirical papers are presented in Chapters 3, 4, and 5. In Chapter 3, I propose and validate a novel clustering algorithm that can be applied to image databases to derive category systems for visual experiments. In Chapters 3 and 4, I examine the relationship between spatial structure and semantic information, and find little support for the position that spatial structure properties inform semantic discrimination. In Chapters 4 and 5, I characterize the time-course of depth processing for images presented for <267 msecs, and conclude that binocular disparity and elevation cues contribute to real-world perception shortly after image onset (<50 msecs). These findings are discussed together in Chapter 6. This thesis contributes to the evaluation of modern models of real-world scene perception, and helps to characterize how visual understanding unfolds over time.

Text
Matt Anderson PhD Thesis - final copy unsigned - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: 2022

Identifiers

Local EPrints ID: 468559
URI: http://eprints.soton.ac.uk/id/eprint/468559
PURE UUID: 2ee0dc81-d72a-4103-a1b1-3bab89a83cbf
ORCID for Erich Graf: orcid.org/0000-0002-3162-4233
ORCID for Wendy Adams: orcid.org/0000-0002-5832-1056

Catalogue record

Date deposited: 18 Aug 2022 16:30
Last modified: 19 Aug 2022 01:38

Contributors

Thesis advisor: Erich Graf
Thesis advisor: Wendy Adams
