The University of Southampton
University of Southampton Institutional Repository

Emergent visual communication


Mihai, Andreea Daniela (2022) Emergent visual communication. University of Southampton, Doctoral Thesis, 186pp.

Record type: Thesis (Doctoral)

Abstract

Our ability to perceive, represent and understand the surrounding visual world is one of the most fascinating, but equally intricate, parts of our nervous system. Supported by a sufficiently complex brain, we start to learn and develop these cognitive abilities and skills, such as communication, from the moment we are born. These capabilities are essential in numerous tasks we carry out in our daily lives. The development of such intelligence is promoted by exploration of the environment and social interaction. As such, advancing artificially intelligent agents capable of interaction through communication with each other and with humans has been a long-standing goal. This thesis seeks to uncover how inter-agent communication about the visual world, emerging in a completely self-supervised way, can be modelled and its interpretability improved. This research draws inspiration from how human communication developed and first compares the processes involved in transmitting meaningful information between humans and machines. In the context of referential signalling games played with realistic images, intelligent agents modelled as deep neural networks have previously been shown to develop successful token-based communication protocols to achieve a shared goal. This thesis analyses the factors which influence the emergence of meaningful protocols and shows that visual semantics can be learned in a self-supervised way. Nonetheless, qualitative and quantitative insights into emergent token-based communication are not easily explainable to humans. We thus propose drawing as a communication channel, a much simpler and more directly interpretable modality than language. To enable end-to-end learnable models of visual communication, a differentiable relaxation of the process of drawing vector primitives into pixel rasters is proposed. Using this approach, the physical act of drawing with a pen on paper can be modelled.
We then demonstrate that agents cooperating on a signalling game learn to communicate through sketching. An extensive analysis of the factors which influence the meaning and intent of agents’ drawings is presented. The final two chapters show how interpretable sketches emerge when inducing visual perceptual similarity constraints. Through human evaluation of the emergent visual communication, we explore how, with appropriate inductive biases, artificial agents learn to draw in a fashion that humans can interpret.

Text
Emergent Visual Communication-archival - Version of Record
Available under License University of Southampton Thesis Licence.
Text
Final-thesis-submission-Examination-Miss-Andreea-Mihai (2)
Restricted to Repository staff only
Available under License University of Southampton Thesis Licence.

More information

Published date: September 2022

Identifiers

Local EPrints ID: 469905
URI: http://eprints.soton.ac.uk/id/eprint/469905
PURE UUID: ff98f1f5-f206-480c-97f8-3c0634361170
ORCID for Andreea Daniela Mihai: orcid.org/0000-0003-3368-9062
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 28 Sep 2022 16:49
Last modified: 17 Mar 2024 03:05

Contributors

Author: Andreea Daniela Mihai
Thesis advisor: Jonathon Hare
