Emergent visual communication
University of Southampton
Mihai, Andreea Daniela
f8910fe1-18e7-45b3-8923-b34b5cd136fa
September 2022
Hare, Jonathon
65ba2cda-eaaf-4767-a325-cd845504e5a9
Mihai, Andreea Daniela (2022) Emergent visual communication. University of Southampton, Doctoral Thesis, 186pp.
Record type: Thesis (Doctoral)
Abstract
Our ability to perceive, represent and understand the surrounding visual world is one of the most fascinating, and equally intricate, capabilities of our nervous system. Supported by a sufficiently complex brain, we start to learn and develop cognitive abilities and skills such as communication from the moment we are born. These capabilities are essential to numerous tasks we carry out in daily life. The development of such intelligence is promoted by exploration of the environment and by social interaction. As such, developing artificial intelligent agents capable of interacting through communication with each other and with humans has been a long-standing goal. This thesis seeks to uncover how inter-agent communication about the visual world, emerging in a completely self-supervised way, can be modelled and its interpretability improved. The research draws inspiration from how human communication developed and first compares the processes involved in transmitting meaningful information between humans and machines. In the context of referential signalling games played with realistic images, intelligent agents modelled as deep neural networks have previously been shown to develop successful token-based communication protocols to achieve a shared goal. This thesis analyses the factors that influence the emergence of meaningful protocols and shows that visual semantics can be learned in a self-supervised way. Nonetheless, qualitative and quantitative insights into emergent token-based communication are not easily explainable to humans. We thus propose drawing as a communication channel, a much simpler and more directly interpretable modality than language. To enable end-to-end learnable models of visual communication, a differentiable relaxation of the process of drawing vector primitives into pixel rasters is proposed. Using this approach, the physical act of drawing with a pen on paper can be modelled.
We then demonstrate that agents cooperating in a signalling game learn to communicate through sketching. An extensive analysis of the factors that influence the meaning and intent of the agents' drawings is presented. The final two chapters show how interpretable sketches emerge when visual perceptual similarity constraints are induced. Through human evaluation of the emergent visual communication, we explore how, with appropriate inductive biases, artificial agents learn to draw in a fashion that humans can interpret.
Text
Emergent Visual Communication-archival
- Version of Record
Text
Final-thesis-submission-Examination-Miss-Andreea-Mihai (2)
Restricted to Repository staff only
More information
Published date: September 2022
Identifiers
Local EPrints ID: 469905
URI: http://eprints.soton.ac.uk/id/eprint/469905
PURE UUID: ff98f1f5-f206-480c-97f8-3c0634361170
Catalogue record
Date deposited: 28 Sep 2022 16:49
Last modified: 17 Mar 2024 03:05
Contributors
Author:
Andreea Daniela Mihai
Thesis advisor:
Jonathon Hare