Learning to count objects in natural images for visual question answering

Zhang, Yan, Hare, Jonathon and Prügel-Bennett, Adam (2018) Learning to count objects in natural images for visual question answering. International Conference on Learning Representations, Vancouver Convention Center, Vancouver, Canada. 30 Apr - 03 May 2018. pp. 1-17.

Record type: Conference or Workshop Item (Paper)

Abstract

Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.

Text

learning_to_count_objects_in_natural_images_for_visual_question_answering - Version of Record

Restricted to Repository staff only


More information

Accepted/In Press date: 29 January 2018

e-pub ahead of print date: 19 February 2018

Published date: 30 April 2018

Venue - Dates: International Conference on Learning Representations, Vancouver Convention Center, Vancouver, Canada, 2018-04-30 - 2018-05-03


Keywords: vqa


Identifiers

Local EPrints ID: 418094

URI: http://eprints.soton.ac.uk/id/eprint/418094

PURE UUID: 609cde96-c244-4978-a283-63decb049b91

ORCID for Yan Zhang: orcid.org/0000-0003-3470-3663

ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 22 Feb 2018 17:30

Last modified: 16 Mar 2024 03:50


Contributors

Author: Yan Zhang

Author: Jonathon Hare

Author: Adam Prügel-Bennett
