Evaluating and complementing vision-to-language technology for people who are blind with conversational crowdsourcing
Salisbury, Elliot, Kamar, Ece and Morris, Meredith Ringel (2018) Evaluating and complementing vision-to-language technology for people who are blind with conversational crowdsourcing. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, vol. 2018-July, International Joint Conferences on Artificial Intelligence, pp. 5349-5353. (doi:10.24963/ijcai.2018/751).
Record type: Conference or Workshop Item (Paper)
Abstract
We study how real-time crowdsourcing can be used both for evaluating the value provided by existing automated approaches and for enabling workflows that provide scalable and useful alt text to blind users. We show that the shortcomings of existing AI image captioning systems frequently hinder a user's understanding of an image they cannot see to a degree that even clarifying conversations with sighted assistants cannot correct. Based on analysis of clarifying conversations collected from our studies, we design experiences that can effectively assist users in a scalable way without the need for real-time interaction. Our results provide lessons and guidelines that the designers of future AI captioning systems can use to improve labeling of social media imagery for blind users.
This record has no associated files available for download.
More information
Published date: 2018
Venue - Dates:
International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 2018-07-13 - 2018-07-19
Identifiers
Local EPrints ID: 426530
URI: http://eprints.soton.ac.uk/id/eprint/426530
PURE UUID: fd6d989c-c79a-4a8d-98d3-b1db11ff669a
Catalogue record
Date deposited: 30 Nov 2018 17:30
Last modified: 15 Mar 2024 23:10
Contributors
Author: Elliot Salisbury
Author: Ece Kamar
Author: Meredith Ringel Morris