University of Southampton Institutional Repository

Multimodal speech and visual gesture control interface technique for small unmanned multirotor aircraft

Abioye, Ayodeji (2023) Multimodal speech and visual gesture control interface technique for small unmanned multirotor aircraft. University of Southampton, Doctoral Thesis, 372pp.

Record type: Thesis (Doctoral)

Abstract

This research investigated the use of novel human-computer interaction (HCI) interfaces in the control of small multirotor unmanned aerial vehicles (UAVs). The main objective was to propose, design, and develop an alternative control interface for small multirotor UAVs that could perform better than the standard RC joystick (RCJ) controller, and to evaluate the performance of the proposed interface. A multimodal speech and visual gesture (mSVG) interface was proposed, designed, and developed, and then coupled to a RotorS ROS Gazebo UAV simulator.

An experimental study was designed to determine how practical the proposed mSVG interface was for the control of small multirotor UAVs, by determining the limits of speech and gesture recognition at different ambient noise levels and under different background and lighting conditions, respectively, and to determine how the mSVG interface compared with the RCJ controller on a simple navigational control task, in terms of performance (time to completion and accuracy of navigational control) and from a human factors perspective (user satisfaction and cognitive workload). Thirty-seven participants were recruited.

From the results of the experiments conducted, the mSVG interface was found to be an effective alternative to the RCJ interface when operated within a constrained application environment. From the noise level experiment, it was observed that speech recognition accuracy/success rate falls as noise levels rise, with a 75 dB noise level being the practical aerial robot (aerobot) application limit. From the gesture lighting experiment, gestures were successfully recognised at 10 lux and above on distinct solid backgrounds, but the effect of varying both the lighting conditions and the environment background on the quality of gesture recognition was insignificant (< 0.5%), implying that the technology used, the type of gesture captured, and the image processing technique used were more important. From the performance and cognitive workload comparison between the RCJ and mSVG interfaces, the mSVG interface was found to perform better at higher nCA application levels than the RCJ interface: the mSVG interface was 1 minute faster and 25% more accurate than the RCJ interface, and the RCJ interface was found to be 1.4 times more cognitively demanding than the mSVG interface.

The main limitation of this research was the limited lighting range of 10 lux to 1,400 lux used during the gesture lighting experiment, which constrains the application limit to low-lighting indoor environments. Suggested further work includes the development of more robust gesture and speech recognition algorithms and the coupling of the improved mSVG interface to a practical UAV.
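To make the control pipeline concrete, the sketch below illustrates one way a recognised speech label and a recognised gesture label could be fused into a velocity command for a ROS/Gazebo multirotor. This is a minimal illustration under stated assumptions, not the thesis implementation: the topic name /firefly/cmd_vel, the command vocabulary, and the fuse() helper are all hypothetical, and the speech and gesture recognisers are abstracted away as pre-classified labels.

    # Minimal sketch (not the thesis implementation) of multimodal command
    # fusion for a ROS/Gazebo multirotor. Assumptions: ROS 1 with rospy, and
    # a velocity-style command topic named '/firefly/cmd_vel' (RotorS topic
    # names vary by configuration). Speech and gesture recognition are
    # abstracted away as already-classified labels.
    import rospy
    from geometry_msgs.msg import Twist

    # Speech carries the discrete action; gesture carries the direction.
    GESTURE_DIRECTIONS = {
        "point_forward": (1.0, 0.0, 0.0),
        "point_left":    (0.0, 1.0, 0.0),
        "palm_up":       (0.0, 0.0, 1.0),
    }

    def fuse(speech_label, gesture_label, speed=0.5):
        """Combine one speech label and one gesture label into a Twist."""
        cmd = Twist()  # zero velocities by default, i.e. hover/stop
        if speech_label == "move" and gesture_label in GESTURE_DIRECTIONS:
            vx, vy, vz = GESTURE_DIRECTIONS[gesture_label]
            cmd.linear.x = speed * vx
            cmd.linear.y = speed * vy
            cmd.linear.z = speed * vz
        return cmd

    if __name__ == "__main__":
        rospy.init_node("msvg_fusion_sketch")
        pub = rospy.Publisher("/firefly/cmd_vel", Twist, queue_size=1)
        rate = rospy.Rate(10)  # republish the current command at 10 Hz
        while not rospy.is_shutdown():
            # Fixed labels for illustration; a real system would read these
            # from the speech and gesture recognition pipelines.
            pub.publish(fuse("move", "point_forward"))
            rate.sleep()

The division of labour sketched here, with speech supplying the discrete action word and gesture supplying the spatial direction, is one plausible way of combining the two modalities, not necessarily the exact mapping used in the thesis.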

Text
Ayodeji_Abioyephd_Doctoral_thesis_PDFA - Version of Record
Available under the University of Southampton Thesis Licence.
Download (28 MB)

More information

Submitted date: August 2019
Published date: July 2023

Identifiers

Local EPrints ID: 479472
URI: http://eprints.soton.ac.uk/id/eprint/479472
PURE UUID: c62a0b0b-0180-4287-96bb-b3affd715e84
ORCID for Stephen Prior: orcid.org/0000-0002-4993-4942
ORCID for Sarvapali Ramchurn: orcid.org/0000-0001-9686-4302

Catalogue record

Date deposited: 25 Jul 2023 16:31
Last modified: 17 Mar 2024 03:30

Contributors

Author: Ayodeji Abioye
Thesis advisor: Stephen Prior
Thesis advisor: Sarvapali Ramchurn
Thesis advisor: Glyn T. Thomas
Thesis advisor: Peter Saddington
