M-Arg: multimodal argument mining dataset for political debates with audio and transcripts
Argumentation mining aims at extracting, analysing and modelling people’s arguments, but large, high-quality annotated datasets are limited, and no multimodal datasets exist for this task. In this paper, we present M-Arg, a multimodal argument mining dataset built on a corpus of US 2020 presidential debates and annotated through crowd-sourcing. This dataset allows models to be trained to extract arguments from natural dialogue such as debates using information like the intonation and rhythm of the speaker. Our dataset contains 7 hours of annotated US presidential debates, 6527 utterances and 4104 relation labels, and we report results from different baseline models, with a highest accuracy of 0.86 achieved by a multimodal model.
Natural Language Processing, Argument Mining, Artificial Intelligence
Mestre, Rafael
33721a01-ab1a-4f71-8b0e-abef8afc92f3
Milicin, Razvan
bcc0599d-114a-4cd0-8fc9-202421b69caa
Middleton, Stuart
404b62ba-d77e-476b-9775-32645b04473f
Ryan, Matthew
f07cd3e8-f3d9-4681-9091-84c2df07cd54
Zhu, Jiatong
52569115-5d72-4fc0-8876-a66b991ed209
Norman, Timothy
663e522f-807c-4569-9201-dc141c8eb50d
November 2021
Mestre, Rafael, Milicin, Razvan, Middleton, Stuart, Ryan, Matthew, Zhu, Jiatong and Norman, Timothy (2021) M-Arg: multimodal argument mining dataset for political debates with audio and transcripts. 8th Workshop on Argument Mining. 9 pp. (doi:10.18653/v1/2021.argmining-1.8).
Record type: Conference or Workshop Item (Paper)
Text: ArgMin_Workshop_Paper - Accepted Manuscript
More information
Accepted/In Press date: 1 September 2021
Published date: November 2021
Venue - Dates: 8th Workshop on Argument Mining, 2021-11-01
Identifiers
Local EPrints ID: 452873
URI: http://eprints.soton.ac.uk/id/eprint/452873
PURE UUID: 86bb4e31-ee54-47b7-99ff-90259049a07a
Catalogue record
Date deposited: 06 Jan 2022 17:37
Last modified: 17 Mar 2024 04:06
Contributors
Author: Razvan Milicin
Author: Jiatong Zhu