Enriching media fragments with named entities for video classification
With the steady increase in videos published on media sharing platforms such as Dailymotion and YouTube, increasing effort is being devoted to automatically annotating and organizing these videos. In this paper, we propose a framework for classifying video items using both textual features, such as named entities extracted from subtitles, and temporal features, such as the duration of the media fragments in which particular entities are spotted. We implement four machine learning algorithms for multiclass classification, namely Logistic Regression (LG), K-Nearest Neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM). We study the temporal distribution patterns of named entities extracted from 805 Dailymotion videos. The results show that the best performance using the entity distribution is obtained with KNN (overall accuracy of 46.58%), while the best performance using the temporal distribution of named entities for each type is obtained with SVM (overall accuracy of 43.60%). We conclude that this approach is promising for automatically classifying online videos.
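The two feature families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the entity types, the category labels, the `(type, start, end)` annotation format, and the helper names are all assumptions made for illustration, and a small hand-written K-Nearest Neighbour vote stands in for the four classifiers evaluated in the paper.

```python
from collections import Counter
from math import dist

# Hypothetical entity types; the abstract does not list the actual taxonomy.
ENTITY_TYPES = ["Person", "Location", "Organization"]


def entity_distribution(annotations):
    """Share of entity mentions per type (a textual feature vector).

    `annotations` is an assumed format: (entity_type, start_sec, end_sec).
    """
    counts = Counter(etype for etype, _start, _end in annotations)
    total = sum(counts[t] for t in ENTITY_TYPES)
    return [counts[t] / total if total else 0.0 for t in ENTITY_TYPES]


def temporal_distribution(annotations):
    """Share of media-fragment duration per entity type (a temporal feature vector)."""
    durations = Counter()
    for etype, start, end in annotations:
        durations[etype] += end - start
    total = sum(durations[t] for t in ENTITY_TYPES)
    return [durations[t] / total if total else 0.0 for t in ENTITY_TYPES]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; Euclidean distance is
    an assumption, since the abstract does not state the metric used.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _vector, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data with made-up category labels, for illustration only.
video = [("Person", 0.0, 4.0), ("Organization", 4.0, 6.0), ("Person", 6.0, 10.0)]
train = [
    ([1.0, 0.0, 0.0], "news"),
    ([0.9, 0.0, 0.1], "news"),
    ([0.0, 1.0, 0.0], "travel"),
    ([0.1, 0.9, 0.0], "travel"),
]
predicted = knn_predict(train, temporal_distribution(video), k=3)
```

Normalising both vectors to shares keeps videos of different lengths comparable; whether the paper normalises in this way is not stated in the abstract.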
Li, Yunjia
Rizzo, Giuseppe
Garcia, Jose Luis Redondo
Troncy, Raphael
Wald, Mike
Wills, Gary
13 May 2013
Li, Yunjia, Rizzo, Giuseppe, Garcia, Jose Luis Redondo, Troncy, Raphael, Wald, Mike and Wills, Gary (2013) Enriching media fragments with named entities for video classification. First Worldwide Web Workshop on Linked Media (LiME-2013), Rio de Janeiro, Brazil. 13 - 17 May 2013.
Record type: Conference or Workshop Item (Paper)
Text: mfenricher.pdf (Other)
More information
Published date: 13 May 2013
Venue - Dates: First Worldwide Web Workshop on Linked Media (LiME-2013), Rio de Janeiro, Brazil, 2013-05-13 - 2013-05-17
Organisations: Web & Internet Science
Identifiers
Local EPrints ID: 352219
URI: http://eprints.soton.ac.uk/id/eprint/352219
PURE UUID: 20623f62-fb23-4a4a-ba14-f951a319b7a8
Catalogue record
Date deposited: 08 May 2013 13:49
Last modified: 15 Mar 2024 02:51
Contributors
Author: Yunjia Li
Author: Giuseppe Rizzo
Author: Jose Luis Redondo Garcia
Author: Raphael Troncy
Author: Mike Wald
Author: Gary Wills