The University of Southampton
University of Southampton Institutional Repository

Enriching media fragments with named entities for video classification


Li, Yunjia, Rizzo, Giuseppe, Garcia, Jose Luis Redondo, Troncy, Raphael, Wald, Mike and Wills, Gary (2013) Enriching media fragments with named entities for video classification. First Worldwide Web Workshop on Linked Media (LiME-2013), Rio de Janeiro, Brazil. 13 - 17 May 2013.

Record type: Conference or Workshop Item (Paper)

Abstract

With the steady increase in videos published on media sharing platforms such as Dailymotion and YouTube, more and more effort is being spent on automatically annotating and organizing these videos. In this paper, we propose a framework for classifying video items using both textual features, such as named entities extracted from subtitles, and temporal features, such as the duration of the media fragments in which particular entities are spotted. We implement four machine learning algorithms for multiclass classification, namely Logistic Regression (LG), K-Nearest Neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM). We study the temporal distribution patterns of named entities extracted from 805 Dailymotion videos. The results show that the best performance using the entity distribution is obtained with KNN (overall accuracy of 46.58%), while the best performance using the temporal distribution of named entities for each type is obtained with SVM (overall accuracy of 43.60%). We conclude that this approach is promising for automatically classifying online videos.
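
The kind of pipeline the abstract describes can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each video is represented by a list of (entity type, fragment start, fragment end) annotations, that the feature vector is the per-type entity count followed by the per-type total fragment duration in seconds, and that a plain Euclidean-distance KNN is used. The entity types, category labels, and toy data are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical entity types; the type set used in the paper is not stated here.
ENTITY_TYPES = ["Person", "Location", "Organization"]


def features(annotations):
    """Map (entity_type, start_sec, end_sec) annotations to a feature vector.

    The vector holds the number of entities per type, followed by the
    total duration (seconds) of the media fragments per type.
    """
    counts = Counter()
    durations = Counter()
    for etype, start, end in annotations:
        counts[etype] += 1
        durations[etype] += end - start
    return [counts[t] for t in ENTITY_TYPES] + [durations[t] for t in ENTITY_TYPES]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set with made-up categories (assumption, not the paper's data).
train = [
    (features([("Person", 0.0, 4.0), ("Person", 10.0, 13.0)]), "news"),
    (features([("Person", 2.0, 5.0), ("Organization", 6.0, 9.0)]), "news"),
    (features([("Location", 0.0, 20.0), ("Location", 30.0, 55.0)]), "travel"),
    (features([("Location", 5.0, 35.0)]), "travel"),
]

query = features([("Location", 1.0, 25.0), ("Person", 40.0, 42.0)])
predicted = knn_predict(train, query, k=3)
print(predicted)
```

In this toy setting the query video is dominated by a long Location fragment, so its nearest neighbours are mostly the hypothetical "travel" videos. Features on very different scales (counts versus seconds) would normally be normalised before computing distances; that step is omitted here for brevity.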

Text
mfenricher.pdf - Other
Download (1MB)

More information

Published date: 13 May 2013
Venue - Dates: First Worldwide Web Workshop on Linked Media (LiME-2013), Rio de Janeiro, Brazil, 2013-05-13 - 2013-05-17
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 352219
URI: http://eprints.soton.ac.uk/id/eprint/352219
PURE UUID: 20623f62-fb23-4a4a-ba14-f951a319b7a8
ORCID for Gary Wills: orcid.org/0000-0001-5771-4088

Catalogue record

Date deposited: 08 May 2013 13:49
Last modified: 15 Mar 2024 02:51

Contributors

Author: Yunjia Li
Author: Giuseppe Rizzo
Author: Jose Luis Redondo Garcia
Author: Raphael Troncy
Author: Mike Wald
Author: Gary Wills
