The University of Southampton
University of Southampton Institutional Repository

Optimisation classification on the web of data using linked data. A study case: Movie popularity classification

Budiprasetyo, Gunawan
4783cf01-8fff-4e52-8d51-339ddf7221e6
Hall, Wendy
11f7f8db-854c-4481-b1ae-721a51d8790c

Budiprasetyo, Gunawan (2019) Optimisation classification on the web of data using linked data. A study case: Movie popularity classification. University of Southampton, Doctoral Thesis, 360pp.

Record type: Thesis (Doctoral)

Abstract

Data mining algorithms have been widely used to build prediction models in the movie domain. Classification problems, especially predicting the future success of movies, have attracted many researchers seeking efficient ways to address them. However, movie popularity classification has become very complicated because it depends on many parameters, each to a different degree. In this thesis, we review a broad range of literature on (1) the movie prediction domain, identifying related data, a main data source, and additional data sources to address these problems; and (2) data mining algorithms for building more robust classification models to predict movie popularity. To obtain a robust movie popularity classification model, three experiments were conducted. The first experiment examined five single classifiers (Artificial Neural Network (ANN), Decision Tree (DT), k-NN, Rule Induction (RI) and SVM Polynomial) for developing classification models to predict the future success of movies. The second experiment assessed the use of wrapper-type feature selection algorithms to develop classification models of movie popularity. The last experiment scrutinized two ensemble methods, bagging and boosting, in classifying movie popularity.
Based upon the findings and analysis, this thesis contributes in four areas: (1) it demonstrates the capability of linked data to retrieve external movie-related data sources and shows how additional attributes from external data sources can be used to improve the performance of a classification model based on a single data source; (2) it presents the use of Grid Search to obtain a set of optimal hyper-parameters for the Artificial Neural Network (ANN), Decision Tree (DT), Rule Induction (RI) and SVM Polynomial classifiers so as to get a more robust classification model; (3) it proves the use of wrapper-type feature selection with a Genetic Algorithm suited to those classifiers, using either default or optimized parameters, in order to get a robust classification model; and (4) it establishes the use of ensemble methods (bagging and boosting) with those classifiers, using either default or optimized parameters, to the same end.
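
As a rough illustration of the bagging idea named in contribution (4), the sketch below trains several depth-1 decision trees (stumps) on bootstrap samples and combines them by majority vote. It is not the thesis's implementation: the function names, the two numeric features, the "low"/"high" popularity labels and the toy data are all hypothetical, chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 decision tree: choose the (feature, threshold)
    split that makes the fewest errors on the training data."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            # 'left' is never empty because t is taken from the data.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: train one stump per bootstrap sample, predict by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        # Bootstrap sample: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(model(row) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: two made-up numeric attributes per movie.
X = [[1, 2], [2, 1], [2, 3], [3, 2], [7, 8], [8, 7], [8, 9], [9, 8]]
y = ["low"] * 4 + ["high"] * 4

predict = fit_bagging(X, y)
print(predict([1, 1]), predict([9, 9]))
```

Boosting differs in that each new model is trained with more weight on the examples the previous models misclassified, rather than on independent bootstrap samples.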

Text
GunawanBudiprasetyoFinalThesisWithThesisCopyrightDeclaration - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: March 2019

Identifiers

Local EPrints ID: 433553
URI: http://eprints.soton.ac.uk/id/eprint/433553
PURE UUID: 2d4263e7-8c2a-41e6-a210-08512fffdd9e
ORCID for Wendy Hall: orcid.org/0000-0003-4327-7811

Catalogue record

Date deposited: 27 Aug 2019 16:30
Last modified: 07 Sep 2019 00:40

Contributors

Author: Gunawan Budiprasetyo
Thesis advisor: Wendy Hall
