Content based retrieval and classification of music using polyphonic timbre similarity
Digital technology and the Internet have changed the music industry's landscape. Music has become more accessible, allowing consumers to store and share thousands of items on their computer's hard disk, portable media player, mobile phone and other devices. Recent developments also allow consumers to store digital music on the Internet through cloud storage. Given the large music collections now available, there is a need for new applications that help users browse, organise and discover music and generate playlists.
In previous years, searching for music has been similar to a textual information search. However, this limits music discovery, as it usually requires specific information that may be unknown to the user. This thesis investigates three of the core components of content-based music retrieval: audio features, similarity functions and indexing methods. In the content-based paradigm, audio files are analyzed using their waveform and are represented by high-dimensional features. This study focuses on polyphonic timbre similarity. Polyphonic timbre is the characteristic that allows listeners to differentiate between two music signals or complex instrumental textures with the same perceived pitch and loudness. The different attributes of timbre are examined, and features suitable for music retrieval based on timbre similarity are investigated. Evaluations are performed to compare the performance of these features. To improve overall performance and reduce the undesirable effects of operating in high-dimensional spaces, methods for combining feature spaces are also explored.
A full linear scan of the feature space is impractical for large music collections. Hence, the filter-and-refine method is adopted to expedite the retrieval process. The objective is to filter a dataset by quickly returning a set of candidate songs and then to refine the results using an exact similarity measure. Some novel modifications of the filtering step are made to ensure that the level of performance is maintained. The application of our timbre similarity systems is extended to automatic audio classification. In this paradigm, an unlabeled track is tagged with the label of its nearest track. Finally, the performance of our similarity estimator and audio classifier is validated in the annual Music Information Retrieval Evaluation eXchange (MIREX). The MIREX results show that our techniques are state-of-the-art methods.
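The filter-and-refine retrieval and nearest-track labelling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the features, the filtering rule or the exact similarity measure, so the sketch uses short hypothetical feature vectors, a cheap filter that compares only the first few dimensions, and plain Euclidean distance as the "exact" measure. All function names are hypothetical.

```python
import heapq
import math


def cheap_distance(a, b, dims=2):
    """Filter-step distance (assumption): squared Euclidean distance
    computed on the first `dims` feature dimensions only."""
    return sum((x - y) ** 2 for x, y in zip(a[:dims], b[:dims]))


def exact_distance(a, b):
    """Refine-step distance (assumption): Euclidean distance on all dimensions."""
    return math.dist(a, b)


def filter_and_refine(query, collection, n_candidates=10, k=3):
    """Return indices of the k tracks closest to `query`.

    Filter: keep the n_candidates tracks with the smallest cheap distance.
    Refine: re-rank only those candidates with the exact distance.
    """
    candidates = heapq.nsmallest(
        n_candidates,
        range(len(collection)),
        key=lambda i: cheap_distance(query, collection[i]),
    )
    ranked = sorted(candidates, key=lambda i: exact_distance(query, collection[i]))
    return ranked[:k]


def classify_nearest(query, collection, labels, n_candidates=10):
    """Tag the query with the label of its nearest track."""
    nearest = filter_and_refine(query, collection, n_candidates, k=1)[0]
    return labels[nearest]


# Toy data: four tracks with 4-dimensional hypothetical feature vectors.
tracks = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.1, 0.0],
    [5.0, 5.0, 5.0, 5.0],
    [5.1, 5.0, 4.9, 5.0],
]
labels = ["jazz", "jazz", "rock", "rock"]
query = [4.8, 5.2, 5.0, 5.1]

print(classify_nearest(query, tracks, labels, n_candidates=2))
```

The trade-off in such a scheme is that a smaller candidate set makes retrieval faster but increases the chance that the true nearest track is discarded in the filter step, which is presumably why the filtering step is the part that needs care.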
De Leon, Franz
May 2014
Martinez, Kirk
De Leon, Franz (2014) Content based retrieval and classification of music using polyphonic timbre similarity. University of Southampton, Physical Sciences and Engineering, Doctoral Thesis, 247pp.
Record type: Thesis (Doctoral)
More information
Published date: May 2014
Organisations: University of Southampton, Web & Internet Science
Identifiers
Local EPrints ID: 368591
URI: http://eprints.soton.ac.uk/id/eprint/368591
PURE UUID: b02c6e07-1d5d-43f7-a8f6-d26978bf4674
Catalogue record
Date deposited: 24 Oct 2014 12:24
Last modified: 15 Mar 2024 02:53
Contributors
Author: Franz De Leon
Thesis advisor: Kirk Martinez