Rank selection in non-negative matrix factorization using minimum description length
2164-2176
Squires, Steven
Prugel-Bennett, Adam
Niranjan, Mahesan
August 2017
Squires, Steven, Prugel-Bennett, Adam and Niranjan, Mahesan (2017) Rank selection in non-negative matrix factorization using minimum description length. Neural Computation, 29 (8), 2164-2176. (doi:10.1162/NECO_a_00980).
Abstract
Nonnegative matrix factorization (NMF) is primarily a linear dimensionality reduction technique that factorizes a nonnegative data matrix into two smaller nonnegative matrices: one that represents the basis of the new subspace and a second that holds the coefficients of all the data points in that subspace. In principle, the nonnegativity constraint forces the representation to be sparse and parts based. Instead of holistic features, real parts are extracted from the data, which should be significantly easier to interpret and analyze. The size of the new subspace determines how many features will be extracted from the data. An effective choice should minimize the noise while extracting the key features. We propose a mechanism for selecting the subspace size by using a minimum description length technique. We demonstrate that our technique provides plausible estimates for real data and accurately predicts the known size of synthetic data. We provide an implementation of our method in MATLAB.
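To make the factorization described in the abstract concrete, the following is a minimal sketch in Python, not the authors' MATLAB implementation. It assumes the standard multiplicative update rules for the squared (Frobenius) error, which the abstract does not specify, and it uses plain Python lists to stay dependency-free. The function names `nmf` and `squared_error` and the small matrix `X` are hypothetical and chosen only for illustration. The sketch does not compute a description length; it only shows how the reconstruction error behaves as the subspace size `k` changes.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def nmf(X, k, iters=500, seed=0, eps=1e-9):
    """Approximate nonnegative X (m x n) as W (m x k) times H (k x n).

    Assumption: multiplicative updates for the squared error, swapped in
    here for illustration; the paper's own solver may differ.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Strictly positive random start so the updates are well defined.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return W, H


def squared_error(X, W, H):
    """Sum of squared differences between X and its reconstruction W H."""
    R = matmul(W, H)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


# Hypothetical 4 x 4 nonnegative matrix built from two nonnegative parts.
X = [[1, 2, 0, 1], [0, 1, 3, 1], [1, 3, 3, 2], [2, 4, 0, 2]]

# Reconstruction error for several candidate subspace sizes.
errors = {k: squared_error(X, *nmf(X, k)) for k in (1, 2, 3)}
```

Adding components generally keeps lowering the reconstruction error, so the error alone cannot pick the subspace size. A criterion such as the description length proposed in the paper is needed to balance fit against model size.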
Text
Rank Selection in Non-negative Matrix Factorization - Accepted Manuscript
More information
Accepted/In Press date: 1 March 2017
e-pub ahead of print date: 17 July 2017
Published date: August 2017
Identifiers
Local EPrints ID: 413094
URI: http://eprints.soton.ac.uk/id/eprint/413094
PURE UUID: 4150db3c-19aa-4aa7-8e0a-3a8b7762c680
Catalogue record
Date deposited: 15 Aug 2017 16:30
Last modified: 16 Mar 2024 03:55
Contributors
Author: Steven Squires
Author: Adam Prugel-Bennett
Author: Mahesan Niranjan