The University of Southampton
University of Southampton Institutional Repository

A review of NMF, PLSA, LBA, EMA, and LCA with a focus on the identifiability issue

arXiv
Qi, Qianqian
van der Heijden, Peter G.M.

Record type: UNSPECIFIED

Abstract

Across fields such as machine learning, social science, and geography, considerable attention has been given to models that factorize a nonnegative matrix into the product of two or three matrices, subject to nonnegativity or row-sum-to-1 constraints. Although these models are to a large extent similar or even equivalent, they are presented under different names, and their similarity is not well known. This paper highlights similarities among five popular models: latent budget analysis (LBA), latent class analysis (LCA), end-member analysis (EMA), probabilistic latent semantic analysis (PLSA), and nonnegative matrix factorization (NMF). We focus on an essential issue of these models, identifiability, and prove that the solution of LBA, EMA, LCA, and PLSA is unique if and only if the solution of NMF is unique. We also provide a brief review of algorithms for these models. We illustrate the models with a time budget dataset from social science, and end the paper with a discussion of closely related models such as archetypal analysis.
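
The identifiability issue named in the abstract can be illustrated with a small numerical sketch. This example is not taken from the paper: the matrices W, H, and S and the helper `mix_factors` are hypothetical, chosen only to show that two different pairs of nonnegative, row-sum-to-1 factors can reproduce the same data matrix.

```python
import numpy as np


def mix_factors(W, H, S):
    """Return (W @ S, inv(S) @ H), an alternative factor pair with the same product.

    Assumption for this sketch: S is an invertible square matrix whose rows sum to 1.
    """
    return W @ S, np.linalg.inv(S) @ H


# Hypothetical factors: rows of W and rows of H are nonnegative and sum to 1.
W = np.array([[0.8, 0.2],
              [0.5, 0.5],
              [0.1, 0.9]])
H = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]])

# A mild mixing matrix with rows summing to 1 (illustrative choice).
S = np.array([[0.9, 0.1],
              [0.1, 0.9]])

W2, H2 = mix_factors(W, H, S)

X = W @ H      # data matrix implied by the first factor pair
X2 = W2 @ H2   # data matrix implied by the second factor pair

same_product = np.allclose(X, X2)
still_valid = bool((W2 >= 0).all() and (H2 >= 0).all()
                   and np.allclose(W2.sum(axis=1), 1.0)
                   and np.allclose(H2.sum(axis=1), 1.0))
factors_differ = not np.allclose(W, W2)

print(same_product, still_valid, factors_differ)
```

Here both factor pairs satisfy the constraints and give the same product, so the factorization of this particular matrix is not unique. Whether such a mixing matrix exists for a given dataset is the kind of question the identifiability results address.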

Text
2512.22282v1 - Author's Original
Available under License Other.
Download (641kB)

More information

Accepted/In Press date: 25 December 2025
Keywords: stat.ML, cs.LG, math.OC, math.ST

Identifiers

Local EPrints ID: 509043
URI: http://eprints.soton.ac.uk/id/eprint/509043
PURE UUID: b6f5ecd9-0c41-4164-afae-8b8de27300d9
ORCID for Peter G.M. van der Heijden: orcid.org/0000-0002-3345-096X

Catalogue record

Date deposited: 10 Feb 2026 17:49
Last modified: 11 Feb 2026 02:47

Contributors

Author: Qianqian Qi
Author: Peter G.M. van der Heijden
