The University of Southampton
University of Southampton Institutional Repository

Alternating maximization: unifying framework for 8 sparse PCA formulations and efficient parallel codes

Richtárik, Peter, Jahani, Majid, Ahipasaoglu, Selin Damla and Takáč, Martin (2020) Alternating maximization: unifying framework for 8 sparse PCA formulations and efficient parallel codes. Optimization and Engineering. (doi:10.1007/s11081-020-09562-3).

Record type: Article

Abstract

Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via the alternating maximization (AM) method. We show that AM is equivalent to GPower for all formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at (1) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), (2) obtaining solutions explaining more variance and (3) dealing with big data problems (our cluster code can solve a 357 GB problem in a minute).
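As an illustration of the alternating maximization idea named in the abstract, the following is a minimal serial sketch for one of the eight formulations (L2 variance with an L0 constraint), that is, maximizing ||Ax||_2 subject to ||x||_2 <= 1 and ||x||_0 <= s. The record does not spell out the update rules, so the two steps below (normalize Ax, then hard-threshold and normalize A^T y) are an assumption based on the standard treatment of this formulation. The names `am_sparse_pca`, `hard_threshold` and `normalize` are hypothetical, and this is not the authors' parallel code.

```python
import math
import random


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero the rest."""
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:s]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def normalize(v):
    """Scale v to unit Euclidean norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v] if norm > 0 else list(v)


def am_sparse_pca(A, s, iters=100, seed=0):
    """Sketch of AM for: max ||A x||_2  s.t. ||x||_2 <= 1, ||x||_0 <= s.

    A is a list of n rows with p entries each (observations by variables).
    Returns the sparse loading vector x and the value ||A x||_2.
    """
    n, p = len(A), len(A[0])
    rng = random.Random(seed)
    # Random s-sparse unit starting point (an assumed initialization).
    x = normalize(hard_threshold([rng.gauss(0, 1) for _ in range(p)], s))
    for _ in range(iters):
        # y-step: for fixed x, y^T A x over the unit ball is maximized at A x / ||A x||.
        y = normalize([sum(A[i][j] * x[j] for j in range(p)) for i in range(n)])
        # x-step: for fixed y, keep the s largest entries of A^T y and renormalize.
        z = [sum(A[i][j] * y[i] for i in range(n)) for j in range(p)]
        x = normalize(hard_threshold(z, s))
    explained = math.sqrt(
        sum(sum(A[i][j] * x[j] for j in range(p)) ** 2 for i in range(n))
    )
    return x, explained
```

Calling `am_sparse_pca(A, s=1)` on a small matrix whose first column dominates should return a loading vector supported on that column alone. Because the method is a local search, running it from several starting points (different `seed` values) and keeping the best result is one simple way to look for solutions that explain more variance.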

This record has no associated files available for download.

More information

Accepted/In Press date: 2020
e-pub ahead of print date: 7 September 2020
Published date: 22 September 2020
Additional Information: Publisher Copyright: © 2020, Springer Science+Business Media, LLC, part of Springer Nature. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.
Keywords: Alternating maximization, Big data analytics, GPower, Sparse PCA, Unsupervised learning

Identifiers

Local EPrints ID: 444041
URI: http://eprints.soton.ac.uk/id/eprint/444041
ISSN: 1389-4420
PURE UUID: 4f4f8e77-e71d-416f-9c02-83eab8e50627
ORCID for Selin Damla Ahipasaoglu: orcid.org/0000-0003-1371-315X

Catalogue record

Date deposited: 23 Sep 2020 16:31
Last modified: 17 Mar 2024 04:03

Contributors

Author: Peter Richtárik
Author: Majid Jahani
Author: Selin Damla Ahipasaoglu
Author: Martin Takáč
