Fast projection onto the capped simplex with applications to sparse regression in bioinformatics
9990-9999
Neural Information Processing Systems Foundation
Ang, Man Shun
Ma, Jianzhu
Liu, Nianjun
Huang, Kun
Wang, Yijie
2021
Ang, Man Shun, Ma, Jianzhu, Liu, Nianjun, Huang, Kun and Wang, Yijie
(2021)
Fast projection onto the capped simplex with applications to sparse regression in bioinformatics.
Ranzato, Marc'Aurelio, Beygelzimer, Alina, Dauphin, Yann, Liang, Percy S. and Wortman Vaughan, Jenn
(eds.)
In Advances in Neural Information Processing Systems 34: 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
Neural Information Processing Systems Foundation.
Record type:
Conference or Workshop Item
(Paper)
Abstract
We consider the problem of projecting a vector onto the so-called k-capped simplex, which is a hypercube cut by a hyperplane. For an n-dimensional input vector with bounded elements, we find that a simple algorithm based on Newton’s method solves the projection problem to high precision with a complexity of roughly O(n), a much lower computational cost than that of the existing sorting-based methods in the literature. We provide a theory that partially explains and justifies the method. We demonstrate that the proposed algorithm produces high-precision solutions of the projection problem on large-scale datasets and significantly outperforms state-of-the-art methods in runtime (about 6-8 times faster than a commercial software package in CPU time for input vectors with 1 million variables or more). We further illustrate the effectiveness of the proposed algorithm on sparse regression in a bioinformatics problem. Empirical results on the GWAS dataset (with 1,500,000 single-nucleotide polymorphisms) show that, when the proposed method is used to accelerate the Projected Quasi-Newton (PQN) method, the accelerated PQN algorithm can handle huge-scale regression problems and is more efficient (about 3-6 times faster) than the current state-of-the-art methods.
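The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the k-capped simplex is the set {x : 0 <= x_i <= 1, sum(x) = k}, for which the Euclidean projection of y has the form x_i = clip(y_i - gamma, 0, 1) for a scalar shift gamma. The sketch finds gamma by Newton steps on the piecewise-linear residual sum(x) - k, with a bisection fallback added here as a safeguard. The function name, parameters, and tolerances are hypothetical.

```python
def project_capped_simplex(y, k, tol=1e-10, max_iter=100):
    """Sketch: project y onto {x : 0 <= x_i <= 1, sum(x) = k}.

    Assumption (not taken from the record): the projection is
    x_i = clip(y_i - gamma, 0, 1), and gamma is found by a
    safeguarded Newton iteration on the scalar residual.
    """
    n = len(y)
    if n == 0 or not 0 <= k <= n:
        raise ValueError("need a non-empty y and 0 <= k <= len(y)")

    def clip01(v):
        return min(1.0, max(0.0, v))

    # Bracket for gamma: at lo every entry clips to 1 (sum = n >= k),
    # at hi every entry clips to 0 (sum = 0 <= k).
    lo, hi = min(y) - 1.0, max(y)
    gamma = 0.5 * (lo + hi)

    for _ in range(max_iter):
        shifted = [v - gamma for v in y]
        residual = sum(clip01(s) for s in shifted) - k
        if abs(residual) <= tol:
            break
        # The residual is non-increasing in gamma, so shrink the bracket.
        if residual > 0:
            lo = gamma
        else:
            hi = gamma
        # Derivative of the residual: minus the number of unclipped entries.
        slope = -sum(1 for s in shifted if 0.0 < s < 1.0)
        candidate = gamma - residual / slope if slope else None
        # Fall back to bisection if Newton is undefined or leaves the bracket.
        if candidate is None or not lo <= candidate <= hi:
            candidate = 0.5 * (lo + hi)
        gamma = candidate

    return [clip01(v - gamma) for v in y]


if __name__ == "__main__":
    x = project_capped_simplex([0.9, 0.8, 0.1, -0.3, 1.7], k=2)
    print(x, sum(x))
```

Each iteration is a single pass over the vector, which is where a cost linear in n per step comes from; no sorting is involved. For the example input the result is approximately [0.55, 0.45, 0, 0, 1], whose entries sum to 2.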
More information
Published date: 2021
Additional Information:
Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
Venue - Dates:
35th Conference on Neural Information Processing Systems, NeurIPS 2021, Virtual, Online, 2021-12-06 - 2021-12-14
Identifiers
Local EPrints ID: 495224
URI: http://eprints.soton.ac.uk/id/eprint/495224
PURE UUID: 9da9b45e-10fa-4e0f-a003-756f2545fa54
Catalogue record
Date deposited: 01 Nov 2024 18:15
Last modified: 02 Nov 2024 03:08
Contributors
Author:
Man Shun Ang
Author:
Jianzhu Ma
Author:
Nianjun Liu
Author:
Kun Huang
Author:
Yijie Wang
Editor:
Marc'Aurelio Ranzato
Editor:
Alina Beygelzimer
Editor:
Yann Dauphin
Editor:
Percy S. Liang
Editor:
Jenn Wortman Vaughan