Optimal deep neural networks by maximization of the approximation power
Calvo-Pardo, Hector
Mancini, Tullio
Olmo, Jose
August 2023
Calvo-Pardo, Hector, Mancini, Tullio and Olmo, Jose (2023) Optimal deep neural networks by maximization of the approximation power. Computers and Operations Research, 156, [106264]. (doi:10.1016/j.cor.2023.106264).
Abstract
We propose an optimal architecture for deep neural networks of a given size. The optimal architecture is obtained by maximizing the lower bound on the maximum number of linear regions approximated by a deep neural network with a ReLU activation function. The accuracy of the approximation depends on the neural network structure, characterized by the number of nodes and the dependence and hierarchy among them within and across layers. We show how the accuracy of the approximation improves as the width and depth of the network are chosen optimally. A Monte Carlo simulation exercise shows that the optimized architecture outperforms cross-validation methods and grid search for linear and nonlinear prediction models. An application of this methodology to the Boston Housing dataset empirically confirms that our method outperforms state-of-the-art machine learning models.
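The architecture choice described in the abstract can be illustrated with the classical lower bound of Montufar et al. (2014) on the number of linear regions computed by a deep ReLU network. That the paper maximizes exactly this bound is an assumption here, and the equal-width search below is only a minimal Python sketch of the idea, not the authors' procedure.

from math import comb, floor

def region_lower_bound(n0, widths):
    """Lower bound (Montufar et al., 2014) on the number of linear
    regions of a ReLU network with input dimension n0 and hidden-layer
    widths `widths`; assumes every width is at least n0."""
    *earlier, last = widths
    prod = 1
    for n in earlier:
        prod *= floor(n / n0) ** n0
    return prod * sum(comb(last, j) for j in range(n0 + 1))

def best_equal_width_architecture(n0, total_neurons, max_depth=8):
    # Hypothetical helper: scans equal-width splits of a fixed budget
    # of hidden neurons and keeps the depth that maximizes the bound.
    best_bound, best_arch = -1, None
    for depth in range(1, max_depth + 1):
        width = total_neurons // depth
        if width < n0:  # the bound degenerates below the input width
            continue
        arch = [width] * depth
        bound = region_lower_bound(n0, arch)
        if bound > best_bound:
            best_bound, best_arch = bound, arch
    return best_arch, best_bound

# Example: allocate a budget of 64 hidden neurons for a 4-dimensional input.
arch, bound = best_equal_width_architecture(n0=4, total_neurons=64)
print(arch, bound)  # deeper, narrower splits win for this budget

For a fixed budget of neurons, the bound grows multiplicatively with depth but only polynomially with width, which is why the search above tends to favor deeper, narrower networks, consistent with the trade-off between width and depth that the abstract describes.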
Text: 1-s2.0-S0305054823001284-main (Version of Record)
More information
Accepted/In Press date: 24 April 2023
e-pub ahead of print date: 28 April 2023
Published date: August 2023
Additional Information:
Funding Information:
Jose Olmo acknowledges financial support from project PID2019-104326GB-I00 from Ministerio de Ciencia e Innovación and from Fundación Agencia Aragonesa para la Investigación y el Desarrollo (ARAID).
Keywords:
artificial intelligence, data science, feedforward neural networks, forecasting, machine learning
Identifiers
Local EPrints ID: 477462
URI: http://eprints.soton.ac.uk/id/eprint/477462
ISSN: 0305-0548
PURE UUID: c8ae1bbe-c240-4aaf-964c-2eaf1d5855a9
Catalogue record
Date deposited: 06 Jun 2023 17:09
Last modified: 06 Jun 2024 01:51