Hankala, T., Hannula, M., Kontinen, J. et al. (1 more author) (2024) Complexity of neural network training and ETR: extensions with effectively continuous functions. In: Proceedings of the 38th AAAI Conference on Artificial Intelligence. 38th Annual AAAI Conference on Artificial Intelligence, 20-27 Feb 2024, Vancouver, Canada. Association for the Advancement of Artificial Intelligence, pp. 12278-12285. ISBN 9781577358879
Abstract
The training problem of neural networks (NNs) is known to be ∃R-complete with respect to ReLU and linear activation functions. We show that the training problem for NNs equipped with arbitrary activation functions is polynomial-time bireducible to the existential theory of the reals extended with the corresponding activation functions. For effectively continuous activation functions (e.g., the sigmoid function), we obtain an inclusion in low levels of the arithmetical hierarchy. Consequently, the sigmoid activation function leads to the existential theory of the reals with the exponential function, and hence the decidability of training NNs using the sigmoid activation function is equivalent to the decidability of the existential theory of the reals with the exponential function, a long-standing open problem. In contrast, we show that the training problem is undecidable if sinusoidal activation functions are considered.
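To make the reduction concrete, here is an illustrative sketch (not taken from the paper) of how a minimal training instance, a single ReLU neuron fitted to one data point (x₀, y₀) within error ε, can be phrased as a sentence of the existential theory of the reals; the names w, b, u, y are hypothetical variables for the weight, bias, pre-activation, and output:

```latex
% Single ReLU neuron y = ReLU(w x + b), one sample (x_0, y_0), error bound eps.
% The ReLU is eliminated by a case split on the sign of the pre-activation u,
% so the whole sentence stays inside plain ETR (polynomial (in)equalities only).
\exists w \, \exists b \, \exists u \, \exists y \Big(
    u = w x_0 + b
    \;\wedge\;
    \big( (u \ge 0 \wedge y = u) \vee (u < 0 \wedge y = 0) \big)
    \;\wedge\;
    (y - y_0)^2 \le \varepsilon^2
\Big)
```

Replacing ReLU by the sigmoid would force a term of the form e^u into the constraint on y, which is why sigmoid training corresponds to the existential theory of the reals extended with the exponential function rather than to plain ETR.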
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2024 The Authors. Except as otherwise noted, this author-accepted version of a paper published in Proceedings of the 38th AAAI Conference on Artificial Intelligence is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
Keywords: | ML: Deep Neural Architectures and Foundation Models; ML: Deep Learning Theory |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Deutsche Forschungsgemeinschaft, grant 432788559 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 21 Dec 2023 09:30 |
Last Modified: | 04 Apr 2024 10:53 |
Status: | Published |
Publisher: | Association for the Advancement of Artificial Intelligence |
Refereed: | Yes |
Identification Number: | 10.1609/aaai.v38i11.29118 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:206869 |