Winkler, J.R. (2026) Overfitting, regularisation and condition estimation in regression. International Journal of Data Science and Analytics, 22. 13. ISSN: 2364-415X
Abstract
Overfitting is a problem in regression and deep neural networks, and it is often stated that Tikhonov regularisation minimises its adverse effects, but the relationship between regularisation and overfitting has not been established. The theory of regularisation is well developed, but overfitting has a qualitative description and it is not defined mathematically. This paper addresses the relationship between overfitting, regularisation and condition estimation by considering underdetermined and overdetermined least squares (LS) problems that arise in regression. This study is important because regularisation is not benign since its use when a condition on the decay of the singular values of the coefficient matrix in the LS minimisation is not satisfied leads to a large error in the solution of the regularised LS problem. Examples in which the regression curve overfits the data are shown, but regularisation must not be applied because the LS problem is well conditioned. Also, an ill conditioned LS problem whose solution does not display overfitting is shown, but its ill conditioned nature implies regularisation should be applied in order to obtain a numerically stable solution. It is concluded that regularisation does not solve the problem of overfitting in regression.
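The abstract treats the conditioning of an LS problem and Tikhonov regularisation as separate questions. The following is a minimal sketch of how each quantity can be computed for a small polynomial regression problem; it is not taken from the paper. The data, the polynomial degree, the regularisation parameter `lam` and the helper names `condition_number` and `tikhonov_solve` are all assumptions made for illustration.

```python
import numpy as np


def condition_number(A):
    """2-norm condition number: largest over smallest singular value of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]


def tikhonov_solve(A, b, lam):
    """Minimise ||A x - b||^2 + lam^2 ||x||^2 via the SVD of A.

    With lam = 0 this reduces to the ordinary LS solution
    (assuming A has full column rank).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Filter factors s_i / (s_i^2 + lam^2) damp the components
    # associated with small singular values.
    coeffs = (s / (s**2 + lam**2)) * (U.T @ b)
    return Vt.T @ coeffs


# Hypothetical data: 20 samples of a smooth curve plus a small perturbation.
t = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * t) + 0.05 * np.cos(17.0 * t)

# Overdetermined LS problem: degree-5 polynomial in the monomial basis.
A = np.vander(t, 6, increasing=True)
lam = 1e-1  # assumed regularisation parameter, chosen arbitrarily

x_ls = tikhonov_solve(A, y, lam=0.0)   # unregularised LS solution
x_reg = tikhonov_solve(A, y, lam=lam)  # Tikhonov-regularised solution

print("condition number of A:", condition_number(A))
print("||x_ls||  =", np.linalg.norm(x_ls))
print("||x_reg|| =", np.linalg.norm(x_reg))
```

The condition number depends only on the coefficient matrix `A`, whereas whether a fitted curve overfits also depends on the data `y` and the chosen model, which is why the two properties need not coincide.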
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Winkler, J.R. |
| Copyright, Publisher and Additional Information: | © The Author(s) 2025. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
| Keywords: | Regression; Overfitting; Regularisation; Condition estimation |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 08 Jan 2026 11:10 |
| Last Modified: | 08 Jan 2026 11:10 |
| Status: | Published |
| Publisher: | Springer |
| Refereed: | Yes |
| Identification Number: | 10.1007/s41060-025-00906-9 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:236294 |
Download
Filename: s41060-025-00906-9.pdf
Licence: CC-BY 4.0