Rudd-Orthner, R. orcid.org/0000-0002-2534-0920 and Mihaylova, L. orcid.org/0000-0001-5856-2223 (2021) Non-random weight initialisation in deep convolutional networks applied to safety critical artificial intelligence. In: 2020 13th International Conference on Developments in eSystems Engineering (DeSE). International Conference on Developments in eSystems Engineering (DeSE), 14-17 Dec 2020, Liverpool, UK. IEEE, Liverpool, United Kingdom, pp. 1-8. ISBN 9781665422383
Abstract
This paper presents a non-random weight initialisation scheme for convolutional neural network layers. It builds upon previous work that was limited to perceptron layers, in which repeatable determinism was achieved with equal categorisation accuracy between the established random scheme and a linear ramp non-random scheme. This work, however, addresses convolutional layers, which are the layers that have been responsible for better-than-human performance in image recognition. The previous perceptron work found that the number range was more important than the gradient; however, that was due to the fully connected nature of dense layers. In convolutional layers, by contrast, there is an order direction implied, and the weights relate to filters rather than image pixel positions, so the weight initialisation is more complex. Nevertheless, the paper demonstrates better performance than the currently established random schemes with convolutional layers. The proposed method also induces earlier learning through the use of striped forms, and as such requires less unlearning of the traditionally speckled random forms. The proposed scheme also provides higher accuracy in a single learning session, with improvements of 3.35% un-shuffled and 2.813% shuffled in the first epoch, and 0.521% over the 5 epochs of the model. Of these, the first epoch is the more relevant, as it is the epoch immediately after initialisation. The proposed method is also repeatable and deterministic, which is a desirable quality for safety-critical applications within image classification. Finally, the proposed method is robust to He initialisation values, and scored 97.55% accuracy compared to 96.929% accuracy with Glorot/Xavier in the traditional random forms, for which the benchmark model was originally optimised.
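As a rough illustration of what a deterministic, linear-ramp initialisation for a convolutional kernel could look like, the following is a minimal sketch. It is not the authors' scheme: the function name `ramp_conv_init`, the `(height, width, in_channels, out_channels)` kernel layout, the ordering of the ramp across the kernel, and the choice of the usual Glorot and He uniform limits as the number range are all assumptions made for the example, and it does not attempt to reproduce the striped forms described in the abstract. It only shows the two properties the abstract emphasises: a bounded number range and repeatability without a random seed.

```python
import numpy as np

def ramp_conv_init(kernel_shape, scheme="glorot"):
    """Hypothetical deterministic initialiser: a linear ramp within
    the Glorot or He uniform limit, reshaped to a conv kernel."""
    kh, kw, in_ch, out_ch = kernel_shape  # assumed layout (H, W, in, out)
    fan_in = kh * kw * in_ch
    fan_out = kh * kw * out_ch

    # Standard uniform-distribution limits, used here only as a number range.
    if scheme == "glorot":
        limit = np.sqrt(6.0 / (fan_in + fan_out))
    elif scheme == "he":
        limit = np.sqrt(6.0 / fan_in)
    else:
        raise ValueError(f"unknown scheme: {scheme}")

    # Evenly spaced values from -limit to +limit; no random number generator.
    n_weights = kh * kw * in_ch * out_ch
    ramp = np.linspace(-limit, limit, n_weights)

    # Row-major reshape; the ordering across filters is an assumption.
    return ramp.reshape(kernel_shape)

# Two calls give identical weights, unlike a seeded or unseeded random draw.
w_a = ramp_conv_init((3, 3, 1, 8))
w_b = ramp_conv_init((3, 3, 1, 8))
print(np.array_equal(w_a, w_b))
```

Because the values depend only on the kernel shape and the chosen limit, repeated training runs start from exactly the same weights, which is the repeatability property the abstract highlights for safety-critical use.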
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Rudd-Orthner, R.; Mihaylova, L. |
Copyright, Publisher and Additional Information: | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy. |
Keywords: | Repeatable; Weight Initialization; Information Assurance; Convolutional Layers |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Automatic Control and Systems Engineering (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 12 Jul 2021 13:39 |
Last Modified: | 21 Jun 2022 11:17 |
Status: | Published |
Publisher: | IEEE |
Refereed: | Yes |
Identification Number: | 10.1109/DeSE51703.2020.9450242 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:176089 |