Non-random weight initialisation in deep convolutional networks applied to safety critical artificial intelligence

Rudd-Orthner, R. orcid.org/0000-0002-2534-0920 and Mihaylova, L. orcid.org/0000-0001-5856-2223 (2021) Non-random weight initialisation in deep convolutional networks applied to safety critical artificial intelligence. In: 2020 13th International Conference on Developments in eSystems Engineering (DeSE). International Conference on Developments in eSystems Engineering (DESE), 14-17 Dec 2020, Liverpool, UK. IEEE, Liverpool, United Kingdom, pp. 1-8. ISBN 9781665422383

Abstract

This paper presents a non-random weight initialisation scheme for convolutional neural network layers. It builds upon previous work that was limited to perceptron layers, in which repeatable determinism was achieved with equality in categorisation accuracy between the established random scheme and a linear ramp non-random scheme. This work, however, addresses convolutional layers, which are the layers responsible for better-than-human performance in image recognition. The previous perceptron work found that the number range was more important than the gradient; that, however, was due to the fully connected nature of dense layers. In convolutional layers, by contrast, an order direction is implied and the weights relate to filters rather than image pixel positions, so weight initialisation is more complex. Nevertheless, the paper demonstrates better performance than the currently established random schemes with convolutional layers. The proposed method also induces earlier learning through the use of striped forms, and as such requires less unlearning of the traditionally speckled random forms. The proposed scheme also provides higher accuracy in a single learning session, with improvements of 3.35% un-shuffled, 2.813% shuffled in the first epoch, and 0.521% over the 5 epochs of the model. Of these, the first epoch is the more relevant, as it is the epoch after initialisation. The proposed method is also repeatable and deterministic, which is a desirable quality for safety critical applications within image classification. It is also robust to He initialisation values, scoring 97.55% accuracy compared to 96.929% accuracy with Glorot/Xavier in the traditional random forms, with which the benchmark model was originally optimised.
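The idea of a repeatable, striped initialisation can be illustrated with a minimal sketch. The abstract does not give the paper's actual weight patterns, so everything below is an assumption made for illustration: the function names, the per-filter amplitude scaling, and the alternation between vertical and horizontal stripes are hypothetical. Only the Glorot/Xavier uniform bound, sqrt(6 / (fan_in + fan_out)), is the standard formula; a He-style bound would use sqrt(6 / fan_in) instead.

```python
import math


def glorot_uniform_limit(fan_in, fan_out):
    """Bound of the Glorot/Xavier uniform range: sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def linear_ramp(n, limit):
    """Return n evenly spaced values from -limit to +limit, with no randomness."""
    if n == 1:
        return [0.0]
    step = 2.0 * limit / (n - 1)
    return [-limit + i * step for i in range(n)]


def striped_kernels(height, width, in_channels, out_channels):
    """Hypothetical striped initialiser, indexed kernels[out][in][row][col].

    Assumptions (not taken from the paper): each filter gets its own
    amplitude so that filters differ, and even-numbered filters use
    vertical stripes while odd-numbered filters use horizontal stripes.
    """
    fan_in = height * width * in_channels
    fan_out = height * width * out_channels
    limit = glorot_uniform_limit(fan_in, fan_out)

    kernels = []
    for o in range(out_channels):
        amplitude = limit * (o + 1) / out_channels
        if o % 2 == 0:
            # Vertical stripes: the ramp runs across columns, rows are identical.
            ramp = linear_ramp(width, amplitude)
            kernel = [list(ramp) for _ in range(height)]
        else:
            # Horizontal stripes: the ramp runs down rows, each row is constant.
            ramp = linear_ramp(height, amplitude)
            kernel = [[ramp[r]] * width for r in range(height)]
        # Reuse the same pattern for every input channel (a simplification).
        kernels.append([[row[:] for row in kernel] for _ in range(in_channels)])
    return kernels


if __name__ == "__main__":
    for row in striped_kernels(3, 3, 1, 4)[0][0]:
        print(row)
```

Because no random number generator is involved, calling `striped_kernels` twice with the same shape returns identical weights, which is the repeatability property the abstract highlights.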

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Rudd-Orthner, R. https://orcid.org/0000-0002-2534-0920; Mihaylova, L. https://orcid.org/0000-0001-5856-2223
Copyright, Publisher and Additional Information:	© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy.
Keywords:	Repeatable; Weight Initialization; Information Assurance; Convolutional Layers
Dates:	Published (online): 14 June 2021; Published: 14 June 2021
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Automatic Control and Systems Engineering (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	12 Jul 2021 13:39
Last Modified:	21 Jun 2022 11:17
Status:	Published
Publisher:	IEEE
Refereed:	Yes
Identification Number:	10.1109/DeSE51703.2020.9450242
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:176089
