Zhao, Y., Zhao, H. and Yang, S. orcid.org/0000-0003-0531-2903 (2025) Accelerate On-Chip Artificial Neural Network Training: a Customised fNIRS Signals Processing System-on-Chip. In: Proceedings of 38th IEEE International System-on-Chip Conference. 38th IEEE International System-on-Chip Conference (SOCC), 29 Sep - 01 Oct 2025, Dubai, United Arab Emirates. IEEE. ISBN: 979-8-3315-9478-7. ISSN: 2164-1706. EISSN: 2164-1706.
Abstract
During artificial neural network (ANN) training, batch normalisation (BN) techniques improve training performance algorithmically but create hardware implementation challenges. This paper addresses this specific bottleneck within the broader challenge of on-chip training. We propose an all-on-chip ANN training framework that uses AMD’s Versal XCVC1902 heterogeneous SoC with a purpose-built BN accelerator to overcome the two main barriers to edge computing: memory bandwidth and BN layer latency. By hardwiring the BN calculation as a hardware primitive and using fixed-point arithmetic, the design reduces BN latency from tens of cycles to nine cycles, improving the efficiency of the ANN training process. The hardware-software co-design strategy demonstrates how deep processing unit cores, a BN engine built in FPGA fabric, and CPUs can be orchestrated through a unifying abstraction, pointing toward application-specific SoCs that hide low-level scheduling from machine learning developers. The framework is evaluated on real-time functional near-infrared spectroscopy (fNIRS) signal reconstruction, where cloud off-loading is undesirable for privacy and latency reasons. Although evaluated on fNIRS, the tiled BN engine and zero-copy dataflow can be reused in any ANN training workload that spends a significant fraction of time in BN layers.
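As a rough illustration of what a fixed-point batch normalisation forward pass computes, the sketch below normalises one batch of scalar activations using integer arithmetic only. It is not the paper's hardware design: the Q16.16 number format, the function names, and the use of an integer square root are all assumptions made for this example, and cycle-level latency is not modelled.

```python
import math

# Assumed Q16.16 fixed-point format (hypothetical; the paper's format is not stated here).
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Convert a float to a Q16.16 integer."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q16.16 integer back to a float."""
    return q / ONE


def fx_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values; the arithmetic shift rescales the product."""
    return (a * b) >> FRAC_BITS


def batch_norm_fixed(xs_q, gamma_q=ONE, beta_q=0, eps_q=1):
    """Batch normalisation over one batch of Q16.16 values.

    Computes y = gamma * (x - mean) / sqrt(var + eps) + beta
    using integer operations only.
    """
    n = len(xs_q)
    mean_q = sum(xs_q) // n
    diffs = [x - mean_q for x in xs_q]
    var_q = sum(fx_mul(d, d) for d in diffs) // n
    # sqrt of a Q16.16 value: shift left by FRAC_BITS, then integer sqrt.
    std_q = math.isqrt((var_q + eps_q) << FRAC_BITS)
    # Reciprocal computed once per batch, then reused as a multiplier.
    inv_std_q = (ONE << FRAC_BITS) // std_q
    return [fx_mul(fx_mul(d, inv_std_q), gamma_q) + beta_q for d in diffs]


batch = [1.0, 2.0, 3.0, 4.0]
normalised = [to_float(y) for y in batch_norm_fixed([to_fixed(x) for x in batch])]
```

Computing the reciprocal of the standard deviation once per batch turns the per-element division into a multiplication, which is the kind of simplification that makes the operation easier to map onto fixed hardware.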
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Zhao, Y., Zhao, H., Yang, S. |
| Copyright, Publisher and Additional Information: | This is an author produced version of a conference paper published in Proceedings of 38th IEEE International System-on-Chip Conference, made available under the terms of the Creative Commons Attribution License (CC-BY), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
| Keywords: | fNIRS, Edge Acceleration, Hardware/Software Co-design, Optimisation, System-on-chip |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Mechanical Engineering (Leeds) > Institute of Medical and Biological Engineering (iMBE) (Leeds) |
| Funding Information: | Funder: Napier University Edinburgh; Grant number: R2183 |
| Date Deposited: | 24 Jul 2025 10:59 |
| Last Modified: | 20 Apr 2026 12:04 |
| Status: | Published |
| Publisher: | IEEE |
| Identification Number: | 10.1109/SOCC66126.2025.11235415 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:229435 |
Download
Filename: IEEESOCC2025.pdf
Licence: CC-BY 4.0