

This is a repository copy of Dynamical characteristics of a nano-ionic solid electrolyte FET using an LSTM model.

White Rose Research Online URL for this paper: <u>https://eprints.whiterose.ac.uk/221346/</u>

Version: Accepted Version

## **Proceedings Paper:**

Gaurav, A., Song, X., Manhas, S.K. et al. (1 more author) (2025) Dynamical characteristics of a nano-ionic solid electrolyte FET using an LSTM model. In: 2024 IEEE Nanotechnology Materials and Devices Conference (NMDC). 19th IEEE Nanotechnology Materials and Devices Conference (IEEE NMDC 2024), 21-24 Oct 2024, Salt Lake City, Utah, USA. Institute of Electrical and Electronics Engineers (IEEE) , pp. 45-48. ISBN 979-8-3315-4144-6

https://doi.org/10.1109/NMDC58214.2024.10894584

© 2025 The Author(s). The Authors. Except as otherwise noted, this author-accepted version of a conference paper published in 2024 IEEE Nanotechnology Materials and Devices Conference (NMDC) is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/

## Reuse

This article is distributed under the terms of the Creative Commons Attribution (CC BY) licence. This licence allows you to distribute, remix, tweak, and build upon the work, even commercially, as long as you credit the authors for the original work. More information and the full terms of the licence here: https://creativecommons.org/licenses/

## Takedown

If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing eprints@whiterose.ac.uk including the URL of the record and the reason for the withdrawal request.



# Dynamical Characteristics of a Nano-Ionic Solid Electrolyte FET Using an LSTM model\*

Ankit Gaurav, Xiaoyao Song, Sanjeev Kumar Manhas, Maria Merlyne De Souza

*Abstract*— The complexity of multi-state devices (*e.g.*, memristors and ferroelectric RAMs (FERAMs)) hinders the creation of a unified physics-based model. Data-driven approaches, such as machine learning (ML), are increasingly favored to address this challenge. In this study, we demonstrate the dynamic modelling of a synaptic ZnO/Ta<sub>2</sub>O<sub>5</sub> Solid Electrolyte-FET by transforming its characteristics into a multivariate time-series problem, on the basis of which a long short-term memory (LSTM) model of the device is constructed. Our method can also be applied to other multi-state devices to shorten the development time of Neuromorphic Computing Systems.

#### I. INTRODUCTION

Emerging semiconductor devices necessitate compact models for device-circuit co-design [1]. While most models stem from well-established laws such as the current continuity and Poisson equations, the underlying physics of some devices can lie outside the realm of these equations and is therefore not immediately clear. This complexity poses challenges in creating physics-based models of novel devices, especially those with multiple states such as memristors [2], FERAMs [3], and antiferroelectric FETs [4]. These devices, whether volatile or not, display distinctive features including hysteresis, plasticity, negative capacitance, stochasticity, and nonlinear responses, which add to the complexity of their models. On the other hand, these characteristics facilitate a broad spectrum of neuromorphic computing applications such as vision [5], [6], speech recognition [7], [8] and forecasting [9]. Neuromorphic computing emulates the architecture and functionality of biological neurons by prioritizing in-memory computation, energy efficiency, and parallel processing to execute complex tasks with low power. While the materials, device designs, and fabrication methods of multi-state devices are still being explored, it is crucial to be able to quickly examine various possibilities during the design technology co-optimization (DTCO) process [10]. To resolve this problem, researchers have in recent years proposed generalized compact machine learning models of memristor devices based on multilayer perceptron (MLP) neural networks [11], [12].



Figure 1. A typical multilayer perceptron neural network architecture used in the modelling of memristive devices.



Figure 2. A schematic representation of an LSTM cell. It consists of a cell state (memory), a hidden state (output), and three gates (forget, input, and output) that control information flow.
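For reference, the gates named in Figure 2 are commonly written as the update equations below. This is the standard textbook LSTM formulation, given here as an assumption, since the paper does not state which cell variant is used. The notation is introduced only for this sketch: $x_t$ is the input vector at time step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ the element-wise product, and $W$, $U$, $b$ the trainable weights and biases of each gate.

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
$$

Because $c_t$ and $h_t$ carry information forward from earlier time steps, the same input can map to different outputs depending on history, which is the property needed to represent hysteresis.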

However, in memristors, a single voltage can result in different current values corresponding to low and high resistance states, necessitating a separate model for each switching state or requiring state information as input to make an accurate prediction. Figure 1 shows a typical MLP-based model used in the modelling of memristor devices. In contrast to MLP-based models, compact long short-term memory (LSTM) models [13] of two-terminal filamentary memristive devices eliminate the need for separate representations of the two switching states, because the LSTM output depends solely on previous input and output states. An LSTM network comprises a series of LSTM cells, each equipped with input, output, and forget gates. These gates regulate the flow of information within the cell, enabling LSTMs to capture long-term dependencies in data, as shown in Figure 2. In this work, we demonstrate a compact model to predict the dynamic characteristics of a three-terminal non-filamentary nano-ionic solid electrolyte FET (SE-FET) [14] using an LSTM neural network. We previously used a simple drift-diffusion model coupled with the Poisson equation to explain the origins of negative capacitance in this device [15].

<sup>\*</sup>Invited paper. This work was supported by UKIERI and MHRD-SPARC under grant code No. P436 between University of Sheffield and IIT Roorkee. Maria Merlyne De Souza also acknowledges funds from the EPSRC: APRIL Hub <u>https://www.april.ac.uk/</u> and R/175631-11-1.

Ankit Gaurav and Sanjeev Kumar Manhas are with the Department of Electronics and Communication, IIT Roorkee, Roorkee, 247667, India (e-mail: sanjeev.manhas@ece.iitr.ac.in).

Xiaoyao Song and Maria Merlyne De Souza are with the Department of Electronic and Electrical, University of Sheffield, Sheffield, S3 7HQ, U.K. (e-mail: <u>m.desouza@sheffield.ac.uk</u>).



Figure 3. A generalised framework and process flow for modelling a multi-state device using an LSTM neural network (exemplified with a SE-FET).

However, this approach is unable to reproduce the dynamic characteristics of the device, whose behavior is governed by the ionic electrochemical response of the defects in the insulator. Reproducing them requires a special gate current model, which was implemented in [16] but is not easily amenable to scaling.

We model the device dynamics by transforming them into a multivariate time-series problem which incorporates all terminal features of the device: $I_{DS}$, $V_{GS}$ and $V_{DS}$. The generalized framework and process flow for modelling using an LSTM neural network is shown in Figure 3.
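One way to picture the transformation into a multivariate time series is as a sliding window over the three terminal traces. The sketch below is illustrative only: the function name `make_windows`, the window length, the one-step-ahead target, and the toy values are all assumptions, since the paper does not specify how samples are framed.

```python
def make_windows(i_ds, v_gs, v_ds, window_len):
    """Turn three aligned terminal traces into (inputs, targets) pairs.

    Each input is a window of `window_len` consecutive time steps, where every
    step is the feature triple (I_DS, V_GS, V_DS). The target is I_DS at the
    step immediately after the window (one-step-ahead prediction, assumed).
    """
    if not (len(i_ds) == len(v_gs) == len(v_ds)):
        raise ValueError("all traces must have the same length")
    steps = list(zip(i_ds, v_gs, v_ds))
    inputs, targets = [], []
    for start in range(len(steps) - window_len):
        inputs.append(steps[start:start + window_len])
        targets.append(i_ds[start + window_len])
    return inputs, targets


# Toy traces with made-up values, only to show the shapes involved.
i_ds = [0.0, 0.1, 0.3, 0.2, 0.4, 0.5]
v_gs = [0.0, 3.0, 3.0, 0.0, 3.0, 3.0]
v_ds = [-1.0] * 6
X, y = make_windows(i_ds, v_gs, v_ds, window_len=2)
print(len(X), len(X[0]), len(X[0][0]))  # samples, time steps, features
```

Framing the data this way lets further terminal quantities be added simply by widening the feature tuple.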

#### II. METHODOLOGY

#### A. Experimental Fabrication and Device Mechanism

Our bottom-gated TFTs were fabricated on glass using an ITO gate, a 275 nm Ta<sub>2</sub>O<sub>5</sub> gate insulator, and a 40 nm ZnO channel deposited via RF sputtering. The key feature of the device mechanism is a distinct redox reaction in the insulator, which is captured in the MATLAB Simulink model reported in [16]. When a gate voltage is applied, positively charged vacancies accumulate at the channel end of the insulator, boosting the current during the reverse sweep. This accumulation results in an additional electrolytic capacitance, which becomes negative during a rapid collapse of the internal electric field in the device during the reverse sweep of the gate voltage. Importantly, this process enables steep switching without relying on any filamentary behavior [15].

#### B. Experimental Measurement

We continuously monitored the dynamical response of the device when exposed to 4-bit input sequences at $V_{DS}$ of -1 V and -1.5 V, with a reset pulse of -3 V after each input sequence to restore the device to its initial state before the next input sequence. For input '1', $V_{GS} = 3$ V was applied, while for input '0', no pulse was applied (0 V). The resulting change in $I_{DS}$ was measured with a time step of 0.5 seconds.
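The stimulus described above can be sketched as a sampled gate-voltage waveform. The voltage levels and the 0.5 s time step come from the text. The number of samples per bit, the reset duration, the assumption that the reset pulse is applied to the gate, and the enumeration of all sixteen patterns are hypothetical choices made only for illustration.

```python
from itertools import product

V_ON = 3.0      # gate voltage for input '1' (from the text)
V_OFF = 0.0     # gate voltage for input '0' (from the text)
V_RESET = -3.0  # reset pulse after each sequence (from the text)
DT = 0.5        # measurement time step in seconds (from the text)


def gate_waveform(bits, samples_per_bit=2, reset_samples=2):
    """Return (time_s, v_gs) samples for one input sequence plus its reset.

    samples_per_bit and reset_samples are hypothetical; the text does not
    give pulse widths.
    """
    levels = []
    for bit in bits:
        levels += [V_ON if bit == "1" else V_OFF] * samples_per_bit
    levels += [V_RESET] * reset_samples
    return [(k * DT, v) for k, v in enumerate(levels)]


# All sixteen 4-bit patterns, "0000" through "1111" (assumed input set).
patterns = ["".join(p) for p in product("01", repeat=4)]
wave = gate_waveform("1010")
for t, v in wave:
    print(f"t = {t:3.1f} s, V_GS = {v:+.1f} V")
```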

#### C. Machine Learning

To model the behavior of the SE-FET, we employed an LSTM network with two hidden layers containing 32 and 16 LSTM cells, respectively, implemented using the Keras library [17]. The output layer consists of a single neuron, which predicts $I_{DS}$. The input layer comprises 3 neurons, representing $I_{DS}$, $V_{GS}$, and $V_{DS}$, fed sequentially. The performance of the LSTM during training is evaluated by the mean squared error (MSE) loss function, defined as:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \tag{1}$$

where *n* is the number of samples, $Y_i$ is the actual value, and $\hat{Y}_i$ is the value predicted by the LSTM for the *i*<sup>th</sup> sample. To optimize the LSTM weights during training, the Adaptive Moment Estimation (Adam) optimizer was employed. The entire dataset was split into 75% for training and 25% for testing/validation. Before training, the data were normalized to enhance training efficiency and improve prediction accuracy. Training and testing of the LSTM model were carried out in Python using the Keras neural network library, a component of the TensorFlow framework [17].
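The training setup described in this subsection might be sketched as follows. The layer sizes, single output neuron, three input features, MSE loss, Adam optimizer, and 75/25 split are taken from the text. Everything else is an assumption: min-max scaling as the normalization, an unshuffled chronological split, the window length, and all function names are hypothetical and not the authors' code.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; returns (scaled, lo, hi).

    Min-max scaling is assumed; the text only says the data were normalized.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def chronological_split(samples, train_fraction=0.75):
    """Split without shuffling: leading 75% for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def mse(actual, predicted):
    """Mean squared error as defined in Eq. (1)."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n


def build_model(window_len, n_features=3):
    """Two stacked LSTM layers (32 and 16 cells) and a single output neuron."""
    # Imported lazily so the helpers above can be used without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(window_len, n_features)),
        keras.layers.LSTM(32, return_sequences=True),  # pass full sequence on
        keras.layers.LSTM(16),                         # keep last hidden state
        keras.layers.Dense(1),                         # predicted I_DS
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


scaled, lo, hi = min_max_normalize([2.0, 4.0, 6.0])
train, test = chronological_split(list(range(8)))
print(scaled, train, test, mse([1.0, 2.0], [1.0, 4.0]))
```

Setting `return_sequences=True` on the first layer is what allows the second LSTM layer to receive one vector per time step rather than only the final state.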

#### III. RESULTS AND DISCUSSION

Figure 4 shows the training results of the LSTM model (red line) compared with experiment (grey line) for 4-bit applied binary sequences, for which a low training error (MSE) of 3.51E-04 is achieved.



Figure 4. Output of the LSTM model during training, resulting in an MSE of 3.51E-04.



Figure 5. Training and validation loss vs number of Epochs.



Figure 6. Testing of the LSTM model on input sequences not used in training, resulting in an MSE of 3.72E-04.

Figure 5 shows the training and validation loss both decreasing with each epoch, indicating that the model is not overfitting. Figure 6 shows a low test error (MSE) of 3.72E-04 for input sequences not used in training. Models based on physics are extensively tailored and optimized for specific memristor devices, which restricts their applicability to other types of memristors [18]. For example, several models based on physical principles precisely simulate the electrical characteristics of such devices. Typically, these models consist of two elements: 1) the switching model, which describes how the resistance state changes, and 2) the conduction model, which determines the current flow in response to an applied voltage. The switching is explained by theories of filament growth and rupture [19], ion migration and redox reaction [14], [20]. Empirical enhancements such as the window function [21] (the voltage range within which the switching occurs) and threshold theories [22] (such as the Voltage Threshold Adaptive Memristor model, which describes the behavior of these

TABLE I

COMPARISON OF DIFFERENT PHYSICS-BASED MEMRISTOR MODELS

| Model                                   | Description                                                    | Accuracy / Computational Efficiency       |
|-----------------------------------------|----------------------------------------------------------------|-------------------------------------------|
| Linear Drift Model [23]                 | Based on linear drift of mobile ions.                          | Lowest accuracy / High                    |
| Nonlinear Drift Model [24]              | Extends linear drift model with nonlinear drift of mobile ions. | Low accuracy / Moderate                   |
| Simmons Tunnel Barrier Model [25]       | Considers tunnelling effects at memristor layer interfaces.    | Accurate for tunnelling effects / Moderate |
| Threshold Adaptive Memristor Model [26] | Includes adaptive thresholds for dynamic behavior.             | Good accuracy / Moderate                  |

TABLE II

COMPARISON OF DIFFERENT MACHINE LEARNING BASED MEMRISTOR MODELS (GENERALIZED APPROACHES)

| Model                  | Description                                                                   | Input                                            | Accuracy (RMSE) |
|------------------------|-------------------------------------------------------------------------------|--------------------------------------------------|-----------------|
| MLP models [11]        | Duplicating the data by adding Gaussian noise to improve the performance      | Voltage, state, and device parameters as inputs  | 0.001           |
| MLP models [12]        | Decoupling switching and conducting behaviors to model them individually      | Voltage and current sequence                     | 0.03            |
| LSTM model [13]        | Current-voltage (I-V) characteristics transformed into a time-series problem  | Voltage / current sequence                       | 0.002           |
| LSTM model [This work] | Dynamic modelling by transforming into a multivariate time-series problem     | $I_{DS}$, $V_{GS}$, and $V_{DS}$                 | 0.019           |

devices under varying voltage conditions) aim to refine model precision. Conduction is explained by theories such as Poole–Frenkel emission [27] (a mechanism of trap-assisted electron transport in an electrical insulator), Schottky emission [28] (a phenomenon where an electric field reduces the energy barrier for electrons to be emitted from a material surface), and space charge limited current [29] (which occurs when there is an excess of charge carriers in a poorly conducting material).

An alternative to physics-based modelling is to gather experimental measurements and create a look-up table (LUT), similar to LUT-based MOSFET models [30]. This method retrieves the device's electrical behavior through interpolation or extrapolation of the collected data. LUT-based models rely less on the physical principles of the device and do not require complex model equations. However, unlike in a MOSFET, the output current in a memristor is influenced by the voltage applied in the past. Accessing the LUT of the output current would necessitate storing the current for all previous voltages, which is infeasible due to the excessive storage and computational demand. In contrast, the machine learning based approach can be easily generalized to predict the characteristics of any device on which it has been trained, as it is driven by data rather than the underlying physics of the device. Table I and Table II summarize the performance of these approaches.
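Table II reports accuracy as RMSE, whereas Section III reports MSE. As a quick consistency check, the square roots of the reported MSE values can be compared with the 0.019 listed for this work. This assumes the table entry is derived from those MSE values, which the paper does not state explicitly, and it says nothing about whether the errors of the different works are measured on comparable scales.

```python
import math

mse_train = 3.51e-4  # training MSE reported in Section III
mse_test = 3.72e-4   # test MSE reported in Section III

# RMSE is the square root of MSE.
rmse_train = math.sqrt(mse_train)
rmse_test = math.sqrt(mse_test)
print(f"train RMSE = {rmse_train:.4f}, test RMSE = {rmse_test:.4f}")
```

Both values round to 0.019 at three decimal places, in agreement with the entry in Table II.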

## IV. CONCLUSION

We review approaches to model unusual device physics and demonstrate the dynamical modeling of multi-state memory from minimal experimental data by employing an LSTM neural architecture. In contrast, the MLP-based approach is not only data hungry but also requires separate models for switching and conduction to make it efficient. Moreover, by converting the device modelling problem into a multivariate time-series problem, we can add as many device parameters or characteristics to the input or output as desired.

#### REFERENCES

- [1] J.-C. Liu, T.-Y. Wu, and T.-H. Hou, "Optimizing Incremental Step Pulse Programming for RRAM Through Device–Circuit Co-Design," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 65, no. 5, pp. 617–621, May 2018.
- [2] L. Chua, "Memristor-The missing circuit element," *IEEE Trans. Circuit Theory*, vol. 18, no. 5, pp. 507–519, 1971.
- [3] M. Lederer *et al.*, "Ferroelectric Field Effect Transistors as a Synapse for Neuromorphic Application," *IEEE Trans. Electron Devices*, vol. 68, no. 5, pp. 2295–2300, 2021.
- [4] Z. Li *et al.*, "A 3D Vertical-Channel Ferroelectric/Anti-Ferroelectric FET With Indium Oxide," *IEEE Electron Device Lett.*, vol. 43, no. 8, pp. 1227–1230, Aug. 2022.
- [5] T. Soliman *et al.*, "First demonstration of in-memory computing crossbar using multi-level Cell FeFET," *Nat. Commun.*, vol. 14, no. 1, p. 6348, Oct. 2023.
- [6] A. Gaurav *et al.*, "Reservoir Computing for Temporal Data Classification Using a Dynamic Solid Electrolyte ZnO Thin Film Transistor," *Front. Electron.*, vol. 3, no. April, pp. 1–9, Apr. 2022.
- [7] Y. Zhong, J. Tang, X. Li, B. Gao, H. Qian, and H. Wu, "Dynamic memristor-based reservoir computing for high-efficiency temporal signal processing," *Nat. Commun.*, vol. 12, no. 1, pp. 1–9, 2021.
- [8] A. Gaurav, X. Song, S. K. Manhas, P. P. Roy, and M. M. De Souza, "A Solid Electrolyte ZnO Thin Film Transistor for classification of spoken digits using Reservoir Computing," *7th IEEE Electron Devices Technol. Manuf.*, pp. 9–11, 2023.
- [9] J. Moon *et al.*, "Temporal data classification and forecasting using a memristor-based reservoir computing system," *Nat. Electron.*, vol. 2, no. 10, pp. 480–487, 2019.
- [10] J. Wang, Y. H. Kim, J. Ryu, C. Jeong, W. Choi, and D. Kim, "Artificial neural network-based compact modeling methodology for advanced transistors," *IEEE Trans. Electron Devices*, vol. 68, no. 3, pp. 1318–1325, 2021.
- [11] J. Hutchins *et al.*, "A Generalized Workflow for Creating Machine Learning-Powered Compact Models for Multi-State Devices," *IEEE Access*, vol. 10, no. October, pp. 115513–115519, 2022.
- [12] Y. Zhang, G. He, K. T. Tang, Y. Li, and G. Wang, "GEM: A Generalized Memristor Device Modeling Framework Based on Neural Network for Transient Circuit Simulation," *IEEE Trans. Comput. Des. Integr. Circuits Syst.*, vol. 42, no. 3, pp. 834–846, 2023.
- [13] A. S. Lin *et al.*, "A Process-Aware Memory Compact-Device Model Using Long-Short Term Memory," *IEEE Access*, vol. 9, pp. 3126–3139, 2021.
- [14] P. B. Pillai and M. M. De Souza, "Nanoionics-based three-terminal synaptic device using zinc oxide," *ACS Appl. Mater. Interfaces*, vol. 9, no. 2, pp. 1609–1618, 2017.

- [15] A. Kumar, P. Balakrishna Pillai, X. Song, and M. M. De Souza, "Negative Capacitance beyond Ferroelectric Switches," *ACS Appl. Mater. Interfaces*, vol. 10, no. 23, pp. 19812–19819, 2018.
- [16] X. Song, A. Kumar, and M. M. De Souza, "A Ta2O5/ZnO Synaptic SE-FET for supervised learning in a crossbar," in 2021 5th IEEE Electron Devices Technology & Manufacturing Conference (EDTM), IEEE, Apr. 2021, pp. 1–3.
- [17] Hochreiter, "Long Short-Term Memory layer." [Online]. Available: https://www.tensorflow.org/api\_docs/python/tf/keras/layers/LS
- [18] L. Gao, Q. Ren, J. Sun, S. T. Han, and Y. Zhou, "Memristor modeling: Challenges in theories, simulations, and device variability," *J. Mater. Chem. C*, vol. 9, no. 47, pp. 16859–16884, 2021.
- [19] P.-Y. Chen and S. Yu, "Compact Modeling of RRAM Devices and Its Applications in 1T1R and 1S1R Array Design," *IEEE Trans. Electron Devices*, vol. 62, no. 12, pp. 4022–4028, Dec. 2015.
- [20] K. Moon *et al.*, "Bidirectional Non-Filamentary RRAM as an Analog Neuromorphic Synapse, Part I: Al/Mo/Pr<sub>0.7</sub>Ca<sub>0.3</sub>MnO<sub>3</sub> Material Improvements and Device Measurements," *IEEE J. Electron Devices Soc.*, vol. 6, pp. 146–155, 2018.
- [21] I. Messaris, A. Serb, S. Stathopoulos, A. Khiat, S. Nikolaidis, and T. Prodromakis, "A Data-Driven Verilog-A ReRAM Model," *IEEE Trans. Comput. Des. Integr. Circuits Syst.*, vol. 37, no. 12, pp. 3151–3162, Dec. 2018.
- [22] S. Kvatinsky, M. Ramadan, E. G. Friedman, and A. Kolodny, "VTEAM: A General Model for Voltage-Controlled Memristors," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 62, no. 8, pp. 786–790, Aug. 2015.
- [23] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," *Nature*, vol. 453, no. 7191, pp. 80–83, May 2008.
- [24] H. Kim, M. P. Sah, C. Yang, and L. O. Chua, "Memristor-based multilevel memory," in 2010 12th International Workshop on Cellular Nanoscale Networks and their Applications (CNNA 2010), IEEE, Feb. 2010, pp. 1–6.
- [25] M. D. Pickett *et al.*, "Switching dynamics in titanium dioxide memristive devices," *J. Appl. Phys.*, vol. 106, no. 7, Oct. 2009.
- [26] S. Kvatinsky, E. G. Friedman, A. Kolodny, and U. C. Weiser, "TEAM: ThrEshold Adaptive Memristor Model," *IEEE Trans. Circuits Syst. I Regul. Pap.*, vol. 60, no. 1, pp. 211–221, Jan. 2013.
- [27] Y.-M. Kim and J.-S. Lee, "Reproducible resistance switching characteristics of hafnium oxide-based nonvolatile memory devices," J. Appl. Phys., vol. 104, no. 11, Dec. 2008.
- [28] C.-Y. Lin, S.-Y. Wang, D.-Y. Lee, and T.-Y. Tseng, "Electrical Properties and Fatigue Behaviors of ZrO<sub>2</sub> Resistive Switching Thin Films," *J. Electrochem. Soc.*, vol. 155, no. 8, p. H615, 2008.
- [29] Q. Liu, W. Guan, S. Long, R. Jia, M. Liu, and J. Chen, "Resistive switching memory effect of ZrO2 films with Zr+ implanted," *Appl. Phys. Lett.*, vol. 92, no. 1, Jan. 2008.
- [30] K. Xia, "A Simple Method to Create Corners for the Lookup Table-Based MOSFET Models Through Inputs and Outputs Mapping," *IEEE Trans. Electron Devices*, vol. 68, no. 4, pp. 1432–1438, Apr. 2021.