A new quasilinear saturation rule for tokamak turbulence with application to the isotope scaling of transport

A new quasilinear saturation model SAT3 has been developed for the purpose of calculating radial turbulent fluxes in the core of tokamak plasmas. The new model is shown to be able to better recreate the isotope mass dependence of nonlinear gyrokinetic fluxes compared to contemporary quasilinear models, including SAT2 (Staebler et al 2021 Nucl. Fusion 61 116007), while performing at least as well in other key equilibrium parameters. By first quantifying the isotope scaling of gyrokinetic flux spectra, it is shown that the deviation from the gyroBohm scaling of fluxes originates primarily in the magnitude of the saturated potentials. Using this result SAT3 was formulated using observations made from gyrokinetic data, including a novel and robust relation between the 1D potential spectrum and the radial spectral widths. This serves to define the underlying functional forms of SAT3 before then connecting to the linear dynamics, including a difference in saturation level between ITG- and TEM-dominated turbulence, with the resulting free parameters having been fit to a database of high-resolution nonlinear CGYRO simulations. Additional features outside of the database are included, including E × B shear and multi-ion plasma capability. The methodology used in the development of SAT3 represents an algorithm which can be used in the improvement and generation of future saturation models.


Introduction
In the field of tokamak transport it is well established that the isotope scaling of the global energy confinement time τ E is not commensurate with that suggested by simple theory [2][3][4][5], * Author to whom any correspondence should be addressed.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. which assumes that local turbulent fluxes Q i follow gyroBohm scaling, for which Q i ∝ √ A for ion mass number A, and that there exists a simple translation between the local fluxes in the device and the global confinement properties, τ E ∝ 1/Q i ∝ A −0.5 . Experimental scaling laws derived from observations made across many tokamaks however consistently observe positive scalings, τ E ∝ A p , p > 0 [6,7]. While this result bodes well for the success of future tokamak devices in their shift from pure deuterium (D) plasmas to the operational 50/50 mix of deuterium-tritium (DT), the mechanisms responsible for this 'isotope effect' remain incompletely understood. Note that the isotope scaling reversal present in NL CGYRO (shaded region) is not recreated in any of the quasilinear models. Subfigure (a) reprinted from [13], with the permission of AIP Publishing.
To investigate the gyroBohm scaling assumption, nonlinear gyrokinetic codes [8][9][10] can be used to accurately simulate how local turbulent fluxes depend on isotope mass. Previous studies have shown that in sufficiently simple cases, namely those dominated by ITG turbulence with a single ion species and adiabatic electrons, the gyroBohm scaling of fluxes is produced [11,12]. The inclusion of more sophisticated physics in these simulations breaks this scaling, as has been demonstrated for the inclusion of kinetic electrons [13], electromagnetic effects [14], collisions [15,16], E × B shear [17] and fast particles [18]. Numerous studies have also shown that not only can local fluxes deviate from the gyroBohm prediction, but can follow the opposite trend, the so-called anti-gyroBohm scaling, in which fluxes scale inversely with isotope mass [13,19]. The plurality of mechanisms involved in this gyroBohmbreaking effect makes even a partial explanation of the isotope effect on a local level challenging.
Nonlinear gyrokinetics represents the most accurate plasma turbulence modelling paradigm available, having been extensively validated against experiment [20][21][22][23][24][25][26][27]. However, its great computational expense renders it impractical for use in integrated modelling simulations, in which one is required to simulate many flux surfaces over confinement timescales. Quasilinear models such as TGLF [28] and QuaLiKiz [29] are instead used for this purpose. These models bypass the expense of calculating the fluxes in the nonlinear gyrokinetic system by instead solving for the linear response of the plasma instabilities, which is then combined with an estimation of the magnitude of the saturated potentials via a saturation rule to provide a calculation of the fluxes in a greatly reduced time. A standard and relatively simple estimation of these potential magnitudes comes from the mixing length rule [29], which models the turbulent transport as a diffusive process, with a step size of a characteristic wavenumber of the linear instability and a time step of the inverse of the growth rate. This estimation is employed in QuaLiKiz [30], however saturation rules can be based on other turbulence saturation mechanisms, such as TGLF SAT1 [31] and SAT2's [1] paradigm of zonal mixing. Quasilinear models have been extensively validated against nonlinear gyrokinetic codes for deuterium plasmas [1,32] and have successfully modelled deuterium plasma discharges [11,[33][34][35][36]. However, their historically limited considerations of plasmas in other isotopes, coupled with their incomplete description of the isotope effect, cause them to struggle to replicate the behaviour of nonlinear gyrokinetic fluxes with isotope mass, rendering predictive flux-driven modelling efforts unreliable [11,19].
An example of the isotope effect as it appears in local nonlinear gyrokinetics was explicitly demonstrated in a paper by Belli et al [13], focusing on the role of kinetic electrons. It was shown using the gyrokinetic code CGYRO [9] that by increasing the equilibrium density gradient from a GA-standard case [37] baseline, one moves from a regime of ITG-dominated turbulence to TEM-dominated turbulence, accompanied by a reversal in the flux scaling. It was suggested that this anti-gyroBohm scaling may not be captured by reduced turbulence models, in part due to the observation that the mixing length rule did not exhibit this isotope reversal.
To test the adherence between the nonlinear (NL) CGYRO results and the quasilinear model results, the NL CGYRO data from the density gradient scan presented in [13] is shown in figure 1, along with the data from equivalent simulations using TGLF-SAT1, TGLF-SAT2 and QuaLiKiz. Moving from a/L n = 0.0 to a/L n = 3.0, one observes that the anti-gyroBohm scaling seen in the data from NL CGYRO for a/L n > 2.0 is not replicated in the results of any of the three quasilinear models. These all instead exhibit positive isotope scaling across the scan, indicating that the relevant physics to capture this isotope scaling reversal is missing.
The inability of current models to reliably predict this scaling undermines the efforts of integrated modelling in terms of prediction and optimisation of experimental campaigns in plasmas with compositions differing from pure deuterium. With the advent of ITER and the subsequent shift to operations with DT, as well as isotope experiments at the JET tokamak, it is essential that modern quasilinear models more accurately approximate fluxes in different isotopes.
The aim of this work is to develop a new quasilinear model that is able to more reliably predict how turbulent fluxes scale with isotope mass, by capturing the physics contained in local nonlinear gyrokinetic simulations. The paper is organised as follows: section 2 contains a discussion on the theory of turbulent plasma fluxes, and how this informs the structure of quasilinear models. The detail of the gyrokinetic database created for this work is given in section 3. In section 4, contemporary reduced transport models are compared with gyrokinetic simulations, to illustrate in detail how these models fail to capture the isotope dependence of the transport. Section 5 describes the development of the new saturation rule SAT3, with a summary of the model given in section 5.7, and section 6 presents the results from SAT3. A conclusion follows.

Fluxes in gyrokinetics
The following employs local flux-tube geometry [38] with coordinates {x, y, θ} denoting the radial, binormal and parallel-to-field coordinates respectively. A summary of the field-aligned system, geometric conventions and averages used in this paper is given in appendix A. This work focuses primarily on electrostatic turbulence, such that the only nonnegligible component of the turbulent electromagnetic field is the fluctuating electrostatic potential δφ.
Turbulent heat fluxes arise in the gyrokinetic system due to the interaction between δφ and the pressure fluctuations of the present plasma species δ p s (for species s) via the turbulent E × B velocity [39,40]. Once saturated, a statistically steadystate flux can be defined by taking both an ensemble average and a flux-surface average over these fluctuations. Of particular interest is the radial component of this averaged flux, due to its integral role in determining the confinement properties of the plasma. This is given by whereb is the unit vector in the direction of the equilibrium magnetic field, B = |B| is the equilibrium magnetic field magnitude, and . . . Ens,FS denotes an average over the turbulent ensemble and flux-surface respectively. By representing each fluctuating quantity as a Fourier series in x and y, one can show that equation (1) may be written (a step-by-step derivation is given in appendix B): where Z s,k x ,k y = δp s,k x ,k y /δφ k x ,k y . Here the total flux Q s has been decomposed into a sum of discrete flux components per positive binormal wavenumber, Q s,k y = . Circumflexes are used to denote Fourier amplitudes and * signifies a complex conjugate. The averages over θ and t are defined by equations (A.5) and (A.6), and B unit is an effective magnetic field where ψ is the poloidal magnetic flux divided by 2π, q is the safety factor and r denotes a given flux-surface. Equation (2) shows that each flux component Q s,k y can be considered as a sum over k x of averaged terms comprised of two factors: the squared magnitude of the potential fluctuations |δφ k x ,k y | 2 , and k y Im[Z * s,k x ,k y ]/B unit , which contains the phase difference between the pressure and potential fluctuations.

Fluxes in reduced models
Two types of reduced turbulence models are considered in this work: those with 'exact' linear solvers, which compute the linear behaviour of microinstabilities using a local gyrokinetic code, and those with 'fast' linear solvers, which solve a simplified version of the linear gyrokinetic system to approximate the linear response of the plasma in a greatly reduced time. The results of these solvers are then combined with a saturation rule in order to calculate turbulent fluxes.
While the use of exact linear solvers is too computationally expensive for routine use in reduced models required by integrated modelling, it is useful for the purposes of saturation rule development, as the dynamics of the linear microinstability are computed more accurately. This model type has been used previously [41][42][43], including under the name 'QLGYRO' [44], for which linear CGYRO results were combined with TGLF's SAT1 rule. In the interest of generality, in this paper reduced models will be referred to using the taxonomy linear inputssaturation rule , to clearly specify the constituents of each model. Thus CGYRO-SAT1 refers to the QLGYRO paradigm, TGLF-SAT1 refers to SAT1 with the reduced linear inputs of TGLF, and the QuaLiKiz saturation rule (qlk) with the linear inputs of QuaLiKiz is QuaLiKiz-qlk.
The flux calculation of models using an exact linear solver will first be considered. For each k y one can define an average phase, dubbed the 'weight' W s,k y , such that where the relation is a consequence of Parseval's theorem (appendix C). Note the factor of 1 2 in the final expression of the weights due to historical convention. One can define both a 'nonlinear weight' W NL s,k y and a 'linear weight' W L s,k y , named due to their calculation from the nonlinear gyrokinetic system and the linear gyrokinetic system respectively. The flux obtained from nonlinear gyrokinetics may therefore be written as a 1D sum over binormal wavenumber of nonlinear weights and potentials: To express the flux in a form relevant to reduced models, and assuming only the dominant mode is considered in the linear solver, one multiplies and divides by the linear weight: where Λ s,k y = W NL s,k y /W L s,k y is the 'quasilinear approximation (QLA) function'. This defines an explicit measure of the QLA, which assumes that the average phase between the fluctuations in the linear regime is conserved in the nonlinear regime [45]. If this approximation were to be satisfied exactly, one would observe Λ s,k y = 1 for all species at all binormal wavenumbers, which is typically assumed in current quasilinear models.
Finally, for the purposes of reduced modelling, one does not mathematically describe the potentials |δφ k y | 2 x,θ,t , but rather the potentials normalised to the binormal grid spacing Δk y , as this can be shown to be invariant under a change of grid resolution once past the point of convergence (appendix D). This consideration yields which is the final result for the structure of fluxes in quasilinear models with an exact linear solver. Each term in the sum over k y is comprised of four parts, two of which are obtained from linear simulations, and two of which must be prescribed. The two linearly determined quantities are the adopted binormal grid spacing Δk y and the linear weight, W L s,k y . The two quantities in need of prescription are the grid-independent potentials |δφ k y | 2 x,θ,t /Δk y , calculated by the quasilinear saturation rule, and the QLA function, Λ s,k y .
The flux calculation for the fast model type is similar to equation (8), however as they solve a simplified version of the linear gyrokinetic system, some additional sources of error are incurred due to their approximate calculation of the linear response. These errors can enter through the linear weights and through linear quantities needed in the saturation rule, typically the growth rate of the dominant mode.
Note that the discussion and derivations in this section thus far hold completely analogously for the particle flux Γ s and momentum flux Π s , by simply substituting the pressure fluctuations δ p s for the density fluctuations δn s or velocity fluctuations δv s respectively in equation (1).

Simulation database
Normalising quantities in this work are given in SI units. These normalisations are the deuterium mass m D , the electron temperature T e , the electron density n e , the tokamak minor radius a and the effective magnetic field B unit , defined by equation (3). The reference gyroradius is therefore ρ unit = √ m D T e /eB unit , with ρ * = ρ unit /a, and the deuterium sound speed is c s = T e /m D . Other useful normalisations include those of frequency c s /a, particle flux Γ GBD = ρ 2 * n e c s , energy  flux Q GBD = ρ 2 * n e T e c s and the electrostatic potential φ unit = ρ * T e /e = B unit ρ 2 unit c s /a . A database of 43 nonlinear gyrokinetic simulations was generated using CGYRO [9]. The database is primarily centred around the GA-std case, defined by a/L T i = a/L T e = 3.0, a/L n = 1.0, T i /T e = 1.0,ŝ = 1.0, q = 2.0, a/c s ν ee = 0.1 and circular Miller flux-surface geometry with r 0 /R 0 = 1/6, for tokamak major radius R 0 . Definitions of equilibrium geometry quantities correspond to those found in [46]. Kinetic electrons and a single ion species are used for all cases. The three isotopes that have been simulated are H, D and T, with m i /m D values of 0.5, 1.0 and 1.5 respectively. No rotation is included. All simulations are predominantly electrostatic, however include δA fluctuations with a small plasma beta of β e,unit = 0.05% to allow for an increased time-step with negligible effect on the fluxes [9]. Additional parameters considered include the elongation κ and the Shafranov shift Δ = dR 0 /dr, and w is the box size integer which relates the radial and binormal domains, L x = L y w/(2πŝ). Changes in temperature gradient scale lengths were kept constant between the ions and electrons a/L T i = a/L T e . In table 1, the tokamak parameters that differ from the GA-std baseline are shown for the database.
The resolutions used in this work for the nonlinear simulations are N y = 40 binormal modes, N x = 224 radial modes, N θ = 32 parallel grid-points, N ξ = 16 pitch-angle grid-points and N u = 8 energy grid-points. The density of binormal modes is greater than that typically used in studies of similar cases [13], as it was found during convergence tests that this lower resolution can cause fluxes to be under-predicted. The box size integer w has a value of 4 for all simulations except those changing the magnetic shear from its baseline value ofŝ = 1. For these cases, the box length integer was also changed so as to keepŝ/w constant and thus the radial domain length unchanged. All simulations were conducted at the ion scale up to k y ρ i = 1.0, where ρ i = m i /m D ρ unit , to keep the radial and binormal domains constant relative to the main ion gyroradius.
This binormal grid set-up was also used for the quasilinear model simulations of TGLF and QuaLiKiz shown in figure 1. Finite aspect-ratio Miller geometry was used for all simulations other than those of QuaLiKiz, which used large aspectratio circular geometry.
Nonlinear simulations were given sufficient time to saturate, running to at least (c s /a)τ > 1000, where τ is the total simulation time. Time averages of fluctuating data were performed by taking three separate measurements with consecutive time-windows of size 0.28τ between 0.16τ and τ . From these three measurements a mean and standard deviation were calculated for the final result. These windows were chosen to avoid the initial linear regime of the simulations, and to provide a representative standard deviation of the statistical fluctuations. Unless otherwise noted, all error bars shown in this work are the standard deviations obtained from this method.
For each nonlinear case 39 linear simulations were also conducted, corresponding to the 39 non-zero binormal modes present in their respective nonlinear simulations. An altered radial domain was used with N x = 128 and w = 1, due to the difference in grid requirements for convergence between nonlinear and linear runs.
A single nonlinear simulation of the GA-std case in D was repeated with N x = 896 keeping w = 4, for use in figure 7. This extended domain case plays no other part in the work.
Six additional linear simulations were also performed scanning over relative concentrations of D and T in mixed DT plasmas for the GA-std and a/L n = 3.0 cases in section 5.5.2. The values simulated are n D /n e = {0.25, 0.5, 0.75}, with n T /n e = 1 − n D /n e .

Model comparison and isotope scaling diagnosis
In this section, the results from the CGYRO database are compared with those obtained from CGYRO-SAT1, TGLF-SAT1  , the result of applying the α A metric at each k y ρ i is shown for the four models, as well as the value that would be seen if the fluxes followed gyroBohm scaling (α A = 0.5). The error bars shown for all models in (b) are the uncertainties in the fitted parameter α A . and QuaLiKiz-qlk 4 . This is done to observe from where in the nonlinear gyrokinetic data the non-trivial isotope scaling originates, as per the flux breakdown in equation (8), as well as to identify why the current reduced models are not reproducing the NL CGYRO results. As an example case this section will focus on the three a/L n = 3.0 simulations, due to the anti-gyroBohm scaling present in the fluxes.
To analyse the isotope scaling of these models the metric α A is introduced, such that for flux data in H, D and T, one 4 The results of TGLF-SAT2 are included in figure 2 only. The behaviour of SAT2 with isotope mass is very similar to that of SAT1, as demonstrated by the similarities in scaling metric between the fluxes of TGLF-SAT1 and TGLF-SAT2 in figure 2. This is in part due to the lack of consideration of isotopes other than deuterium in their development, as well as the similarity of their saturated potentials' functional form. Beyond figure 2 therefore only the results of SAT1 are shown, with any conclusions regarding the isotope scaling of SAT1 also applicable to SAT2. may fit the data with a function of the form where the values of C A and α A are found via best fit to the data points. An advantage to this metric is that one number α A describes the isotope scaling for three isotopes, whereas previously used metrics [11,17] have been based around taking the difference between two fluxes, and thus at least two numbers have been needed for an approximately equivalent description. The values of α A are also intuitive: if one measures an α A value of α A ≈ 0.5, one knows that the case follows approximate gyroBohm scaling. If one measures α A ≈ 0.0, then the fluxes do not vary with isotope, and for α A < 0.0, the case exhibits anti-gyroBohm scaling. Outside of these limiting cases, one can quantify to what degree the case diverges from gyroBohm scaling: for two hypothetical cases of α A = 0.4 and  α A = 0.1, then both exhibit positive scaling, however the second 'deviates more' from gyroBohm than the first, by a defined quantitative amount. Figure 2 shows the total ion energy fluxes for the GA-std case and the a/L n = 3.0 case against A for the four models, as well as TGLF-SAT2. Fitted to each data set is the result of the metric fit with the measured values of α A displayed in the legend. Positive isotope scaling is seen in the GA-std case (figure 2(a)) for all five models, as one would expect for ITG-dominated turbulence representative of the quasilinear models' training datasets 5 . QuaLiKiz remains close to the gyroBohm result.
For the a/L n = 3.0 case, anti-gyroBohm scaling is observed in the NL CGYRO data, whereas the three fast quasilinear models all continue to exhibit positive isotope scaling, α A > 0. For CGYRO-SAT1 one finds α ≈ 0.0, implying that the use of the linear solver of CGYRO compared with the reduced linear solver of TGLF is having an influence on the isotope scaling. As an exact linear solver type model, the resulting difference in isotope scaling between CGYRO-SAT1 and NL CGYRO can only come from either the QLA function, which is assumed to be constant in SAT1, and/or the functional form of the saturation rule.
To probe this discrepancy further, the decomposition of the total fluxes into their flux components Q s,k y for the a/L n = 3.0 case is shown in figure 3(a). Here the flux components for NL CGYRO are plotted against k y ρ i for the three isotope simulations, such that they have a shared k y -axis. To quantify the isotope scaling of these flux components the isotope scal-ing metric α A can again be used, however now as a function of k y ρ i , to quantify how the scaling of the flux components varies across the spectrum. Hence for each value of k y ρ i , the three flux component data points are fitted using equation (9) with the resulting α A measurements forming an 'α A line', as shown in black in figure 3(b). This exercise is repeated for CGYRO-SAT1, TGLF-SAT1 and QuaLiKiz-qlk. Also shown in figure 3(b) is a reference line at α A = 0.5, corresponding to the expected result if the flux components followed gyroBohm scaling.
For the NL CGYRO α A line, the isotope scaling is not uniform across the spectrum as one may expect, but instead two key features can be observed: an 'offset' from the gyroBohm value, most obviously seen in the region of k y ρ i > 0.3, and a variation in the isotope scaling with k y ρ i between 0.0 < k y ρ i < 0.3. It is also in this low k y region that the flux component magnitudes are largest, and hence contribute most to the total flux and its isotope scaling. Upon consideration of the other cases in the database these features can be shown to be general, with differing offsets and degrees of steepness in the low k y region. Because the larger k y ρ i region's flux contribution is essentially negligible in comparison, figure 3 implies that to accurately capture the isotope scaling of the total flux, one must capture the shape of the flux component spectrum around the peak, as well as the isotope scaling characteristics observed in the low k y region of the NL CGYRO data of figure 3(b).
Considering now the constituents of the flux components relevant to reduced models, as per equation (8), the two quantities in need of prescription are the QLA functions and the potentials. The results of analogous α A fitting exercises are shown in figures 4 and 5 for the ion energy QLA function and the saturated potentials respectively, along with their reference lines expected from gyroBohm scaling arguments. Looking at the α A line of the QLA function in figure 4(b), a small variation of α A ∼ −0.1 is seen in the region of low k y . Reduced models are not shown on this plot as they all assume the QLA function to be exactly 1, and so would be simply aligned with the gyroBohm-predicted result of α A = 0.0.
While there is some non-trivial isotope scaling in the low k y region of the QLA function, this can be seen to be relatively small when compared to the difference between the NL CGYRO result and the gyroBohm-predicted scaling of the saturated potentials, shown in figure 5(b). The crucial variation in isotope scaling in the low k y region observed in the flux components is also seen to originate here. Hence for this case, the deviation from gyroBohm scaling of total fluxes in nonlinear gyrokinetics originates primarily in the saturated potentials, specifically in the region of low k y . Upon consideration of the database more broadly this can be shown to be a general result for the dataset.
Turning to the results of the current quasilinear models in figure 5(b), TGLF-SAT1 and CGYRO-SAT1 both appear to recreate a portion of the isotope scaling variation in the low k y region, indicating a reasonably accurate spectral shape for SAT1. This implies that it is their offsets that are primarily responsible for the total flux scalings seen in figure 2(b), with the different values attributed to the difference in linear solver.
QuaLiKiz-qlk on the other hand does not exhibit this continuous variation, due to the comparatively simple functional form of its spectral shape. The majority of the scaling in the higher k y region can also be seen to approximately lie at the gyroBohm level for the potentials. These results indicate that neither SAT1 or the QuaLiKiz saturation rule are fully capturing the relevant physics that describes the variation in the saturated potentials with isotope, whether as a consequence of missing the variation in the low k y region, the correct offset from the gyroBohm result, or a combination of the two.
To summarise the findings of this section, then in order for quasilinear models to capture the isotope scaling of turbulent fluxes seen in nonlinear gyrokinetics, one must have a saturation rule that accurately predicts the spectral shape of the potentials around the peak, as well as captures the α A line characteristics observed in figure 5(b). The model must therefore have sufficient functional complexity to capture the variation of the scaling with k y in the low k y region, and an accurate prediction of the offset. A new saturation rule, derived in light of these observations, will now be considered.

New saturation rule
This section details the construction of the new saturation rule SAT3, which will for the first time recreate the properties of the isotope scaling seen in the previous section. A summary of the model is presented in section 5.7.
In the calculation of quasilinear fluxes, saturation rules are only strictly required to predict the 1D grid-independent saturated potentials, |δφ k y | 2 x,θ,t /Δk y , however in the interest of generality the two dimensional spectrum in k y and k x will first be considered, |δφ k x ,k y | 2 θ,t . One can obtain the 1D potentials needed from the 2D spectrum simply by summing the potentials over k x , as in equation (5).

2D spectrum
For all cases in the database observed and at all values of k y , some common features exist regarding the 2D potential slices in k x , a representative example of which is shown in figure 6. All spectra considered are approximately even functions about a single peaked value, and tend to 0 for large |k x |. Due to the absence of any symmetry-breaking effects [47] in the database, the spectra are always observed to peak at k x = 0. It has been shown however that in at least the case of non-zero E × B shear a shift in the peak can be produced [48][49][50], and so a non-zero peak position K(k y ) is considered in the following.
The foregoing observations can all be accommodated into a functional description of the spectra by Taylor expanding the inverse of the potentials with the form where C i (k y ) are k y -dependent Taylor coefficients yet to be determined. Evaluating equation (10) at k x = K, one finds the inverse of the potentials at the peak. By truncating the expansion at O(k 4 x ), taking out a factor of C 0 , relabelling the coefficients C i /C 0 → C i and inverting the equation one obtains The physical interpretation of the coefficients C 1 , C 2 and K can be determined from considerations of the first three k x moments of the 2D spectrum. The zeroth order moment is simply the 1D potential, given by equation (5). The first and second order moments define the mean value of k x k x and the radial width of the spectrum By approximating the summations over k x in equations (5), (12) and (13) as integrals via b x=a f (x i )Δx ≈ b a f (x)dx and using the expression given by equation (11) for the 2D potential spectrum, one can evaluate the resulting integrals over k x analytically (appendix E). Assuming a sufficiently small Δk x and a sufficiently large limit, one can show and thus equation (11) becomes The prefactor of the k 2 x term can be interpreted in relation to the normalised fall-off from the spectrum peak. Evaluating equation (15) at k x = k x ± σ k y , one finds: Equation (15) is similar to that used in TGLF's SAT1 and SAT2 [1,31], in which the potentials are modelled as a squared Lorentzian, which assumes σ 2 k y C 1 k y = 2 for all k y in all cases 6 . Equation (15) generalises this assumption, retaining a degree of freedom in the description of the fall-off of the spectrum. The measured NL CGYRO values of σ 2 k y C 1 k y for the GA-standard case in D are shown in figure 7(a) exhibiting strong variation against k y , particularly in the low k y region, indicating the value of this generalisation. The quality of adherence to the data for equation (15) is demonstrated in figures 7(b) and (c), using the radiallyextended GA-std D simulation. Calculating the moments from the raw data, it can be observed that the functional form holds extremely well over a wide domain and many orders of magnitude in range. To explore the different cascade regimes predicted by this equation, the simplifying notation here is introduced such that equation (15) is written: where and C = σ 2 k y C 1 k y . Assuming √ C 1, three distinct regions of scaling are predicted 7 : which are seen to be present in the data, as evidenced by the model-data agreement between the limiting regions marked in figure 7(c).
Having modelled the k x -dependence for the spectrum, the k y -dependence will now be considered. Four as-yet unmodelled k y -dependent quantities are present in equation (15): the 1D potential |δφ k y | 2 x,θ,t , the spectrum peak |δφ k x = k x ,k y | 2 θ,t , the peak position k x and the radial spectral width σ k y . In order to have a full 2D potential model, one must uniquely constrain these quantities by providing four equations describing them. This is the chosen method of TGLF SAT1 [31], for which one of the four equations is the squared Lorentzian assumption. However, flux calculations ultimately only use the 1D potential, |δφ k y | 2 x,θ,t , and so technically this is the only quantity that needs to be prescribed for a reduced model. This potential 7 As one reduces the value of √ C to ∼1, and further to √ C 1, the P 0 /CX 2 scaling is found to be suppressed. could be modelled directly as a 1D function of k y , such as is done by QuaLiKiz [51], but attempting to relate the remaining spectral quantities to one another first can help to establish a more firm physics basis.

Saturation equations
5.2.1. Radial spectral width parameterisation. The following relation was discovered predominantly via empirical experimentation with the database, and describes a seemingly fundamental relation between the zeroth radial moment and the second order radial moment of the 2D spectrum. It is observed that the zeroth moment is very well modelled by the equation for k y > 0, where c 0 and c 1 are case-dependent parameters. When allowing these parameters to be fitted to the NL CGYRO data, as shown for various cases in figure 8, it can be seen that the two curves adhere to one another extremely closely 8 . Per case, this constitutes only two fitted parameters for 39 data points, rendering there no question of over-fitting. Moreover, the measured value of the exponent c 1 is found to be strongly consistent across cases, as displayed in figure 9, with an approximate value of c 1 = −2.42 for the database. This observation is perhaps the most important point of this work, as it appears to describe a general and robust relation between two hypothetically-independent moments of the 2D potential spectrum. Somewhat counter-intuitively it suggests that the area under the radial spectrum, related to the zeroth moment, is independent of its peak value, and instead only depends on the width. This observed relation is understood to be novel, and a physical mechanism for why this is the case is yet to be put forward. It is this observation that forms the core of the new saturation rule.

Parameter considerations.
Given sufficient ability to model the case-dependent values of c 0 and c 1 , equation (19) provides one of the four equations necessary to uniquely define the 2D potential spectrum. However, for every parameter introduced by such equations, a linear-physics based model for its case-dependent calculation will be required. Because equation (19) implies that the 1D potential is independent of the peak spectrum value and k x , the choice is made in this work to model σ k y only. Doing so will leave the 2D spectrum with an arbitrary peak value and location, but will have both its zeroth and second moments defined.

Model for the radial spectral width.
To inform the model for σ k y , a representative plot is shown in figure 10. Equation (19) implies two important qualities of σ k y : firstly, that the characteristic peak observed in the 1D potentials should correspond to a minimum in σ k y at the same position in k y , which is indeed seen, and is denoted k min . This is the first time this observation has been made, in part due to the well-resolved fluctuation averages obtained from extended Figure 10. Example of σ ky ρ unit data obtained for the GA-standard case in D (blue). In red is the model for the widths given by equation (20), with the parameters a, b and c fitted to the data. Also marked are k min and k P , the positions of the minimum and piecewise connection point respectively. simulation times and the increased density of binormal gridpoints used in the database. The second quality is that, because the values of the potentials away from the peak are comparatively small and contribute negligibly to the overall flux, the accurate modelling of σ k y in this higher k y region is less fundamental to successful flux prediction.
To incorporate the observed minimum into the model a quadratic polynomial is used. A quadratic over the entire k y domain capturing this minimum can however become too large in the middle k y region, and so a piecewise function is constructed, with a first-order polynomial used past a certain point, k P . This is taken to be k P = 2k min . The two regions are connected by imposing continuity of the function, as well as continuity of the gradient, at k y = k P : Figure 11. Demonstration of the calculation of k max and γ max from a linear growth rate spectrum, here for the a/L n = 3.0 case in H. Starting from γ ky (solid black), one divides this spectrum by k y to obtain γ ky /k y (dashed green). The maximum of this curve is then found via quadratic interpolation, with the k y value at which this occurs being defined as k max . Returning then to γ ky , the value of this curve at k max is defined as γ max .

Figure 12.
Fitting exercise of the two quantities modelled as proportional to k max across the NL CGYRO database: k min (black circles), and c/b (red squares), used in defining the model for the spectral shape σ ky /σ ky=k 0 (equation (23)).
where a, b and c are coefficients to be modelled. Here the function is assumed to extend to infinity. By combining equations (19) and (20), one obtains an equation for the 1D potentials solely as a function of k y and the parameters {c 0 , c 1 , a, b, c}. To make progress in modelling these parameters, they shall first be defined in terms of more physically meaningful quantities.

Recasting the coefficients
The exponent c 1 is dimensionless and has already been fitted to the nonlinear database in figure 9, and so does not need to be recast. An expression for c 0 can be obtained by evaluating equation (19) at some given point k 0 , to be determined: (21) giving the overall saturation model as Equation (22) expresses the 1D potentials as a product of their magnitude at a given point k 0 , |δφ k y =k 0 | 2 x,θ,t /Δk y and a k y -dependent function describing the shape of the spectrum, σ k y /σ k y =k 0 c 1 . These quantities are denoted the saturation level and the spectral shape respectively. The variation with k y observed in the potential α A plot ( figure 5(b)) comes entirely from the spectral shape, with the offset mainly attributed to the saturation level.
As the spectral widths appear only in the above normalised form for the saturation model, a reduction in the degrees of freedom is provided. Assuming 0 < k 0 k P , one can divide through by one of the expansion coefficients in the spectral widths, chosen here to be b: and so the number of unknown coefficients reduces from 3 to 2, now requiring only the ratios a/b and c/b. A more transparent physical interpretation of a/b can be obtained by considering the definition of the minimum of the spectral widths. Imposing a minimum at k min in equation (20), one finds dσ k y dk y k y =k min = 2ak min + b = 0 and therefore a/b = −1/(2k min ). With the coefficients now recast, this leaves three quantities to be modelled, for a given choice of k 0 : k min , c/b, and |δφ k y =k 0 | 2 x,θ,t /Δk y . Note, as of yet no appeals to linear physics have been made, the model derived thus far has come solely from considerations of nonlinear gyrokinetic spectra. By taking this bottomup approach, the validity of the underlying functional forms of the saturation rule equations is guaranteed, up to the hypothetical quality observed when the parameters are fitted to the data. This foundation implies that as linear physics based approximations for c 1 and the three quantities above improve, the model will tend towards being as accurate as when fitted in this way.

Approximating the parameters from linear physics quantities
For TGLF's saturation rules SAT1 and SAT2, both of which are based on the saturation mechanism of zonal mixing, it is argued that the linear growth of the present plasma instabilities γ k y is damped by the competing influence of the zonal  ) and (b)). Subfigure (c) displays the marker type used for each case, with the labels corresponding to those shown in table 1. Data points of the same colour connected by a line indicate simulations of the same equilibrium but of different isotopes. In (a), the cases with a dominant ITG instability scale approximately with B 2 unit γ 2 max /k 5 max , however for those that are TEM-dominated (cases f, x, y and z, shown in (b)), a scaling of B 2 unit γ 2 max ρ unit /k 4 max is found to be in much better agreement, indicating a difference in saturation level between the two mode types. flow mixing k y V ZF , where V ZF is the zonal flow velocity. The wavenumber at which the drive-to-damping ratio of these two processes is maximised therefore provides an estimate of the location of the peak of the turbulent transport, as well as a characteristic length scale of the nonlinear saturation from the linear system. This wavenumber is denoted k max , and is defined by d dk y γ k y k y k y =k max = 0 (25) with the corresponding growth rate γ max providing an estimate of the characteristic time scale of the saturation A graphical example of the determination of these quantities for a given equilibrium is shown in figure 11. The new saturation rule parameters k min , c/b, and |δφ k y =k 0 | 2 x,θ,t /Δk y will now be modelled entirely from linear physics, by assuming proportionality to the dimensionallyconsistent combination of the characteristic linear quantities γ max and k max , as well as the equilibrium quantity B unit . Considering first the dimensionality of b and c individually from equation (20), one finds b to be dimensionless and c to have dimensions of k y , giving c/b dimensions of k y . The quantity k min also has dimensions of k y , and so per the method above these are both modelled as proportional to k max . The database-fitting exercise is carried out in figure 12, resulting For the saturation level |δφ k y =k 0 | 2 x,θ,t /Δk y , the value of k 0 is chosen to be k 0 = 0.6k min , as it is at this position that the model scatter for the following is found to be minimised. The dimensionally-consistent combination of linear quantities for the saturation level is ∝ B 2 unit γ 2 max /k 5 max . Note that this is a similar form to that derived from considerations of balance between linear growth and turbulent E × B advection [52].
When tested against the database, B 2 unit γ 2 max /k 5 max is found to be a good model for the cases in which the dominant mode is the ITG (figure 13(a)). However, for those with TEMs present, B 2 unit γ 2 max /k 4 max in dimensionless units is found to be in much better agreement, indicating a difference in physical saturation between the two mode types ( figure 13(b)). The TEM saturation in dimensional units is therefore assumed proportional to γ 2 max ρ unit B 2 unit /k 4 max . It is here explicitly that a large cause of the difference in isotope scaling between the two mode types can be seen, and the reason why previous saturation models have failed to recreate the isotope scalings seen in TEM-dominated turbulence.
To explain why this factor of k max ρ unit affects the isotope scaling, consider first this quantity in a system with adiabatic electrons. In such a situation, all quantities are exactly gyroBohm scaled, such that for three linear simulations in H, D and T the value of k max occurs at the same value of k y ρ i . Because ρ i ∝ √ A, one finds in the adiabatic electron case that k max ∝ A −0.5 . In the simulations of this work, for which kinetic electrons have been used, the scaling of k max with A generally becomes slightly more positive at around α A ≈ −0.3, which is sufficient to capture the differences between the isotope scalings of the two mode types' saturation levels.
The above kinetic electron physics will be captured in the saturation model by using two different saturation levels for the ITG and TEM, such that one has 12.7B 2 unit γ 2 max ρ unit /k 4 max for the TEM, and 3.3B 2 unit γ 2 max /k 5 max for the ITG, where the proportionality constants are taken from figure 13. In order to decide which saturation level to use for a given simulation, the linear physics must be considered to reveal whether the turbulence is ITG-or TEM-dominated. For SAT3, the ratio between the magnitude of the linear energy weights of the electrons and ions is used, |W L e,k y |/|W L i,k y | = |W L e,k y /W L i,k y |, due to the disparate behaviour in the ratio of the species' energy fluxes for the two mode types [53]. For ion-dominated turbulence, in which the electron energy fluxes are comparatively small, one expects the ratio of |W L e,k y /W L i,k y | to also be small. When in the regime of TEM turbulence however, the electron turbulent energy flux increases to approximately the level of the ions. The ratio of |W L e,k y /W L i,k y | can be calculated from linear physics, and so represents a useful metric to select between the two saturation levels. If new classes of modes were present, this aspect of the saturation rule would likely need to be revisited and extended further.
A plot of |W L e,k y /W L i,k y | for all cases in the database against k y /k max is shown in figure 14, with ITG cases in red and TEM cases in blue. It can be seen that the TEM cases cluster around a value of |W L e,k y /W L i,k y | ≈ 1, whereas ITG cases are mainly grouped around 0.4.
A transition function is now defined for the two modes, as a function of the weight ratio evaluated at k y = k max . At values below |W L e,k y /W L i,k y | k y =k max = 0.8 the ITG scaling is used, and at values of 1 and above the TEM scaling is used, with a firstorder polynomial in-between to connect the two regions. The mode transition function M(x; x 1 , x 2 , y 1 , y 2 ) is introduced such that which allows the potentials evaluated at k 0 to be written It can be shown that measuring the sign of the linear frequency at a position in k y can also be used to differentiate between the two mode types, however the above was chosen to attempt to avoid discontinuous changes in flux at the point of a mode transition.

Model extensions
Modern quasilinear models are required to include the effects of equilibrium E × B shear, a mechanism that can both generate turbulent momentum transport, as well as suppress turbulence in other channels. This effect is incorporated through an equilibrium input parameter, the shearing rate γ E , and the consequences of non-zero E × B shear must be explicitly defined in the saturation rule. Although no such cases were considered in the database, one can simply isolate the E × B effect of a previous rule and incorporate it into the new model. Using an apostrophe to denote said previous rule, taken in this work to be that of TGLF's SAT2 [1], one defines the effect of shear F s,k y as the ratio of flux components with shear to those without The flux components of the new saturation rule are then obtained simply by multiplying F s,k y by the flux components of the new rule without flow shear, Q s,k y (γ E ) = F s,k y (γ E )Q s,k y (γ E = 0). Taking the ratio of the flux components as opposed to the potentials bypasses any complications arising from differences in definitions between the two saturation rules. Looking at TGLF SAT2, one finds which is the function incorporated into SAT3. HereW L s,k y ,k x is the quasilinear weight defined by TGLF, which is evaluated at a single k x rather than a sum over k x . The two constants are α x = 1.21 and σ x = 2, with k x0 = 0.32k y k max /k y 0.7 γ E /γ max and where g xx = |∇x| 2 .

Mixed plasmas.
For all simulations in the database of this work a pure plasma was used. For generality, quasilinear models must be able to operate with multiple ion species. The only aspect of SAT3 that is explicitly affected by this generalisation is the ratio of the magnitude of the linear weights |W L e,k y /W L i,k y |, used as an argument in the mode transition function, as this becomes ambiguously defined in a plasma with multiple ion species. This ambiguity is resolved by changing the denominator of the linear weight ratio to the sum over ion weights, |W L e,k y /W L i,k y | → |W L e,k y / i W L i,k y | for ion species i. By conducting a scan over relative concentrations of D and T in the GA-std and a/L n = 3.0 cases linearly (figure 15), this new ratio is shown to be invariant with relative density and thus recreate the expected behaviour.

Quasilinear approximation functions
Having described the saturated potentials, the QLA functions are now considered, defined in equation (7). In general for electrostatic turbulence there are 3n s of these functions, where n s is the total number of species present, with one existing for each combination of the 3 velocity moments and species. Historically these functions have seen a comparatively small amount of focus compared to the potentials, and are typically modelled as a constant [51]. By plotting these functions explicitly from the CGYRO data, one can determine to what degree a constant QLA is a reasonable assumption to make.
All cases in this database use kinetic electrons and a single ion species, giving 6 functions. Of these 6, the two momentum QLA functions are trivially zero, due to there being no momentum transport in the cases considered, and the two particle functions are identical, as a consequence of ambipolarity. This leaves three as non-trivial: the two energy functions for the ion and electrons, and one of the particle functions.
Plots of these three QLA functions against k y ρ i are shown in figure 16 for all cases in the database. The vast majority of the energy functions for both species have a similar shape across the spectrum, and most importantly exhibit relatively small variation in the region where the flux components are largest (k y ρ i ∼ 0.2, evidenced in figure 3(a)). A similar description is seen for the particle function, although with more sporadic variation in some cases. The assumption that the QLA could be modelled as constant was found to be a reasonable approximation for the database, with the constants for the model being set by minimising the scatter between the NL CGYRO flux data and the CGYRO-SAT3 flux data. The values of these constants for the electron and ion fluxes were found to be similar, however were stratified by mode type and moment. These are Λ Γ ITG = 1.1, Λ Γ TEM = Λ Q TEM = 0.6 and Λ Q ITG = 0.75. To capture these differences, the QLA functions are expressed in terms of the mode transition function (equation (27)), such that for the particle flux Any other forms of transport are assumed to have Λ s,k y = 0.8. While a mode-dependent constant remains a reasonable model for the contribution of the QLA functions to the overall flux in this database, it is not perfect, missing for example the isotope scaling seen in figure 4. In future studies more exotic tokamak equilibria may cause the QLA functions to deviate further from those seen here, potentially necessitating an effort to try to predict their shapes from linear physics.

Summary of new saturation model, SAT3
The entirety of SAT3 is collected here for reference. The fluxes of the model are constructed via where Λ s,k y is the QLA function, W L s,k y is the linear weight for species s, F s,k y describes the effect of E × B shear, |δφ k y | 2 x,θ,t /Δk y is the saturated potential, and Δk y is the binormal grid spacing of the simulation. The model for the saturated potentials is    (35) where M is the mode transition function which captures the disparate saturation levels between ITG-and TEM-dominated turbulence. The wavenumber k max is defined as the position of the maximum of the linear growth rate divided by k y , and γ max = γ k y =k max . The reference magnetic field B unit is given by equation (3), with ρ unit = √ m D T e /eB unit , and where c/b = −0.751k max , k min = 0.685k max , k 0 = 0.6k min and k P = 2k min . The QLA functions Λ s,k y vary depending on velocity moment and mode type. For the particle flux and energy flux these are with other fluxes taking Λ s,k y = 0.8. Finally, the function F s,k y describes the effect of E × B shear and is given by [1] whereW L s,k y ,k x is the quasilinear weight defined by TGLF, which is evaluated at a single k x rather than a sum over k x . The two constants are α x = 1.21 and σ x = 2, with k x0 = 0.32k y k max /k y 0.7 γ E /γ max and where g xx = |∇x| 2 and B is the equilibrium magnetic field magnitude.

Results
The scatter plots of the fluxes obtained from NL CGYRO against the results of CGYRO-SAT3 are shown in figure 17 for the ion energy fluxes, electron energy fluxes and particle fluxes. TEM-dominated cases are marked by circles (labelled f, x, y and z in figure 13(c)), with the remainder being ITGdominated. Data points connected by a line denote simulations of the same equilibrium but of different isotopes. A metric for the quality of the model agreement can be calculated by taking the average percentage error, for the energy flux and particle flux, where N is the number of simulations in the database. The values obtained for the three plots are displayed in their respective subfigures. For comparison, the equivalent scatter plots for a rescaled CGYRO-SAT1 model are shown in figure 18, with additional fitted prefactors for each flux type, labelled CGYRO-SAT1 * . 9 Looking at figures 17 and 18 a reduction in the average percentage error between CGYRO-SAT1 * and CGYRO-SAT3 is seen for the three flux types. CGYRO-SAT1 * and CGYRO-SAT3 can be seen to perform similarly in the ITG-dominated cases, as may be expected, however great improvement is shown in the isotope scaling and magnitude of the TEM cases, owing to the modelling of the difference in saturation level between the two mode types present in SAT3. This is demonstrated explicitly in figure 19, which exhibits the ion and electron energy fluxes against isotope mass for the ITG-dominated GA-std case and the TEM-dominated a/L n = 3.0 case compared between the three models.
A selection of energy flux scans with various key tokamak parameters is shown in figure 20, comparing NL CGYRO and CGYRO-SAT3. The density gradient scan is a recreation of a subset of the data points from figure 1, now with larger NL CGYRO fluxes, demonstrating the recreation of the positive isotope scaling for the ITG-dominated GA-std cases at low density gradient, the grouping of the fluxes for the transition case, and the anti-gyroBohm scaling at the high density gradient TEM-dominated a/L n = 3.0 case. How this a/L n = 3.0 case varies with collisionality is then displayed in figure 20(b), from which it is seen that the correct scaling is maintained across a large range of collisionalities. Finally the ion and electron heat fluxes against matched temperature gradients (a/L T i = a/L T e ) are shown in figure 20(c). The general trend in both isotope scaling and magnitude can be observed to agree with the NL CGYRO data, however the fluxes appear to be somewhat under-predicted near the point of threshold. The generality of SAT3's behaviour in this region presents an area to be investigated further, as this is a key region of parameter space for experimental conditions.
To connect these results with the observations made in section 4, figure 21 shows a recreation of figure 3(b), exhibiting the isotope scaling of the ion energy flux components Q i,k y in the a/L n = 3.0 case for the different models, now with the data for CGYRO-SAT3 included. The improved low k y ρ i variation and offset in the isotope scaling can be seen, as a consequence  figure 13(c). The line to denote perfect agreement between the models is shown in black, with the respective relative errors also shown on each plot, defined by equations (43) and (44).  of the strong spectral shape and the differing TEM saturation level in SAT3.

A note on future improvements
The SAT3 model presented above is an accurate model for the enhanced CGYRO database generated, and is promising for extrapolation within the most common ITG and TEM turbulence regimes found in experiment. It does not however represent a complete description of turbulent plasma transport, and will of course perform less reliably in a parameter space far from its training database. The approach taken in SAT3's development however naturally presents a methodology with which to diagnose and improve on such future discrepancies. That is, for some hypothetical simulation that this model catastrophically fails to recreate, an established methodology can be employed to effectively diagnose and attempt to rectify the issue.
This approach starts by taking care to preserve the role of each physical element that constitutes the model flux calculation, so as to keep separate the contributions of the QLA, the  weights, and the potentials. This decomposition of the reduced model clarifies which aspect of the reduced model is responsible for the recreation of each part of the nonlinear flux, and each can be considered in turn when attempting to diagnose future model disagreements with NL gyrokinetic results.
If the discrepancy is found to originate in the saturation rule, then one can first test whether the underlying functional relations (equations (19) and (20)) still hold when their parameters are fitted to nonlinear data. If not, these will need to be amended. Otherwise, one continues to the question of the linear modelling of the parameters c 1 , k min , c/b and |δφ k y =k 0 | 2 x,θ,t /Δk y , each of which can be considered modularly. If a more robust method for approximating k min from the linear data is discovered in the future for example, one can replace/amend that aspect of the model without requiring change elsewhere.

Conclusion
The isotope scaling of fluxes in local nonlinear gyrokinetics and quasilinear models has been considered. It was confirmed that existing quasilinear models generally struggle to capture the isotope scaling of fluxes seen in nonlinear gyrokinetic simulations, in part due to the historical lack of focus on isotopes other than deuterium in the development of their saturation rules and an incomplete understanding of the isotope effect. The origin of this non-trivial isotope scaling in nonlinear gyrokinetics was demonstrated to originate primarily in the magnitude of the saturated electrostatic potentials, with current quasilinear models failing to replicate this behaviour due to missing one or both of the following effects in their saturation rules: the variation in the isotope scaling with k y in the low-k y region, and the 'offset' from the gyroBohm-predicted level, which is sensitive to the dominant mode type.
In developing a new saturation rule to capture this behaviour the general form of the 2D potential spectrum in k x and k y was first considered, generalising the previous assumption of a squared Lorentzian form. The 2D model adhered to the data extremely well across a range of decades in radial wavenumber.
The new 1D saturation model SAT3 was then developed on a database of 43 simulations, including both ITG-and TEM-dominated cases. This database is similar in scope to those used for previous TGLF saturation models, however now includes simulations in different isotopes. A summary of the SAT3 model is given in section 5.7. An accurate model for the spectral shape was built using a robust novel relationship between the 1D potentials |δφ k y | 2 x,θ,t /Δk y and the radial spectral widths σ k y that was discovered in the NL gyrokinetic simulation results, which appears to capture a conserved quantity in electrostatic turbulence. The resulting parameters of SAT3, k min , c/b, c 1 and |δφ k y =k 0 | 2 x,θ,t /Δk y were modelled using linear quantities inspired from a previous work. The database demonstrated different saturation levels for ITG and TEM turbulence, which motivated a saturation model for each mode type with a transition function between them. The QLA functions were considered explicitly, however ultimately constant factors to discriminate between mode type and moment were used.
The new model has been shown to better capture the isotope scaling of the cases in the database, particularly in the cases of TEM-dominated turbulence, while performing at least as well as existing quasilinear models in other parameters, as exhibited via comparison with CGYRO-SAT1 * in figures 17 and 18. Generalisations of the model outside of the dataset have been implemented, namely the effect of E × B shear and multi-ion plasma operation. Having been constructed from a database of predominantly ion-scale, electrostatic core plasmas, dominated by either ITG or TEM turbulence, SAT3's applicability is naturally most valid in these regions of parameter space. Caution should therefore be exercised in the use of SAT3 away from these areas, such as plasmas for which electromagnetic effects or electron-scale dynamics are expected to be significant, in the pedestal, or in the presence of dominant modes other than the ITG or TEM. Generalisation of SAT3 into these regimes will require comparison to additional nonlinear gyrokinetic simulations. An algorithm for performing such generalisations, based on the systematic comparison of the constituent parts of SAT3 to nonlinear gyrokinetic data, is discussed in section 6.1.
In future work SAT3 will be paired with the linear solvers of TGLF and QuaLiKiz for use in integrated modelling suites, to be validated against experimental data from recent isotope campaigns on JET and other machines. While a degree of error will be incurred in flux prediction by moving from an exact linear solver to a fast linear solver, this effect is expected to be small, with fast linear solvers having shown good agreement with linear gyrokinetics in the most common experimentallyrelevant regimes [28]. Upon carrying out the averages in x and y, the orthogonality of the exponentials leaves only the terms with k x = −k x , k y = −k y : Because the quantities involved in this calculation are real, their Fourier components satisfy δf −k x ,−k y = δf * k x ,k y , which when applied to the potentials gives where V k x ,k y has been introduced to reduce clutter in the following. Observing V k x ,k y =0 = 0, equation (B.4) may be written Using the result k x V k x ,−k y = k x V −k x ,−k y , as well as the property V * k x ,k y = V −k x ,−k y : This may be expressed more succinctly using the definition of the real part of a complex number, Re[z] = (z + z * )/2: Here the real property of the fields has introduced a symmetry to the terms, such that the flux contribution from the negative binormal modes is equal to that of the positive. In order to express the right-hand side of equation (B.8) in terms of the magnitude of the electrostatic potentials and the phase difference between the fluctuations, one can define Z s,k x ,k y such that for every Fourier mode δp s,k x ,k y = Z s,k x ,k y δφ k x ,k y . Multiplying both sides of this relation by δφ * k x ,k y , taking the complex conjugate and isolating the imaginary part yields Im[δp * s,k x ,k y (θ, t)δφ k x ,k y (θ, t)] = Im[Z * s,k x ,k y ]|δφ k x ,k y | 2 . The total flux may therefore finally be written which is equation (2).

Appendix C. Parseval's theorem
Parseval's theorem relates the average of an absolute squared function in real space to the sum of the absolute squares of its where k x = 2πn x /L for all integers n x . Multiplying both sides by f * (x) and averaging over the domain leaves only the k x = k x terms, producing the result.

Appendix D. Grid-resolution dependence
The flux Q s in equation (2) is a physical quantity, resulting in part from a spatial average over the x and y domains. It should therefore be independent of the binormal wavenumber grid resolution Δk y , once past a certain value required for convergence. It then follows that the flux components Q s,k y must depend on the grid resolution, as if one doubles the resolution, one doubles the number of terms in the sum while maintaining a constant total.
Multiplying and dividing the right-hand side of equation (2)  and thus because the area under Q s,k y /Δk y is independent of the grid resolution, so too must Q s,k y /Δk y be itself. Analogously for the potentials, the volume and time average of the squared potentials in real space must be independent of Δk y . Using Parseval's theorem, and therefore |δφ k y | 2 x,θ,t /Δk y must also be independent of Δk y . Now using equation (5) resulting in |δφ k x ,k y | 2 θ,t /Δk y Δk x being the grid-independent version of the 2D potentials.

Appendix E. 2D spectrum moment integrals
Here the first three moments of the 2D potential spectrum (equations (5), (12) and (13)) are integrated analytically to obtain expressions for the coefficients C 1 , C 2 and K in equation (11). Starting with the zeroth moment, defined by equation (5), then by approximating the summation as an integral and inserting equation (11) for the 2D potential spectrum, one obtains (E.1) Taking out a factor of C 2 from the denominator, the integral may be written where D = |δφ k x =K,k y | 2 θ,t /(C 2 Δk x ), E = 1/C 2 , F = C 1 /C 2 and u = k x − K. The denominator can then be factorised via E + Fu 2 + u 4 = u 2 + G u 2 + H , where and F = G + H, E = GH. Note that for the integral to have no singularities one requires G, H > 0, and hence E, F > 0. This factorised form can then be separated using partial fractions as Using the substitution u = √ H tan(v) and u = √ G tan(v) respectively for the two integrals, equation (E.3) evaluates to Squaring this equation, using the relations GH = 1/C 2 , G + H = C 1 /C 2 and re-arranging, one finds