Redundancy-selection trade-off in phenotype-structured populations

Realistic fitness landscapes generally display a redundancy-fitness trade-off: highly fit trait configurations are inevitably rare, while less fit trait configurations are expected to be more redundant. The resulting sub-optimal patterns in the fitness distribution are typically described by means of effective formulations. However, the extent to which effective formulations are compatible with explicitly redundant landscapes is yet to be understood, as are the consequences of a potential mismatch. Here we investigate the effects of such a trade-off on the evolution of phenotype-structured populations characterised by continuous quantitative traits. We consider a typical replication-mutation dynamics, and we model redundancy by means of two-dimensional landscapes displaying both selective and neutral traits. We show that asymmetries of the landscapes generate neutral contributions to the marginalised fitness-level description that cannot be described by effective formulations, nor disentangled from the full trait distribution. Rather, they appear as effective sources, whose magnitude depends on the geometry of the landscape. Our results highlight important new aspects of the nature of sub-optimality. We discuss practical implications for rapidly mutating populations such as pathogens and cancer cells, where qualitative knowledge of their trait and fitness distributions can drive disease management and intervention policies.


Introduction
Understanding the interplay between neutrality and selection is considered one of the major challenges in the contemporary theory of biological evolution [1,2,3,4,5], aiming to bridge the gap between two historically antipodal theories [6]. When neutrality is considered concomitantly with selection, sub-optimal behaviours that cannot be captured by purely neutralist or selectionist approaches are expected to emerge from their interplay [7,8,9,10,11]. Less fit phenotypes are able to outperform the fittest ones if they are endowed with higher 'mutational robustness' due to some degree of neutrality. This effect is sometimes referred to as the 'survival-of-the-flattest' effect, in iconic opposition [...] ulated (and not the fittest one only). In the polymorphic regime, it is possible to map the low-level genotype dynamics onto the high-level phenotype dynamics only if mutations satisfy a specific condition [22], namely that their rates depend only on the resulting (mutant) phenotype, regardless of the starting (parent) genotype. Although this demanding condition holds for many models of molecular phenotypes, the implications of its violation are much less clear [23].
Phenotype-structured populations belong to the polymorphic category. In such populations, individuals are characterised by (typically) one quantitative trait which is related to reproductive success (fitness) [24]. A common way to model phenotype-structured populations is to describe the quantitative trait of interest by a continuous variable (although discrete versions are possible). Then, mutations are often described by diffusion operators acting on the space of phenotypes. Such properties allow the deterministic mutation-selection dynamics of the population to be described by means of integro-differential equations.
However, diffusion-like mutations do not generally satisfy the special condition [22]; hence, in the presence of a degenerate mapping, the two levels of description (phenotypes and fitness) cannot be disentangled and are likely to be different, thus conveying potentially different information about the evolutionary state of the system. In this work, we will study the interplay between neutrality and selection in such rapidly mutating systems. Phenotypes will be composed of both selective traits (on which fitness depends) and neutral traits (on which it does not), so that the dynamics will be captured by simple fitness landscapes featuring redundancy. Redundancy will be minimally modelled by considering two-dimensional landscapes, where a selective and a neutral trait interact by virtue of a universal redundancy-selection trade-off. Nonetheless, the nature of such trade-offs will be mechanistically different: in the symmetric case, neutrality stems from the property that fitness is given by a combination of the traits composing the phenotype, such combination being degenerate; in the asymmetric case, instead, neutrality stems from explicitly considering a completely neutral trait concomitantly with a completely selective trait. Then, redundancy is due to the inherent geometry of the resulting phenotype space, rather than to the degeneracy of the fitness function. For these reasons, we consider the two cases to be suited to qualitatively distinct biological contexts: for instance, the symmetric landscape dates back to the Fisher Geometric Model and has been widely employed in the field of molecular evolution, where the existence of a target optimal configuration of traits is assumed, and any mutation away from it is deleterious [25,26,27].
In this work, we will compare phenotype and fitness distributions of populations evolving on both symmetric and asymmetric landscapes. We will derive exact equations governing the resulting fitness dynamics, and compare them to effective formulations. We will show that, although the fitness distribution on asymmetric landscapes resembles that on symmetric ones, the nature of the two marginal dynamics is crucially different. In particular, we will demonstrate that, in the presence of asymmetries between selective and neutral traits, the landscape's geometry generates contributions that cannot be captured by effective formulations. Finally, we will discuss some biological contexts where a proper characterisation of neutral contributions to marginal dynamics may be of crucial importance.

Redundant fitness landscapes.
In molecular evolution, redundancy of genotype-phenotype maps stems from the basic fact that the number of possible genotypes is much larger than that of observed phenotypes, so that such maps must be degenerate. These mappings are also generally strongly biased: some phenotypes are encoded by very few genotypes, whereas most genotypes are organised in networks (that is, sets of genotypes connected by a single mutation) that are neutral (i.e. equally fit), as they map onto the same few phenotypes [28,29]. It has been argued that this bias should be regarded as a universal feature of any kind of fitness landscape [30]: ultimately, highly fit individuals are so because they have a phenotype better suited than others to their environment, but such higher functionality will stem from a 'specific' (possibly rare) genomic configuration.
Hence, a trade-off holds between redundancy and fitness, so that very fit phenotypes are typically not also highly redundant. Indeed, in their iconic two-dimensional representation introduced by Wright [31], smooth fitness landscapes exhibit a hill-shaped topography: every phenotype is assigned a height proportional to its fitness, hence the optimum is represented by the top of the hill (see panel a of Fig. 1, adapted from [32]).

Neutrally related phenotypes, i.e. those sharing the same fitness value, are located at the same height, so that a height contour represents a neutral subset.
Since the length of a contour (i.e. the size of the neutral subset) grows with distance from the summit, very fit phenotypes are rare, whereas less fit ones tend to be more abundant. Hence a redundancy-fitness trade-off occurs, akin to that of genotype-phenotype maps.
In order to account for the redundancy-fitness trade-off, we shall consider two-dimensional landscapes, but generalisations to higher dimensions are possible. Let P_2 be the phenotype space, and its elements p = (x, y) ∈ P_2 be the possible phenotypes; the components x and y represent the values of the two quantitative traits defining the phenotype. Each phenotype p maps onto its corresponding fitness value f = F(p) according to the smooth fitness function F(p); the particular choice of F(p) determines the fitness landscape of the system. Two phenotypes p and q are defined to be neutrally related if they share the same fitness, that is if F(p) = F(q). Then, a neutral subset with fitness value f is the collection of all neutrally related phenotypes p with fitness F(p) = f. For the sake of simplicity we will consider only single-peak landscapes, which have been employed in a variety of biological contexts [33], the study of more complex topographies going beyond the scope of this work.

Redundancy of the landscape is ultimately due to the degeneracy of the fitness function F. Here, we shall compare two possible versions of such degeneracy, symmetric (panel b of Fig. 1) and asymmetric (panel c of Fig. 1). In panel b of Fig. 1, phenotypes are identified by the trait coordinates p = (x, y). However, their fitness F(p) depends only on the distance r(x, y) from the centre. Phenotypes lying on the circle of radius r share the same fitness value regardless of their angular position θ, thus forming neutral subsets. Hence, from the pair of trait variables x and y, we can construct a pair of (respectively) selective and neutral variables (r, θ), with which both the phenotype and the fitness dynamics can be described. The phenotype distribution of a population evolving on the symmetric landscape is described by the function n(x, y) in the original trait coordinates, or equivalently by n(r, θ) in the corresponding polar coordinates.
Given the circular symmetry, the marginal fitness distribution N_s(r) is obtained by integrating the phenotype distribution over the angular coordinate θ, which yields the radial distribution. We remark that the landscape exhibits the aforementioned redundancy-fitness trade-off, as the size of neutral subsets varies (linearly in our minimal model) in opposition to fitness.
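As a minimal numerical sketch of this marginalisation (not part of the original analysis), the Python snippet below integrates a polar density n(r, θ) over θ, including the Jacobian factor r, to obtain the radial distribution. The isotropic Gaussian density, its width `sigma`, and all function names are illustrative assumptions; the Gaussian is not the stationary solution of the model.

```python
import math


def radial_marginal(density, r, n_theta=360):
    """Integrate a polar density n(r, theta) over theta, including the Jacobian r."""
    d_theta = 2.0 * math.pi / n_theta
    return sum(density(r, k * d_theta) for k in range(n_theta)) * r * d_theta


def gaussian_density(r, theta, sigma=0.2):
    """Illustrative isotropic density peaked at the optimum r = 0 (an assumption)."""
    return math.exp(-r * r / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


radii = [i / 200 for i in range(201)]  # radial grid on [0, 1]
marginal = [radial_marginal(gaussian_density, r) for r in radii]
peak_r = radii[max(range(len(radii)), key=marginal.__getitem__)]
print(f"density peaks at r = 0, radial marginal peaks at r = {peak_r:.3f}")
```

With this toy density the phenotype distribution is maximal at the optimum r = 0, whereas the radial marginal, proportional to r exp(-r²/2σ²), is maximal at r = σ. The factor r is the linearly growing size of the neutral subsets.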

In the asymmetric case, we assume that the traits x and y directly express, respectively, selective and neutral effects. So the x axis will represent the selec- [...] The size of neutral subsets depends on the choice of B(x): taking a monotonically decreasing function of x leads to the desired redundancy-fitness trade-off, equivalent to that of the symmetric landscape.

Replicator-Mutator Equation (RME).
The deterministic integro-differential formulation of the mutation-selection dynamics dates back to the 'continuum-of-alleles' model introduced by Crow and Kimura [34,35], and can be derived from stochastic mechanistic models via appropriate continuum limits [24,36]. Throughout the work, with the generic term 'individuals' we refer to the replicating units displaying phenotypic heterogeneity, upon which natural selection and mutations act, be they RNA sequences, bacteria or more complex forms of life.
We consider an infinite asexual population. Finite-size effects, leading to genetic drift, are thus neglected. The state of the population at time t is determined by the phenotype distribution n(p; t). Individuals change their phenotype due to mutation and selection: changes due to mutations are modelled by the Laplacian operator ∇², that is, the local diffusion operator acting on the phenotype space P_2, with mutation coefficient µ; concomitantly, changes due to selection occur at rate γ, and are modelled by the usual replicator term popular in Evolutionary Game Theory [37]. The deterministic temporal evolution of the phenotype distribution n(p; t) for a large population is given by the Replicator-Mutator Equation (RME henceforth): [...] subject to the conditions ∫_{P_2} n(p; t) dp = 1 and [...], with F[n(p; t)] denoting the average fitness of the population at time t: [...] The conditions in Eq. 4 correspond to the two physical constraints satisfied by the system: conservation of the total population at every time, because neither mu- [...]
The mathematical conditions under which the RME has stationary solutions have been extensively studied [38,39]. However, explicit analytical solutions are rare because they are hard to obtain (see e.g. [40,41,42] [...]
Note that, although Eq. 3 contains the timescale γ^-1 and the diffusive coefficient µ, the stationary solution depends on only one relevant parameter, δ = γ/µ, which determines the relative importance of selection and mutation. In the following, we will make simplifying assumptions for the space P_2 and the fitness function F(p), in order to facilitate analytical calculations on the model. This will allow us to derive useful forms for both the phenotype and the marginal fitness distributions, and compare the differences between symmetric and asymmetric landscapes.
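To make the structure of the dynamics concrete, the following Python sketch integrates a one-dimensional version of the equation described above. It assumes the standard form ∂n/∂t = µ ∂²n/∂x² + γ n (F(x) − F̄), with F̄ the population-average fitness, a linear fitness F(x) = x on [0, 1], and no-flux boundaries; this form is inferred from the verbal description (Laplacian mutation term plus replicator term) and is an assumption, as are the grid sizes, the explicit Euler scheme and the function names.

```python
def rme_step(n, fitness, mu, gamma, dx, dt):
    """One explicit Euler step of the assumed 1D RME with no-flux boundaries."""
    mean_fitness = sum(f * v for f, v in zip(fitness, n)) / sum(n)
    last = len(n) - 1
    updated = []
    for i, value in enumerate(n):
        left = n[i - 1] if i > 0 else value      # mirrored ghost cell: zero flux
        right = n[i + 1] if i < last else value
        diffusion = mu * (left - 2.0 * value + right) / dx ** 2
        selection = gamma * value * (fitness[i] - mean_fitness)
        updated.append(value + dt * (diffusion + selection))
    return updated


def relax(mu, gamma, cells=40, steps=8000):
    """Evolve a uniform initial profile on [0, 1] with the assumed F(x) = x."""
    dx = 1.0 / cells
    dt = 0.2 * dx ** 2 / mu                      # below the explicit stability limit
    x = [(i + 0.5) * dx for i in range(cells)]   # cell centres
    n = [1.0] * cells                            # uniform density, total mass 1
    for _ in range(steps):
        n = rme_step(n, x, mu, gamma, dx, dt)
    return x, n, dx


x, n, dx = relax(mu=1.0, gamma=10.0)
mass = sum(n) * dx
mean_fitness = sum(xi * ni for xi, ni in zip(x, n)) * dx
print(f"mass = {mass:.6f}, mean fitness = {mean_fitness:.3f}")
```

In this discretisation both terms conserve the total population: the mirrored boundary cells make the diffusive fluxes cancel, and the replicator term sums to zero by construction. Because only the products µ·dt and γ·dt enter the update, rescaling µ and γ by a common factor leaves the relaxed profile unchanged, in line with the statement that the stationary state depends only on δ = γ/µ.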

Simulations
All the analytical results are confirmed by simulating the corresponding finite-size stochastic agent-based dynamics. As expected, consistency with the deterministic description is obtained when the population size is very large (of the order of 10^5 individuals). The study of finite-size effects is possible [46], although it goes beyond the scope of the paper. Simulations have been performed with the Java-based language "Processing", and detailed information can be found in the Supplementary Material, section E. The Processing codes are freely available here.
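For orientation only, the Python sketch below shows one possible agent-based scheme of this kind in one dimension. It is not the authors' Processing code, whose details are given in the Supplementary Material. The pairwise-imitation rule is an assumption, chosen because its large-population limit is a replicator term γ n (F − F̄); the fitness F(x) = x, the reflecting walls, the parameter values and the function name `simulate` are likewise hypothetical.

```python
import math
import random


def simulate(size=2000, mu=0.02, gamma=2.0, dt=0.01, steps=300, seed=1):
    """Hypothetical agent-based scheme: diffusive mutation plus pairwise imitation."""
    rng = random.Random(seed)
    traits = [rng.random() for _ in range(size)]  # uniform initial traits in [0, 1]
    jump = math.sqrt(2.0 * mu * dt)               # std of one diffusive mutation step
    for _ in range(steps):
        previous = traits[:]                      # synchronous update from old state
        for i in range(size):
            x = previous[i]
            rival = previous[rng.randrange(size)]
            # Selection: copy a fitter rival with probability gamma*dt*(F_rival - F_i),
            # using the assumed fitness F(x) = x (never triggers for a less fit rival).
            if rng.random() < gamma * dt * (rival - x):
                x = rival
            # Mutation: Gaussian jump, reflected at the walls (no-flux boundaries).
            x += rng.gauss(0.0, jump)
            if x < 0.0:
                x = -x
            elif x > 1.0:
                x = 2.0 - x
            traits[i] = x
    return traits


traits = simulate()
print(f"mean fitness = {sum(traits) / len(traits):.3f}")
```

The population size is fixed by construction, since individuals only change trait and never enter or leave. The default size of 2000 keeps the run short; approaching the deterministic limit quoted above would require a much larger population.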

Trait distribution on non-redundant landscapes.
Let us first consider a simple one-dimensional case where the fitness landscape is not redundant. This case will provide the baseline results for comparison with the dynamics on redundant landscapes, to elucidate the effects of the redundancy-fitness trade-off.

Let the variable x ∈ P_1 = [0, 1] be the single quantitative trait of interest. Let F(x) be a non-degenerate monotonically increasing function, such that x = 1 represents the optimal trait and x = 0 the least fit one. Clearly, since F(x) is not degenerate, the corresponding fitness landscape is not redundant; [...]

Trait distribution on redundant landscapes.
In redundant landscapes, the phenotype distribution n(x, y; t) evolves in time according to the two-dimensional RME. In general, it is not possible to find an exact closed-form solution for the stationary distribution. However, in some cases it is possible to obtain spectral solutions. In the following, we shall consider an asymmetric landscape with triangular shape, that is for [...] In Fig. 3, we explore the differences between the phenotype distributions n(x, y) and the marginal fitness distributions N_{a,s}(f), at stationarity. The former describes the full distribution of traits over the two-dimensional space P_2.
By contrast, the latter describes the one-dimensional distribution of fitness values f, and is obtained by integrating the former over the neutral variables.
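The integration over the neutral variable can be sketched numerically as follows. The snippet assumes a trait space bounded by 0 ≤ y ≤ B(x) with a linear boundary B(x) = 1 − x (one possible triangular shape) and uses an unnormalised toy density that grows towards the optimum x = 1; the domain, the boundary, the density and the function names are illustrative assumptions rather than the model's stationary solution.

```python
import math


def marginal_fitness(density, boundary, x, n_y=400):
    """Midpoint-rule integral of n(x, y) over the neutral trait y in [0, B(x)]."""
    dy = boundary(x) / n_y
    return sum(density(x, (j + 0.5) * dy) for j in range(n_y)) * dy


def triangular_boundary(x):
    """Assumed linear boundary B(x) = 1 - x, giving a triangular trait space."""
    return 1.0 - x


def toy_density(x, y, s=4.0):
    """Unnormalised illustrative density that increases towards the optimum x = 1."""
    return math.exp(s * x)


grid = [i / 100 for i in range(101)]  # selective trait x (equal to fitness f here)
profile = [marginal_fitness(toy_density, triangular_boundary, x) for x in grid]
peak_x = grid[max(range(len(grid)), key=profile.__getitem__)]
print(f"density is largest at x = 1, marginal is largest at x = {peak_x:.2f}")
```

For this toy density the marginal is proportional to (1 − x) exp(s x), which is maximal at x = 1 − 1/s: the shrinking neutral subset near the optimum moves the peak of the fitness distribution to an intermediate value even though the trait density itself is largest at x = 1.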
In panels a-d of Fig. 3, we plot the analytically obtained phenotype distributions on the trait plane (x, y): for the asymmetric case, the iso-density with [...] where B_k(z) is the k-th Bernoulli polynomial of the variable z. This approximation then predicts that the average fitness of the population φ at stationarity increases linearly with selection pressure, according to: [...] This approximation also predicts the emergence of intermediate local maxima and minima in the marginal fitness distribution for δ_th ≈ 14 (see Sup- [...]

Marginal fitness dynamics
For the symmetric landscape, the dynamics of the marginal fitness distribution N_s(f) is obtained by taking the time derivative of Eq. 1 and substituting the corresponding RME (details in the Supplementary Information, section D). We find [...] with [...] For an asymmetric landscape of general boundary B(x), the dynamics of the marginal fitness distribution N_a(f; t) is obtained by taking the time derivative of Eq. 2 and substituting the corresponding RME (details in the Supplementary Information, section C). In this case, we obtain (recall that f = x): [...] with [...] where the prime notation indicates the derivative with respect to the selective variable f. The dynamics of the marginal fitness distribution in the symmetric (Eq. 9) and asymmetric (Eq. 11) landscapes display significant differences, which are discussed in detail below.

Discussion
In this work, we have considered both symmetric (Fig. 1, panel b) and asymmetric (Fig. 1, panel c) fitness landscapes. Both cases display selective degrees of freedom (namely x and r), and neutral degrees of freedom (namely y and θ), which are entwined by a general redundancy-fitness trade-off. However, the different nature of the trade-off generates differences that are detectable at the level of the marginal fitness dynamics. Here we shall discuss the consequent analogies and differences, as well as their practical implications.
Contrary to their non-redundant counterpart (Fig. 2), we have shown that redundant landscapes display a dual behaviour, depending on the level of description of the dynamics: full phenotype distributions exhibit survival-of-the-fittest patterns (Fig. 3, panels a-d), where most of the population lies in proximity of the landscape optimum; on the other hand, their corresponding marginal fitness distributions may exhibit sub-optimal patterns (Fig. 3, panels e-f), where most of the population displays less fit but more redundant traits (Fig. 4).
For triangular geometry, we have calculated the marginal fitness distribution (Eq. 6) and the average fitness value (Eq. 7), in the weak selection approximation. We observe that the above formulae provide a good estimate of the state of the system up to δ ≈ 30, above which they break down due to second-order selective effects (for details, see Supplementary Material, section C and Supplementary Figure 2). This approximation might also be used as a baseline result to measure the geometric deviations of a landscape from the triangular shape.
Acknowledging this duality of behaviours can help advance the fields of evolutionary epidemiology [48,49] and cancer dynamics [50,51], where pathogens are modelled as phenotype-structured populations, and the information on the state of the distributions can be used to design treatment policies.
For example, in a viral or bacterial population, suppose that x quantifies the resistance to a drug or antibiotic, so that larger x confers higher fitness to its carriers [52]. Then, one might expect the population to be dominated by individuals with highest resistance (i.e. optimal fitness), and a therapy would [...] if such a selective trait is entwined with another, neutral one (i.e. not affecting the resistance) via a redundancy-fitness trade-off, then the distribution will very likely be dominated by individuals with sub-optimal resistance, and the therapy would erroneously target non-redundant traits, with the possibility of unwittingly helping sub-optimal strains to mutate and become fitter.
On the other hand, suppose that an experimentalist measures the growth rates in a rapidly mutating population as a function of x, and obtains a profile similar to panels e-f of Fig. 3, with a peak in the distribution at an intermediate value x = x̄ with 0 < x̄ < 1. Then they might erroneously conclude that x̄ confers the optimal fitness value, whereas, in fact, the trait x̄ dominates the population due to its redundancy, rather than due to a selective advantage. In [...] where the interplay between neutrality and selection would be described by a modified 'mutational operator' M_eff[N(f; t)] and/or a modified 'effective fitness' function F_eff(f) (which is also similar to the case of slowly mutating populations). However, the above effective formulation is not general, and is not appropriate unless the landscape is symmetric.
In this work we have derived the marginal fitness dynamics by explicit integration over the landscape's neutral degrees of freedom. In the symmetric landscape, marginalisation leads to a new drift term (∂/∂f) v(f), where v(f) plays the role of a velocity field pushing individuals away from the optimum. This contribution is referred to as a 'mutational entropy', biasing mutations due to the redundancy of the landscape [25,27]. Thus, the marginal dynamics Eq. 9 is consistent with the effective RME formulation Eq. 13, with: [...] being the new effective mutational operator.
However, in asymmetric landscapes with generic boundary profile B(x), marginalisation generates contributions of a different nature. In Eq. 11, mutations and competition are still captured by, respectively, a local diffusion term and a replicator term. However, marginalisation generates the new contributions F_1(f; t) and F_2(f; t). The magnitude of such terms depends on the landscape's geometry, that is on the slope B′(f) and curvature B″(f) of the boundary profile.

Moreover, from Eq. 12 we observe that these contributions depend on the full phenotype distribution n(f, y; t), thus making the marginal dynamics Eq. 11 an inhomogeneous differential equation. Since the effective formulation Eq. 13 relies on homogeneous differential equations, it cannot be equivalent to the inhomogeneous Eq. 11 derived by marginalisation. Therefore, neutral contributions deriving from asymmetric landscapes cannot be identified as 'effective operators' acting on the fitness level of description.
This imposes severe limitations on the utility and exactness of effective formulations for phenotype-structured populations. Indeed, our calculations have shown that solving the high-level fitness dynamics still requires knowledge of the underlying low-level trait details, and that this issue will occur whenever asymmetries in the trait space are present.
The new terms due to asymmetry, F_1(x; t) and F_2(x; t), have the appearance of effective source contributions to the dynamics, analogous to a spontaneous generation of individuals, if interpreted in the context of a lower-dimensional (non-redundant) fitness landscape. Note that the marginal one-dimensional profiles, shown in Fig. 3 panels e-f, display a non-zero gradient at the boundaries of the fitness domain, which would require a flux to be present in a truly one-dimensional system. This feature cannot be present in profiles generated by one-dimensional RME models, due to the physical constraints (as, we recall, the total population size is conserved and the system has no-flux boundary conditions), unless they are introduced ad hoc. We call these emerging sources effective because they are generated by the asymmetry in the neutral degrees of freedom, which are unobserved at the marginalised fitness level.

Conclusions
In this work, we have investigated the RME dynamics of phenotype-structured populations on minimally redundant landscapes. This kind of dynamics is widely employed in many biological (and other) research areas: population genetics [56], pathogenic evolution [52,57,58], RNA evolution [25], game theory [42,59], language evolution [60]. Its application depends on the identification of rapidly mutating quantitative traits, responsible for phenotypic heterogeneity in the individuals composing the population. Examples of such traits are cytotoxic-drug resistance [61], pathogenic virulence [52,58] and transmission [57], antigenic types [62,63] and hosts' resistance to infection [64]. [...] between transmission and virulence [69], that, in fitness terms, might relate to trade-offs akin to the redundancy-selection one.
Asymmetric landscapes also emerge whenever the phenotype space effectively available is bounded by Pareto-like fronts, outside of which lie all those phenotypic configurations that long-term evolution has excluded, due to their systematic inefficiency [70,71]. Such trait spaces have been proposed to explain observed patterns in gene regulation [72] and bacterial growth [73]. Triangular-shaped landscapes, which have been used here to facilitate calculations, have actually been observed in animal morphology [74,75,76]. In game theory, triangular geometries also characterise three-strategy games [77], and have recently been observed to emerge in a numerical study of a rapidly mutating version of the Ultimatum Game [78]. Ultimately, the experimental quantification of the landscape's asymmetries in the neutral directions is as important as that of selective traits.

In our theoretical work, selection has been introduced by explicitly considering a fitness landscape F, and an arbitrary competition rate γ. However, in applied contexts, the fitness landscape emerges from the mechanistic interactions associated with the quantitative trait under analysis, whose measurable parameters combine to form effective competition rates [79,80]