Specialisation of ribosomes in gonads through paralog-switching

Ribosomes have long been thought of as homogeneous, macromolecular machines but recent evidence suggests they are heterogeneous and their specialisation can regulate translation. Here, we have characterised ribosomal protein heterogeneity across 5 tissues of Drosophila melanogaster. We find that testis and ovary contain the most heterogeneous ribosome populations, and that specialisation in these tissues occurs through paralog-switching. For the first time, we have solved structures of ribosomes purified from in vivo tissues by cryo-EM, revealing differences in precise ribosomal arrangement for testis and ovary 80S ribosomes. Differences in the amino acid composition of paralog pairs and their localisation on the ribosome exterior indicate paralog-switching could alter the ribosome surface, enabling different proteins to regulate translation. One testis-specific paralog-switching pair is also found in humans, suggesting this is a conserved site of ribosome specialisation. Overall, this work allows us to propose possible mechanisms by which ribosome specialisation can regulate translation.


Protein synthesis is essential across the tree of life and undertaken by the highly conserved macromolecular complex of "the ribosome". mRNA translation is regulated at many levels, but until recently the ribosome itself was not thought to be part of this control system. Recent studies have suggested that ribosomes can contribute to gene expression regulation, through specific changes in their composition, i.e. specialisation [1][2][3]. These specialised ribosomes are thought to contribute to the translation of specific mRNA pools, but the mechanism by which this takes place is yet to be understood.

Previous analysis in a variety of organisms (mouse [1], yeast [4], and humans [5]) has shown that the composition of ribosomes is not homogeneous. In fact, specialisation of ribosomes is thought to be able to occur through a) additional protein components

specifically RP paralogs has been reported to vary in a tissue specific manner. To profile potential differences in expression in D. melanogaster we analysed publicly available RNA-Seq data across various developmental time points and tissues. Hierarchical clustering of RP mRNA abundances across these different biological samples reveals variations in expression of RP mRNAs between tissues, with a cluster of RPs with much higher expression in the testis compared to other tissues (Fig 1A). This includes RpL22-like, a paralog of RpL22 previously reported as a testis specific ribosomal protein [7]. These results suggest the presence of testis-specific translational machinery.

To determine whether these different RPs are translated and incorporated into ribosomes we assessed the protein composition of ribosomes from these same tissues and cells: testes, ovaries, heads (mixture of male and female), embryos (0-2hr) and S2 cells (derived from embryo). Ribosomal complexes were purified using sucrose gradients and ultracentrifugation (Fig 1B). Both 80S and polysome complexes were isolated. The relative amounts of ribosomes existing as 80S or polysome

To understand differences in ribosome composition between the tissues, protein abundances of ribosomal proteins were subjected to hierarchical clustering (Fig 1D). A cluster of proteins emerged that was enriched in the testis 80S ribosomes compared to 80S ribosomes from other tissues. This cluster included RpL22-like, RpL37b, RpS19b, RpS10a, RpS28a and RpS15Ab. There was also an ovary 80S enriched cluster of ribosomal proteins: RpL24-like, RpL7-like and RpL0-like (Fig 1D). PCA of protein abundances by ribosomal protein revealed that the majority of RPs (75/93) group together, suggesting they are incorporated in all ribosomes. RpL22-like, RpL37b, RpS19b, RpS10a and RpS28a cluster together, as their incorporation pattern across the different tissues is similar, driven mainly by their differential presence in testis 80S (Fig 1E, inset). The same can also be seen in the ovary enriched proteins (Fig 1E, inset).
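As an illustration of the clustering step, the following minimal Python sketch groups RPs by their abundance profile across samples using average-linkage agglomerative clustering. It is not the analysis performed here (which was carried out in R); the abundance values are invented, and the log2 scaling to each protein's mean, the Euclidean distance, the linkage method and all function names are assumptions made for the example.

```python
from itertools import combinations
from math import log2, sqrt


def scale(profile):
    """Express each abundance as a log2 ratio to the protein's mean across samples."""
    mean = sum(profile) / len(profile)
    return [log2(value / mean) for value in profile]


def distance(a, b):
    """Euclidean distance between two scaled profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters (mean pairwise distance) until n_clusters remain."""
    scaled = {name: scale(values) for name, values in profiles.items()}
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
            mean_dist = sum(distance(scaled[a], scaled[b]) for a, b in pairs) / len(pairs)
            if best is None or mean_dist < best[0]:
                best = (mean_dist, i, j)
        _, i, j = best
        clusters[i] += clusters[j]  # j > i, so index i is unaffected by the deletion
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical abundances; columns: testis, ovary, head, embryo, S2 cells.
toy_profiles = {
    "RpL22-like": [90, 10, 8, 9, 10],
    "RpS19b": [80, 12, 10, 11, 9],
    "RpL24-like": [5, 70, 6, 8, 7],
    "RpL7-like": [6, 60, 5, 7, 6],
    "RpL3": [50, 52, 49, 51, 50],
    "RpS3": [48, 50, 51, 49, 52],
}

clusters = average_linkage(toy_profiles, 3)
```

With these toy values the testis-enriched, ovary-enriched and uniformly incorporated proteins fall into three separate clusters, mirroring the kind of grouping described for Fig 1D.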

When ribosomal protein abundances are plotted between different 80S complexes, we report that the largest differences are from paralogs rather than canonical RPs (Fig 1F & G). Comparison of testis 80S and head 80S shows 6 paralogs (RpL22-like, RpL37b, RpS19b, RpS10a, RpS28a and RpS15Ab) are highly enriched in the testis 80S compared to head (Fig 1F), whilst RpS11 is enriched in the head 80S. Comparison of testis 80S and ovary 80S reveals that whilst the majority of ribosomal proteins correlate between the two gonads, the same paralogs enriched in testis 80S compared to head were also enriched compared to ovary (Fig 1G). RpL24-like, RpL7-like, RpL0-like and RpS5b are all far more abundant in ovary 80S ribosomes than in the testis (Fig 1G). Overall, specialisation seems most common in the gonads and we identify both testis- and ovary-enriched paralogs.
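A pairwise comparison of this kind can be sketched as a log2 fold-change filter. The snippet below is only an illustrative Python sketch: the abundance values are invented, and the function name and the two-fold (log2 >= 1) cut-off are assumptions rather than the thresholds used in this study.

```python
from math import log2


def enriched(abund_a, abund_b, min_log2_fc=1.0):
    """Proteins with log2(a / b) >= min_log2_fc, ordered from most to least enriched."""
    hits = {}
    for protein in abund_a.keys() & abund_b.keys():
        a, b = abund_a[protein], abund_b[protein]
        if a > 0 and b > 0:  # skip undetected proteins to avoid log of zero
            fold_change = log2(a / b)
            if fold_change >= min_log2_fc:
                hits[protein] = fold_change
    return sorted(hits, key=hits.get, reverse=True)


# Hypothetical normalised abundances in two 80S samples.
toy_testis = {"RpL22-like": 800, "RpS19b": 400, "RpL3": 1000, "RpS11": 300}
toy_head = {"RpL22-like": 50, "RpS19b": 100, "RpL3": 1050, "RpS11": 900}

testis_enriched = enriched(toy_testis, toy_head)
head_enriched = enriched(toy_head, toy_testis)
```

Running the comparison in both directions separates proteins enriched in either sample from those, such as the canonical RpL3 in this toy example, whose abundance is essentially unchanged.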

Ribosomal protein paralogs contribute to ribosome heterogeneity

There are 13 pairs of RP paralogs in the D. melanogaster genome and from our TMT data we can see the majority are both expressed and incorporated into 80S ribosomes in at least one of the analysed tissues or the developmental time point of the embryo. Hierarchical clustering of these paralogs re-emphasises the existence of gonad specific ribosomal complexes (Fig 2A). To understand the relationship between the two paralogs of each pair, we used the mass spectrometry data to quantify the relative abundances of the two paralogs within each matched pair in the various tissues. Interestingly, for the majority of these proteins one of the paralogs seems to be dominant in terms of its presence in 80S ribosomes (Fig 2B). Strikingly, the testis differs in composition the most when compared to the other samples (Fig 2A & B). In total, we find ~60% of testis 80S ribosomes contain RpL22-like rather than RpL22. These patterns were seen with both TMT experiments (Sup 2A). For 5 paralog pairs the second paralog is most abundant in the testis, and low in other samples (RpL22-like, RpL37b, RpS19b, RpS10a, RpS28a); we term these 'testis-enriched paralogs'. A similar situation is seen for 4 paralog pairs where the second paralog is most abundant in the ovary (RpL24-like,

Differences in ribosome composition are mainly the result of selective protein incorporation

To understand the expression of RP paralogs, we analysed mRNA-Seq levels of each of the paralog pairs (Sup 2B). When relative paralog pair expression is profiled as a percentage on the basis of RNA-Seq, it is clear that differences in protein composition of ribosomes are not purely driven by transcriptional control of paralog genes (Sup 2B). We directly compared RP RNA expression (RNA-Seq) and RP protein incorporation (ribosome-TMT), identifying cases where RP incorporation into the ribosome does not correlate with mRNA expression level (Fig 2C). Specifically, RpL24-like is transcribed across all tissues at substantial levels (Sup 2B) and there is no difference in mRNA level between ovary and testis. However, RpL24-like is far more abundant in ovary 80S than testis 80S (Fig 2C) and is only represented in ovary 80S ribosomes (Fig 2B). RpL34a is very lowly incorporated into all ribosomes (Fig 2B) but its mRNA is expressed across tissues at substantial levels (Sup 2B). The opposite is true for RpS15Ab, whose RNA levels are similarly low in testis and ovary but which is preferentially incorporated into testis 80S (Fig 2C). RpL7-like is expressed at the RNA level broadly in substantial amounts (Sup 2B) but is only incorporated into ribosomes at very low levels compared to RpL7 (<10%). Of note, the differential incorporation of RpL22-like into testes ribosomes compared to ovaries is driven by a transcriptional difference between the two tissues (Fig 2C, Sup 2B).
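The comparison between transcript level and ribosome incorporation can be sketched as follows. This is a hedged Python illustration only: the numbers are invented, and the function names and the 25 percentage-point gap used to flag discordance are assumptions, not values taken from this work.

```python
def paralog_share(canonical, paralog):
    """Percentage of a paralog pair's total signal contributed by the second paralog."""
    total = canonical + paralog
    return 100.0 * paralog / total if total else 0.0


def discordant_pairs(rna, protein, min_gap=25.0):
    """Pairs whose paralog share differs between RNA and ribosome by >= min_gap points."""
    flagged = {}
    for pair in rna.keys() & protein.keys():
        rna_share = paralog_share(*rna[pair])
        protein_share = paralog_share(*protein[pair])
        if abs(rna_share - protein_share) >= min_gap:
            flagged[pair] = (rna_share, protein_share)
    return flagged


# Hypothetical (canonical, paralog) abundances for one tissue.
toy_rna = {"RpL22/RpL22-like": (40, 60), "RpL7/RpL7-like": (60, 40)}
toy_protein = {"RpL22/RpL22-like": (40, 60), "RpL7/RpL7-like": (95, 5)}

flagged = discordant_pairs(toy_rna, toy_protein)
```

In this toy case the first pair shows matching RNA and protein shares, consistent with transcriptional control, whereas the second pair is flagged because the paralog is well represented at the RNA level but poorly incorporated into ribosomes.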

There is conflicting evidence as to the functionality or translational activity of monosomes (80S ribosomes): some studies suggest that these ribosomes are actively translating [30], whilst others suggest that not all 80S ribosomes are engaged in active translation [31]. To determine if there was any difference in ribosome composition between monosomes and polysome complexes, we compared the two by TMT. In general, there is very little difference in RP composition between 80S ribosomes and polysomes, e.g. testis (Fig 3A), head (Sup 3A). However, there are two paralogs enriched in the ovary 80S compared to the ovary polysome, RpL7-like and RpL24-like (Fig 3B). Such a large enrichment of these two paralogs in 80S complexes suggests that they potentially represent ribosome complexes whose activity is being regulated. Therefore, these 80S complexes may not be as translationally active as the polysome complexes. When the composition of ovary and testis polysomes is compared we identify 6 testis-enriched RPs, which are all paralogs: RpL22-like, RpL37b, RpS19b, RpS10a, RpS28a and RpS15Ab (Fig 3C). In fact, these are the same proteins enriched in testis 80S compared to ovary 80S (Fig 1G). In this comparison we also identify a group of proteins slightly enriched in the ovary polysomes: RpL37a, RpL22, RpS5b, RpL0-like and RpL40 (Fig 3C). Compared to the testis paralogs this fits well with the paralog switching between RpL37a/b, RpS5a/b and RpL22/RpL22-like. When the relative composition of polysomes for paralog pairs was determined the overall pattern was similar to 80S (Sup 3B). Differential incorporation within paralog pairs (Fig 3D) highlights that the main differences between 80S and polysomes are associated with ovaries, and involve RpL24/24-like and RpL7/7-like.

Cryo-electron microscopy of testis and ovary ribosomes reveals a mechanism for inactivation of testis 80S ribosomes

To understand the molecular implications of the paralog switching events we identified by mass spectrometry, we sought to solve structures of different ribosome populations. Ribosomal complexes were isolated in the same way as was previously described for TMT by sucrose gradient centrifugation (Fig 1B), with an additional step to concentrate purified samples (see Methods). Imaging the sample by cryo-electron microscopy (cryo-EM) confirmed that the ribosome complexes were highly pure and concentrated (Sup 4A). Testis 80S ribosomes were applied to grids and a dataset containing ~47,000 particles was collected. Three-dimensional classification of this testis 80S dataset identified a single structurally distinct class of 80S ribosomes, which was refined to an average at 3.5 Å resolution (Fig 4A and Sup 4B). This provided a substantial improvement to the only other D. melanogaster ribosome cryo-EM average at 6 Å resolution, from embryos [29]. We performed a similar experiment with ovary 80S ribosome preparations, collecting a dataset containing ~200,000 particles, and resulting in an average at 3.0 Å resolution (

These averages allowed us to generate atomic models for testis and ovary 80S ribosomal complexes (Sup Table 1).

Comparison of the testis and ovary averages revealed that the main difference between them was at the P/E tRNA site (Fig 4A and B). While the ovary 80S average did not contain any densities in this region, the testis 80S average contained densities that did not correspond to a tRNA (Fig 4A, circle). As a comparison, the previously published D. melanogaster average contained densities for an E-tRNA and for elongation factor 2, neither of which is present in our averages. By combining information from the testis 80S structure and the corresponding TMT data, we identified this density to be CG31694-PA (Fig 4C), which is highly abundant in the testis 80S complexes (10,451 normalised abundance, see Methods; 54th most abundant protein in testis 80S). CG31694-PA is an ortholog of IFRD2, identified in translationally inactive rabbit ribosomes as being bound to P/E sites of ~20% of 80S ribosomes isolated from rabbit reticulocytes [32]. Strikingly, in the reticulocytes the presence of IFRD2 is always accompanied by a tRNA in a noncanonical position (termed Z site), whereas in the testis 80S average no tRNA was found in this region. In mammals IFRD2 is thought to have a role in translational regulation during differentiation. Differentiation is a key process during spermatogenesis within the testis, and in this context it is unsurprising to have found this protein in the testis 80S. CG31694-PA has considerable amino acid sequence conservation with IFRD2, 32%

Functional implications of paralog switching events in gonads

By mapping the paralog switching events onto our ribosome structures we identified three clusters of paralogs undergoing switching. 1) Paralogs within the small subunit, including RpS19a/b and RpS5a/b, map to the head of the 40S near the mRNA channel (Fig 5A & B). 2) Paralogs within the large subunit tend to be surface-exposed. Specifically, RpL22/RpL22-like and RpL24/RpL24-like locate towards the back of the ribosome (Fig 5C & D). 3) Paralogs that are located in ribosome stalks, RpLP0 and RpL10A, potentially interacting with the mRNA during translation (Fig 5E). Of note, the small subunit paralogs are close to the mRNA channel, pointing towards functional differences in mRNA selectivity of the ribosome.

By comparing the atomic models for testis 80S and ovary 80S, we identified differences between switched paralogs (Table 1). Specifically, the three paralogs with the greatest proportion (RpL22-like, 60% abundant in testis 80S; RpS19b, close to 50% abundant in testis 80S; and RpS5b, over 50% abundant in ovary 80S; Fig 2B) showed the largest differences in their atomic models (Fig 6A-F). Additionally, of the paralogs that do not switch between testis 80S and ovary 80S, RpS28b showed the largest differences (Fig 6G & H). This is probably due to its proximity to CG31694-PA (Fig 6I).

Comparing the amino acid sequences of each paralog pair it is possible to predict that they might contribute different functionality to the ribosome (

Unfortunately, the most different region between RpL22 and RpL22-like (i.e., the N-terminal region;

The paralogs we find switching in the gonads are localised in three clusters: a) the head of the 40S near the mRNA channel, b) the surface-exposed back of the large subunit and c) ribosome stalks, potentially interacting with the mRNA during translation. The position of these three clusters provides potential explanations of how specialisation is achieved, mechanistically. Differences in amino acid sequence and precise position of the testis and ovary specialised paralogs (Fig 6C-F) can potentially affect the interaction of the mRNA and the ribosome, specifically during initiation when 40S ribosomes are recruited to the 5' end of mRNAs. The back of the 60S, where RpL22 and RpL22-like are located, would provide an ideal site for additional protein factors to differentially bind to ribosomes containing these proteins. This is particularly true for this paralog pair, which has the lowest sequence identity of any pair, 45%. The termini of these proteins are likely to be dynamic given the lack of density for them in our structures. Our phylogenomic analysis suggests that the modulation of this part of the exterior ribosome surface is common across many organisms, and that the generation of paralogs has occurred independently three times for RpL22. Therefore, this potential mechanism might regulate the ribosome across many eukaryotes. Although paralogs are not conserved across a range of organisms, and many are limited to Drosophilids, there are many organisms with many RP paralog pairs, including human (19 pairs) and Arabidopsis (80 pairs). Therefore, these potential mechanisms of ribosome regulation could be conserved, if not the precise details.

The result we find here, that the gonads are important sites of ribosome heterogeneity and specialisation, further indicates how important mRNA translational regulation is in the testis and ovary. Many other testis-specific translation components exist to enable tight regulation such as eIF4-3 [24] and it is now clear that RP paralog switching also plays a part in this regulation.

The importance of the paralog-switching event between RpS5a and RpS5b has recently been functionally characterised in the Drosophila ovary [33]. Females without RpS5b produce ovaries with developmental and fertility defects, whilst those without RpS5a have no defects. RpS5b specifically binds to mRNAs encoding proteins with functions enriched for mitochondrial and metabolic GO terms in the ovary, suggesting ovary RpS5b containing ribosomes translate this specific pool of mRNAs [33]. It will be interesting to see how widespread this finding is for RpS5b, since this is a frequently switched paralog: we find that 50% of ovary 80S ribosomes contain RpS5b, whilst 45% of embryo 80S and 30% of testis 80S also contain RpS5b. It has been known for some time that mutations in RpS5a produce a Minute phenotype (including infertility), so it seems likely that these two paralogs both have biologically important roles in the fly. RpS5a and RpS5b have also been seen to exhibit tissue-specific expression in A. thaliana, in a developmentally regulated manner [13]. atRpS5a was suggested to be more important than atRpS5b during differentiation, because of its expression pattern, but the regulation mechanism remains elusive in A. thaliana.

The function of the RpL22 and RpL22-like paralog pair in Drosophila testis has been explored and it has been suggested that the two proteins are not functionally redundant in development or

Several of the RPs that have gonad specific paralog pairs (including RpS19, RpS5, RpS10, RpS28 and RpL22 [37,38]) have been linked with human diseases, specifically Diamond-Blackfan anemia and cancer (Table 2). Thus, it will be important to uncover their contribution to mRNA translation regulation and work in vivo using Drosophila could help understand how they contribute to the translation of specific mRNAs.

One of the few canonical RPs we found to be differentially incorporated was RpS11 in the head 80S ribosomes. RpS11 phosphorylation, in humans, has been found to be linked to Parkinson's disease [39] and higher levels of RpS11 correlate with poorer prognosis in glioblastoma patients [40]. Therefore, understanding RpS11 levels in Drosophila head could provide a route for future work dissecting the molecular mechanisms by which RP mutations result in human disease.

Altogether our data reveal ribosome heterogeneity occurs in a tissue specific manner. Paralog-switching events are most abundant in the gonads and our structural analysis has provided insights into how this switch might regulate translation mechanistically. Additionally, our evolutionary data suggest specialisation may represent a conserved mechanism of translation regulation across

Tissue harvest

~300 pairs of ovaries were harvested from 3-6 day old females in 1X PBS (Lonza) with 1 mM DTT (Sigma) and 1 U/µL RNAsin Plus (Promega) and flash frozen in liquid nitrogen. ~500 (rep 1) and ~1000 (rep 2) pairs of testes were harvested from 1-4 day old males in 1X PBS with 4 mM DTT and 1 U/µL RNAsin Plus and flash frozen in groups of ~10 pairs. ~500 heads (50:50 female:male, 0-4 days old) per gradient were isolated by flash freezing whole flies and subjecting them to mechanical shock to detach heads. Heads were passed through a 1 mm mesh filter with liquid nitrogen and transferred to a Dounce homogeniser for lysis. ~500 µL of 0-2 hour embryos/gradient were obtained from cages after pre-clearing for 2 hours. Laying plates comprised 3.3% agar, 37.5% medium red grape juice compound (Young's Brew) and 0.3% methyl 4-hydroxybenzoate, supplemented with yeast paste of active dried yeast (DCL) and dH2O. Embryos were washed in dH2O and embryo wash buffer (102.5 mM NaCl (Sigma), 0.04% TritonX-100 (Sigma)) and then flash frozen with minimal liquid. ~120 × 10⁶ cells/gradient were treated with 100 µg/mL cycloheximide (Sigma) for 3 minutes before

were not detected and therefore calculated to be 0%, several failed to pass our standard thresholds but were included in this analysis for completeness. Analysis of TMT data and hierarchical clustering was performed in R.

For cryo-EM, 400 mesh copper grids with a supporting carbon lacey film coated with an ultra-thin carbon support film < 3 nm thick (Agar Scientific, UK) were employed. Grids were glow-discharged for 30 seconds (easiGlow, Ted Pella) prior to applying 3 µL of purified ribosomes, and

Table 1).

Image processing

Initial pre-processing and on-the-fly analysis of data was performed as previously described [43].

Image processing was carried out using RELION 2.0/2.

classification. Particles contributing to the best 2D class averages were then used to generate an initial 3D model. This 3D model was used for 3D classification, and the best 3D class or classes were 3D refined, followed by per-particle CTF correction and Bayesian polishing [47]. Post-processing was employed to mask the model, and to estimate and correct for the B-factor of the maps [48]. The testis 80S map was further processed by multi-body refinement, as previously described [49]. The final resolutions were determined using the 'gold standard' Fourier shell correlation (FSC = 0.143) criterion (Sup Table 1). Local resolution was estimated using the local resolution

A phylogeny of the metazoan Rpl22 family was generated with reduced sampling, using a taxonomically representative dataset containing 50 Rpl22 genes from 30 animals and S. cerevisiae. This dataset was aligned using the same four methods described above, and all alignments were judged to be mutually discordant (differences of 19-37%) using MetAl [59]. The MUSCLE alignment had the highest column-based similarity score assigned by norMD (0.702) and was selected for further analysis. As above, this alignment was trimmed using TrimAl's gappyout

show the testis atomic model fitted into the EM density. The models are rainbow colored from N-terminus (blue) to C-terminus (red). (B, D, F & H) show the comparison between the testis 80S (green) and the ovary 80S (red) atomic models. (I) Area around mRNA channel, which in testis 80S is occupied by an alpha-helix from CG31694-PA. S14b (dark green), S28b (light green) and S5a (blue) from testis 80S are nearby. Ovary 80S paralogs (S14b, S5b and S28b) are superimposed, in red. The main differences between the PDB models, circled, are in regions close to CG31694-PA.