Structural Basis of Mammalian Respiratory Complex I Inhibition by Medicinal Biguanides

The molecular mode of action of metformin, a biguanide used widely in the treatment of diabetes, is incompletely characterized. Here we define the inhibitory drug-target interaction(s) of a model biguanide with mammalian respiratory complex I by combining cryo-electron microscopy and enzyme kinetics. We explain the unique selectivity of biguanide binding to different enzyme states. The primary inhibitory site is in an amphipathic region of the quinone-binding channel and an additional binding site is in a pocket on the intermembrane space side of the enzyme. An independent local chaotropic interaction, not previously described for any drug, displaces a portion of a key helix in the membrane domain. Our data provide a structural basis for biguanide action and enable rational design of novel medicinal biguanides.


Materials
All chemicals were purchased from Merck Millipore or Fisher Scientific unless otherwise stated. IM1761092 was synthesized as described (27) and provided directly by ImmunoMet Therapeutics in accurately weighed aliquots of the monochloride salt powder with a cited purity of 96.3%. We further confirmed the identity of the compound by proton NMR and direct infusion mass spectrometry (see below and Fig. S1), and determined the purity by quantitative NMR to be ~93% (see below and Fig. S1B). For biochemical and structural studies, IM1092 powder was dissolved according to its molecular mass to 50 mM concentration in DMSO or 10 mM concentration in gel filtration buffer containing DDM, as appropriate.

Chemical analyses of IM1092
Liquid chromatography mass spectrometry (LC-MS) was performed using an LCMS-8060 mass spectrometer (Shimadzu, UK) coupled to a Nexera UHPLC system (Shimadzu, UK). 1 pmol IM1092 was injected into a 15 μl flow-through needle and separation was achieved using a SeQuant ZIC-HILIC column (3.5 μm, 100 Å, 150 x 2.1 mm, 30 °C column temperature; Merck Millipore, UK) with a ZIC-HILIC guard column (200 Å, 1 x 5 mm). A flow rate of 200 μl min⁻¹ was used with mobile phases of A) 0.1% (v/v) formic acid in H2O and B) 0.1% (v/v) formic acid in acetonitrile. A gradient program of 0-0.1 min, 80% B; 0.1-4 min, 80-20% B; 4-10 min, 20% B; 10-11 min, 20-80% B; 11-15 min, 80% B was used, with separation over the first 5 min and column cleaning and equilibration from 5-15 min. The mass spectrometer was operated in positive ion mode and analytes were detected as an untargeted scan (100-1000 m/z). After subtracting the background from a sample-free trace, only a single peak was observed, in which the major species had a mass/charge of 366 Da; the calculated molecular weight of IM1092 is 365.60. IM1092 was also analyzed by infusion MS and MS/MS using a QTrap (AB Sciex) by diluting IM1092 to 5 μM in 50% acetonitrile and 0.1% trifluoroacetic acid and infusing using a syringe at 3 μL min⁻¹. In full scan mode, the major peak was found to have a mass of 366.40 Da, consistent with the parent IM1092 compound, and MS/MS fragmentation with a collision energy of 40 eV produced masses consistent with fragmentation of IM1092 (Fig. S1C).
IM1092 was analyzed by proton NMR in DMSO-d6 using a Bruker AVIIIHD spectrometer with BBFO SmartProbe, operating at a nominal proton frequency of 400 MHz and a calibrated sample temperature of 298 K. The chemical shift axis was referenced to residual DMSO-d5 at 2.508 ppm (corresponding to TMS = 0). 16 scans were acquired with a spectral width of 20 ppm, an acquisition time of 4.1 s (corresponding to 32k complex points), a 90-degree excitation pulse of 11.1 µs and a recycle delay of 14 s (measured to be greater than 7 times the longest T1 of any sample peak). The data were zero filled once and exponential multiplication with LB = 0.3 Hz was applied before Fourier transformation and phase and baseline correction. Quantification was performed relative to an external reference sample of 0.1% ethylbenzene, adjusted for the measured internal tube volumes in the reference and sample tubes by quantitative deuterium NMR of the solvent signal. The measured concentration was 3.99 mM, against a concentration of 4.31 mM calculated from the weighed sample (1.15 mg) and solvent (0.78956 g) masses, implying a purity of 93%. The proton NMR spectrum is consistent with the presented chemical structure and with the described shifts for this compound (27). The shifts observed here were ¹H NMR (DMSO, 400 MHz): δH 7.85 (1H, d, J = 8.1 Hz), 7.51 (1H, s), 6.99 (1H, d, J = 7.9 Hz), 3.40-3.31 (2H, m), 2.80-2.75 (2H, m) (Fig. S1B).
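The purity estimate above is the ratio of the NMR-measured concentration to the concentration expected from the weighed masses. A minimal sketch of that arithmetic follows; the function name `qnmr_purity` is ours, for illustration only, and the two input values are those quoted in the text.

```python
def qnmr_purity(measured_mM: float, expected_mM: float) -> float:
    """Purity as the ratio of measured to gravimetrically expected concentration."""
    if expected_mM <= 0:
        raise ValueError("expected concentration must be positive")
    return measured_mM / expected_mM


# Values quoted in the text: 3.99 mM measured, 4.31 mM expected from weighed masses.
purity = qnmr_purity(3.99, 4.31)
print(f"purity = {purity:.1%}")
```

Rounded to the nearest percent, this ratio reproduces the 93% quoted above.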
Cellular oxygen consumption measurements
HEPG2 C3A (ATCC CRL10741) cells were a kind gift from AstraZeneca and were authenticated by STR profiling (Eurofins). Cells were grown in Dulbecco's modified Eagle's medium (DMEM, Thermo Fisher Scientific) containing 10 mM glucose supplemented with 10% fetal bovine serum (Thermo Fisher Scientific) at 37 °C in 5% CO2. Per well, 2×10⁴ cells were plated into an XFe96 (Agilent) cell plate and incubated for 24 h at 37 °C in 5% CO2. The medium was exchanged for assay buffer containing DMEM, 10 mM glucose, 1 mM pyruvate, 2 mM glutamine (Agilent), and the cells placed in a CO2-free incubator at 37 °C for 60 min. OCRs (oxygen consumption rates) were measured in an XFe96 Seahorse extracellular flux analyzer; 2 μM rotenone was used to inhibit complex I at the end of the assay. To calculate the normalized rotenone-sensitive OCRs, the rotenone-insensitive rates (determined at the end of the experiment) were subtracted, and the traces normalized to 100% at the measurement before IM1092 addition. Data at each timepoint were then normalized relative to the control value to account for gradual changes in the control rate during the assay. IC50 values were calculated using the normalized rates, following 6 hours of incubation with IM1092.
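The three normalization steps described above (subtract the rotenone-insensitive rate, scale to 100% at the pre-addition measurement, then divide by the control at each timepoint) can be sketched as follows. This is an illustrative sketch, not the analysis code used here: the function names, the list-based trace layout, and the example numbers are all assumptions, and each trace is assumed to exclude the final post-rotenone measurement.

```python
def rotenone_sensitive(trace, rotenone_rate):
    """Subtract the rotenone-insensitive rate from every timepoint."""
    return [r - rotenone_rate for r in trace]


def normalize_to_baseline(trace, baseline_index):
    """Scale a trace so the measurement before compound addition equals 100%."""
    base = trace[baseline_index]
    return [100.0 * r / base for r in trace]


def normalized_ocr(treated, treated_rot, control, control_rot, baseline_index):
    """Return the treated trace as a percentage of the control at each timepoint."""
    t = normalize_to_baseline(rotenone_sensitive(treated, treated_rot), baseline_index)
    c = normalize_to_baseline(rotenone_sensitive(control, control_rot), baseline_index)
    return [100.0 * ti / ci for ti, ci in zip(t, c)]


# Hypothetical traces (arbitrary OCR units); index 0 is the pre-addition measurement.
result = normalized_ocr([110, 60, 35], 10, [210, 190, 170], 10, baseline_index=0)
print(result)
```

Dividing by the control at each timepoint is what compensates for drift in the untreated wells over the course of the assay.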
Preparation of bovine mitochondrial membranes
Fresh cow hearts were purchased from an abattoir (C. Humphreys and Sons, Chelmsford, U.K.). They were chopped into 2-inch cubes on site and immediately placed into cold buffer for transport. For the single heart used for the inhibitor-free complex I dataset the buffer (AT buffer) contained 10 mM Tris-HCl (pH 7.4 at 4 °C), 75 mM sucrose, 225 mM sorbitol, 1 mM EGTA, and 0.1% (w/v) fatty acid-free bovine serum albumin. For the two hearts used for the biguanide-inhibited dataset it contained 10 mM Tris-HCl (pH 7.55 at 20 °C), 250 mM sucrose and 0.2 mM EDTA. Mitochondria for the biguanide-inhibited dataset were prepared as described previously (60), whereas for the inhibitor-free dataset the tissue was minced and then blended in the Waring blender (15 s) in AT buffer (the addition of 2 M Tris base was omitted) and the centrifugation steps were modified (1,000 x g for 10 min, 4 °C, the supernatant filtered through muslin then re-centrifuged at 20,000 x g for 27 min, 4 °C). The mitochondria were frozen for storage either as pellets or following resuspension to 10 mg-protein mL⁻¹ in 20 mM Tris-HCl (pH 7.55 at 20 °C), 1 mM EDTA and 10% (v/v) glycerol. To prepare membranes, mitochondria at 5 mg-protein mL⁻¹ in resuspension buffer were sonicated with a Qsonica (3 x 5 s bursts at 65% amplitude) then centrifuged at 74,000 x g for 1 h, 4 °C (23) and the membranes resuspended to ~10 mg-protein mL⁻¹ in the same buffer. Larger scale preparations for kinetic assays started with the tissue from eight hearts in 10 mM Tris-HCl (pH 7.55 at 20 °C), 250 mM sucrose and 0.2 mM EDTA, and membranes were prepared using a Waring blender (60, 61).
Preparation of bovine complex I
Complex I for cryo-EM analyses was prepared by a method adapted from that of Jones et al. (62). Starting from ~50 mg membrane protein, membranes were diluted in resuspension buffer, solubilized in 1% n-dodecyl β-D-maltoside (DDM, Glycon), centrifuged at 80,000 x g for 20 min and filtered (0.2 µm). The complexes were then separated by ion-exchange chromatography (Q-Sepharose, GE Healthcare, as described previously (62)) and complex I-containing fractions pooled and concentrated using a 50 kDa MWCO spin column (Merck Millipore). The samples were clarified using a Spin-X 0.22 µm centrifugal filter (Corning) then applied to a Superose 6 Increase 5/150 (inhibitor-free dataset) or 10/300 column (biguanide-inhibited dataset) (GE Healthcare) and eluted in 20 mM Tris-HCl (pH 7.14 at 20 °C), 150 mM NaCl and 0.05% DDM. Complex I elutes as a monodisperse peak at ~1.6 or 13 mL from the Superose 6 Increase 5/150 or 10/300 columns, respectively (Fig. S3). The peak fractions for the biguanide-inhibited dataset were concentrated using a 50 kDa MWCO spin column (Merck Millipore) and the protein concentration quantified using a NanoDrop UV-vis spectrophotometer (ε280 = 0.2 mg⁻¹ mL); IM1092 was then added from a 5 mM stock solution in the same buffer to a final concentration of 350 µM for a ~15 min incubation at 4 °C before grid freezing. Both final protein concentrations were ~3 mg mL⁻¹. Complex I for enzyme kinetics was prepared using the same method but starting with ~300 mg membrane protein and a first centrifugation of 8,500 x g for 12 min; the purified protein was concentrated to ~20 mg mL⁻¹ and frozen in the presence of 30% glycerol for storage at -80 °C. 10 µg of purified complex I was reduced with 10 mM dithiothreitol and analyzed on a Novex Tris-Glycine 10-20% gel alongside Precision Plus Protein™ Kaleidoscope pre-stained protein standards (BioRad) according to the manufacturer's instructions.
Preparations of complex I for cryo-EM and for enzyme kinetics were judged to be >90% pure by SDS PAGE (Fig. S2).
Kinetic measurements
NADH:O2 oxidoreduction by membranes was measured at 10 µg-protein mL⁻¹ using 200 μM NADH and 1.5 μM horse heart cytochrome c (Sigma Aldrich) in aerated buffer containing 10 mM Tris-SO4 (pH 7.5 at 32 °C) and 250 mM sucrose or 50 mM Bis-tris propane (pH values as indicated). For assays with deactivated membranes at pH 9, 400 μM NADH was added because a catalytic lag of ~20 min was observed prior to achieving linear catalysis. Where appropriate, 1 μM antimycin A and 15 μg mL⁻¹ AOX (prepared as described previously (63)) were added to assays. NADH:decylubiquinone (NADH:DQ) oxidoreduction by complex I was measured at 0.5 µg-protein mL⁻¹ using 200 μM NADH and 200 μM DQ in buffer containing 0.075% soy bean asolectin (Avanti Polar Lipids), 0.075% CHAPS (Merck Chemicals) and 20 mM Tris-HCl (pH 7.5 at 20 °C). All assays were performed at 32 °C and initiated by addition of NADH. Rates were measured by linear regression of the maximal slope and quantified by the absorbance of NADH at 340-380 nm (ε = 4.81 mM⁻¹ cm⁻¹). IM1092 was added from DMSO stock solutions, with appropriate DMSO controls. IC50 values observed for the NADH:DQ assay are higher than for the NADH:O2 assay because the larger hydrophobic phase volume present in the NADH:DQ assay effectively dilutes the IM1092. To test the reversibility of IM1092 binding, aliquots of complex I (2.5 mg mL⁻¹) with or without 350 μM IM1092 were incubated at 4 °C for 15 min, then diluted into the NADH:DQ assay as described above. Preparation of deactive membranes was achieved by heating membranes at 10 mg mL⁻¹ at 37 °C for 20 min in the absence of NADH. Preparation of active membranes was achieved by incubating 'as prepared' membranes at 10 mg mL⁻¹ at 4 °C with 2 mM N-ethylmaleimide (NEM) for 1 h in the absence of NADH; residual activity is attributed only to the active population of the complex.
The sensitivity of membrane activity to NEM was determined to evaluate the active/deactive ratio by incubating the membranes (2 mg-protein mL⁻¹) with 2 mM NEM at 4 °C for 20 min then measuring the NADH:O2 activity relative to a DMSO control. The NEM-insensitivities were 86.4 ± 7.4 and 52.0 ± 0.4% (n = 3) for the membranes used for the inhibitor-free and biguanide-inhibited datasets, respectively. IC50 curves were fitted using log(inhibitor) vs. normalized response with variable slope in Prism 8.
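For orientation, the dose-response model named above can be written as a short function. This is a sketch under the assumption that the Prism model has its commonly documented form, Y = 100 / (1 + 10^((LogIC50 - X) × HillSlope)), with X the log10 of inhibitor concentration and a negative Hill slope for a descending inhibition curve. The function name and example values are illustrative only, and no fitting is performed here.

```python
def normalized_response(log_conc: float, log_ic50: float, hill_slope: float) -> float:
    """Normalized response (0-100%) for a variable-slope dose-response model.

    Assumed form: Y = 100 / (1 + 10 ** ((log_ic50 - log_conc) * hill_slope)).
    A negative hill_slope gives a response that falls as concentration rises.
    """
    return 100.0 / (1.0 + 10.0 ** ((log_ic50 - log_conc) * hill_slope))


# Hypothetical parameters: IC50 of 1 uM (log10 M = -6) and a Hill slope of -1.
for log_c in (-8, -7, -6, -5, -4):
    print(log_c, round(normalized_response(log_c, -6.0, -1.0), 1))
```

By construction the response is 50% when the inhibitor concentration equals the IC50, which is why the fitted LogIC50 can be read directly as the midpoint of the normalized curve.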
Nano-DSF and protein aggregation assays
Purified complex I was diluted at 4 °C to 100 μg mL⁻¹ or 2.5 mg mL⁻¹ in Tris-HCl pH corrected to either 7.14 or 7.96 at room temperature and IM1092 was added immediately prior to loading into capillaries for analysis in a Prometheus NT.48 (NanoTemper). The temperature was increased from 15 to 95 °C (3 °C min⁻¹). Protein stability was measured using the 350/330 nm ratio and protein aggregation was measured in back-reflection mode as scattering. In all cases, due to complexity in the traces, the non-derivatized data were used to discern an overall melting temperature (Tm) as the temperature at which the protein was overall 50% aggregated or denatured, by fitting to a Boltzmann sigmoid in Prism 8.
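The Boltzmann sigmoid used to define Tm can be sketched as below, assuming the standard four-parameter form Y = Bottom + (Top - Bottom) / (1 + exp((Tm - T) / Slope)). The function name, parameter names and example values are illustrative assumptions; the sketch evaluates the model only and does not reproduce the fitting.

```python
import math


def boltzmann(temp_c: float, bottom: float, top: float, tm: float, slope: float) -> float:
    """Boltzmann sigmoid: the signal is halfway between bottom and top at temp_c == tm."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - temp_c) / slope))


# Hypothetical 350/330 nm ratio trace parameters, sampled across the 15-95 degC ramp.
for t in (15, 35, 55, 75, 95):
    print(t, round(boltzmann(t, bottom=0.8, top=1.2, tm=55.0, slope=2.0), 3))
```

In this form the fitted `tm` is the temperature at which the transition is 50% complete, matching the definition of Tm given above.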

Confirmation of protein sequences
The complex I sample used for the biguanide-inhibited dataset was treated with iodoacetamide, precipitated with ethanol, then digested with chymotrypsin or trypsin in 50 mM NH4HCO3 and analyzed by LC-MS as described previously (64). Briefly, peptides were fractionated by liquid chromatography using a gradient of 5-40% acetonitrile in 0.1% (v/v) formic acid and analyzed on a Q-Exactive Plus Orbitrap mass spectrometer (Thermo Scientific). Data were acquired from 400 to 1600 m/z for precursor ions and the ten most abundant precursor ions fragmented by HCD in nitrogen. Peptide fragmentation data were searched against the mammalian sequences from SwissProt (January 2020) using MASCOT and allowing for missed cleavages (5 for chymotrypsin, 4 for trypsin) with error tolerant searches allowing for variable oxidation of methionine and cysteine carbamidomethylation. For subunit NDUFS2, a tryptic peptide with mass 1074.6662 Da exhibited a fragmentation pattern consistent with a Q>R substitution at position 129 compared to Uniprot sequence P17694 (Mascot score of 32, 95% confidence threshold of 15). In addition, a matching preparation from the same batch of mitochondria was analyzed by ESI for intact protein masses (64). A mass of 36,708.6 Da correlating to subunit NDUFA10 with a N>K substitution compared to Uniprot sequence P34942 (calculated substituted mass 36,706.88 Da) was observed, and a matching substitution at position 255 confirmed by error-tolerant searches of chymotryptic peptides (a mass 1583.8028 Da with a consistent fragmentation pattern as observed previously (65), Mascot score of 50, 95% confidence threshold of 26). These data explain cryo-EM density discrepancies observed at these two positions, so all models here contain Arg and Lys, respectively.
Mitochondrial DNA was extracted from 100 μL (1.8 mg-protein mL⁻¹) of the mitochondria used for the biguanide-inhibited dataset using a Qiagen DNeasy blood and tissue kit according to the manufacturer's instructions. The regions of the mitochondrial DNA that encode the seven ND subunits were amplified from 10 ng of template DNA by PCR (Q5 high fidelity polymerase, 30 cycles, 67.5 °C annealing temperature) using the primers GGTGCAACCGCTATCAAAGG and AAGCAGCTTCAATTCTGCCG for ND1-2 and CGACGGAGTTTACGGCTCAA and AGCTCCGTTTGCGTGTATGT for ND3-6. The products were treated with a Qiagen PCR cleanup kit and subjected to Sanger sequencing with the following primers: Sequencing data gave good coverage in both forward and reverse directions and uncovered eight nucleotide differences from the reference genome NC_006853.1, but only one led to an amino acid substitution (V411I in subunit ND5). Both the variant and reference sequence nucleotides were observed, suggesting either heteroplasmy or a difference between the two cow hearts in the sample. Therefore, the reference sequence was used for structure modeling.
Cryo-EM grid preparation, data acquisition and processing
PEG-thiol derivatized UltrAuFoil® gold grids (R 0.6/1, Quantifoil Micro Tools GmbH) were prepared and frozen with complex I as described previously (24) at 4 °C, using an FEI Vitrobot IV (Thermo Fisher) with blotting for 9.5 s at blot force setting -10. Screening images were taken using a Talos Arctica at 200 kV at 73k x magnification, -2.5 μm defocus with a 100 μm objective aperture and 50 μm C2 aperture, and a Falcon 3 detector in linear mode. The final datasets were collected from a single grid each using a Gatan K3 detector and a BioContinuum GIF energy filter (biguanide-inhibited dataset) or a K2 detector with a Quantum GIF energy filter (inhibitor-free dataset) mounted on an FEI 300 keV Titan Krios with a 100 μm objective aperture, 70 μm C2 aperture and EPU software version 2.6 (biguanide-inhibited dataset) or 2.4 (inhibitor-free dataset) at the UK National Electron Bio-Imaging Centre (eBIC). The energy filter was operated in zero-energy-loss mode with slit width 20 eV. Biguanide-inhibited data were collected in super-resolution mode at 1.072 Å pixel⁻¹ (nominal magnification 81,000×) with defocus range -1.0 to -2.4 μm in 0.2 μm increments and the autofocus routine run every 10 μm. The dose rate was 23.7 electrons Å⁻² s⁻¹ with 1.7 s exposure captured in 40 frames (total dose ~40 electrons Å⁻²). Data were acquired as one shot per hole in aberration-free image shift (AFIS) mode with 5 s delay after stage shift and 3 s delay after image shift. Inhibitor-free data were collected in counting mode at 1.056 Å pixel⁻¹ (nominal magnification 47,600×) with defocus range -1.5 to -3.1 μm in 0.2 μm increments and the autofocus routine run every 10 μm. The dose rate was 4.2 electrons Å⁻² s⁻¹ with 12 s exposures captured in 25 frames (total dose ~50 electrons Å⁻²). Data were acquired as one shot per hole with 10 s delay after stage shift.
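The total doses quoted above follow from dose rate multiplied by exposure time, and dividing by the number of frames gives the dose per frame. A minimal sketch of that arithmetic, using only values stated in the text (the function names are ours):

```python
def total_dose(dose_rate_e_per_A2_s: float, exposure_s: float) -> float:
    """Total electron dose (electrons per square angstrom) for one exposure."""
    return dose_rate_e_per_A2_s * exposure_s


def dose_per_frame(dose_rate_e_per_A2_s: float, exposure_s: float, n_frames: int) -> float:
    """Dose delivered in each movie frame, assuming equal-length frames."""
    return total_dose(dose_rate_e_per_A2_s, exposure_s) / n_frames


# Biguanide-inhibited dataset: 23.7 e/A^2/s for 1.7 s in 40 frames.
print(total_dose(23.7, 1.7), dose_per_frame(23.7, 1.7, 40))
# Inhibitor-free dataset: 4.2 e/A^2/s for 12 s in 25 frames.
print(total_dose(4.2, 12.0), dose_per_frame(4.2, 12.0, 25))
```

Both totals agree with the approximate values of ~40 and ~50 electrons Å⁻² given for the two datasets.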
The inhibitor-free dataset was processed using RELION-3.0 (66) (Fig. S5). First, beam-induced movement for 2,370 micrographs was corrected using RELION's implementation of MotionCor2, both with and without dose weighting. CTF estimations were taken from non-dose weighted micrographs using . Micrographs that contained non-vitreous ice were manually excluded. Model-based autopicking yielded a total of 56,908 particles, which were extracted from dose-weighted micrographs and CTF corrected with an amplitude contrast of 0.1. 2D classification yielded 46,507 particles, which were re-extracted and subjected to 3D classification with searches from 7.5° until 0.9°, yielding 32,668 good particles. Particles were 3D refined and subjected to Bayesian polishing with the first 12 frames, yielding a map at 3.3 Å, then classified with searches from 7.5 to 1.8° into two classes. Cycles of 3D refinement and CTF refinement (for beamtilt and per particle defocus, astigmatism and B-factor correction) produced two maps, one that resembled the previously described active state (22, 25) at 3.1 Å (18,231 particles), and one with poor density for subunit NDUFA11 and a more 'open' conformation that resembled the previously described slack state (22) at 3.5 Å (14,437 particles). All resolutions are defined using the FSC = 0.143 criterion.
The biguanide-inhibited dataset was processed using RELION-3.1 (66, 68) (Figs. S4, S7 and S8). First, beam-induced movement for 17,203 micrographs was corrected using RELION's implementation of MotionCor2, both with and without dose weighting. CTF estimations were made from non-dose weighted micrographs using CTFFIND4 (69). Micrographs that contained thick or non-vitreous ice were excluded by fitting the CTF up to 4 Å (to include any diffuse water ring) and removing those with a CTF figure of merit <0.04, an estimated resolution >6 Å, or outlying CTF astigmatism (<20 or >500), leaving 14,931 good micrographs. Model-based autopicking using a previous bovine complex I model picked a total of 907,090 particles that were extracted with down sampling to 2.41 Å/pix from dose-weighted micrographs and CTF corrected with an amplitude contrast of 0.1. 2D classification yielded 864,847 particles, which were re-extracted at 1.072 Å/pix. The output from 3D refinement was used as the input for crude 3D classification with an angular sampling interval of 1.8°, resulting in 671,489 good protein particles that refined to 2.3 Å. 3D refinement was run using --pad 1 and early-stage refinements also utilized --maxsig 2000 to limit the orientations considered and speed up the process. Particles were subjected to rounds of CTF and 3D refinement to correct for anisotropic magnification, beamtilt, trefoil and higher order aberrations, as well as per-particle defocus, astigmatism, B-factor-weighting and spherical aberration. As the images were obtained using AFIS, two scripts (optics_add.py and optics_split.py, https://github.com/afanasyevp/afis) were used to separate the data into 45 optics groups to allow correction for any residual relative beamtilt differences.
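The micrograph exclusion criteria described above amount to three threshold tests on the per-micrograph CTF estimates. The sketch below illustrates that logic only; it is not the procedure run in RELION. The `MicrographCtf` record, its field names, the astigmatism units (assumed to be Å) and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MicrographCtf:
    name: str
    figure_of_merit: float   # CTF fit figure of merit
    max_resolution: float    # estimated resolution of the CTF fit, in angstrom
    astigmatism: float       # defocus difference, units assumed to be angstrom


def keep_micrograph(m: MicrographCtf) -> bool:
    """Apply the three thresholds quoted in the text; True means the micrograph is kept."""
    return (
        m.figure_of_merit >= 0.04
        and m.max_resolution <= 6.0
        and 20.0 <= m.astigmatism <= 500.0
    )


# Hypothetical examples: one good micrograph and three failing one criterion each.
micrographs = [
    MicrographCtf("good", 0.12, 3.4, 150.0),
    MicrographCtf("low_fom", 0.02, 3.4, 150.0),
    MicrographCtf("poor_res", 0.12, 7.5, 150.0),
    MicrographCtf("astig_outlier", 0.12, 3.4, 900.0),
]
kept = [m.name for m in micrographs if keep_micrograph(m)]
print(kept)
```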
The data were globally classified into three classes containing 148,441 (22%), 292,296 (44%) and 230,746 (34%) particles respectively, all of which displayed a preferred orientation unusual for DDM solubilized complex I, and presumably induced by biguanide addition. Therefore, to improve particle orientation distributions, particles were filtered by rlnMaxValueProbDistribution using a script (part_angdist_eq.py, https://github.com/attamatti/orientation_equlization) with 1,000 bins and a standard deviation of 1.2. Particle motion was further corrected using Bayesian polishing during which the data were rescaled to 0.731 Å/pix. After removal of duplicates, a total of 598,297 particles were CTF-refined and refined together with solvent flattening to a resolution of 2.11 Å based on the FSC = 0.143 criterion. This map was highly fragmented in the distal membrane region because the particles comprised several distinct states (see below). Global classification back into three classes using an angular sampling of 0.47° yielded classes of 120,325 (20%), 287,942 (48%) and 190,030 (32%) particles, correlating to the previously described active, deactive and slack states (22,24,25).
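The class percentages quoted above can be checked directly from the particle counts. A minimal sketch follows; the function name is ours and the counts are those stated in the text.

```python
def class_percentages(counts):
    """Percentage of particles in each class, rounded to the nearest integer."""
    total = sum(counts)
    return [round(100.0 * c / total) for c in counts]


# First global classification, then the classification after polishing.
print(class_percentages([148441, 292296, 230746]))
print(class_percentages([120325, 287942, 190030]))
```

The second set of counts sums to the 598,297 particles retained after duplicate removal.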
Targeted subclassification
i) Quinone-binding channel (Fig. S7)
The preliminary deactive map (see above) was manually edited in Chimera (70) to remove density outside of the proximal membrane domain region. To generate this and subsequent masks for local classifications, RELION was used to apply a lowpass filter of 10 Å with extension of 10 pixels and a soft edge of 10 pixels, unless otherwise stated. Focused classification on the Q-channel region was performed without alignment using a T (regularization parameter) value of 100 using a mask generated from a molmap density of a crudely fitted IM1092 PDB molecule, generated in Chimera. Two major classes were identified with densities in the Q-channel, one with a density resembling IM1092 (70% of particles) and one with a long density feature (16%), plus two minor classes (8% and 5%) at lower resolution and without any obvious ligand densities occupying the Q-channel. The class containing the long density, likely endogenously-bound ubiquinone-10, was not investigated further. The particles in the major 70% class were reverted to include the density of the entire complex, followed by consensus refinements, and then global classification with angular searches at 0.47° into three classes corresponding to the active, deactive and slack states (48,367, 226,483 and 143,470 particles, respectively). The active class was not resolved further and is named the Active-1092-i state. ND6-TMH4 was poorly ordered in the slack class and a subclassification on the expected location of this helix allowed two minor classes to be discarded, although the density remained disordered. In the deactive and slack classes, the densities for subunits NDUFS2 (residues 53-60), ND3 (31-46), ND1 (62-64 and 207-216) and NDUFA9 (254-278, 186-198 and 323-334) were disordered, as observed previously for deactive complex I (22,24,25). Therefore, these regions were subjected to further subclassification without alignment and with a T value of 100, using a mask that was generated from a molmap of these loops from the preliminary map of the active state. The approach yielded three deactive classes (Deactive-1092-i to iii) and two major slack classes (Slack-1092-i and ii) (see Fig. S7).
ii) ND5 lateral helix (Fig. S8)
Particle subtraction and focused refinement of the distal membrane domain were performed in the same manner as for the proximal membrane domain (see above). This was followed by rounds of focused classification on the ND5 lateral helix without alignment, using a T value of 100. The mask for this was generated from a molmap of PDB 5O31, for the corresponding region of subunits ND5 and B15 found to be partially disordered here, generated without binary extension and with a soft edge of 5 pixels. Three classes were readily identified: one with an intact and well-ordered lateral helix and NDUFB4 loop (termed ordered), one in which the ND5 lateral helix is displaced and the NDUFB4 loop partially disordered (termed displaced), and one with disorder in both the ND5 lateral helix and NDUFB4 loop (termed disordered). The particles in each class were reverted to include the density of the entire complex, followed by consensus refinements, and then global classification with angular searches only at 0.47° to separate the active, deactive and slack populations. The classification thus yielded nine maps in total (see Fig. S8): Active-1092-ii, iii and iv and Deactive-1092-iv, v and vi (ordered, displaced and disordered, respectively), plus three slack classes, one with well-ordered NDUFB4, which were not investigated further because they all contain a poorly defined ND5 lateral helix from residue ~660 onwards.
Preparation of consensus and composite maps
To aid modeling, composite maps were generated to improve map quality in the distal membrane domain. The consensus maps were manually segmented into three sections, crudely comprising the hydrophilic, proximal membrane and distal membrane domains, followed by focused refinement. All masks used for solvent flattening, subtraction and post processing were generated using the molmap command in Chimera (70) from a near-complete model, low-pass filtered to 15 Å with a soft edge of 10 pixels. The final globally sharpened consensus and focused maps were corrected for the Modulation Transfer Function of the detector and sharpened with estimated B-factors in RELION. For the biguanide data, the maximum resolution for B-factor consideration (--autob_highres) was set to the highest resolution in the RELION localres calculation, to avoid over-sharpening. Local half map-based sharpening was applied to consensus maps using phenix.auto_sharpen with a window of 15x15x15 and overlap of 5, setting the resolution to the highest resolution observed in the RELION localres output. A near-complete model was fitted into the locally sharpened consensus map by using all-atom refine in Coot 0.9-pre (71) and this was further rigid body fitted into each of the focus refined maps using Chimera (70). The globally sharpened focused and consensus refinement maps for each class were then combined using phenix.combine_focused_maps to generate composite maps of much better clarity in the distal membrane domain than the globally sharpened or locally sharpened consensus maps (Fig. S9). Composite maps were carefully compared to the consensus refined map in Coot 0.9-pre (71) and Chimera (70) to ensure there were no gross map distortions, especially at map extremities, or artefacts at inter-map boundaries. Local resolution maps and Fourier shell correlation (FSC) curves from half maps for all maps presented are shown in Figs. S6, S10-15 and S19-24.
Model building
An initial bovine active-state model was created by mutating an earlier mouse active-state model (PDB 6ZR2) in Coot 0.9-pre using mutate residue range. The initial model was then rigid body fitted into the consensus locally sharpened Active-1092-ii map using phenix.real_space_refine, followed by all-atom refine in Coot 0.9-pre and manual adjustment of the model. The resulting model then served to align the three focus refined maps for the Active-1092-ii state to generate a composite using phenix.combine_focused_maps. The model was then refined by cycles of manual refinement in Coot 0.9-pre (71) and automated refinement using Phenix-1.18.2 (72), guided by both the composite and consensus locally sharpened maps. Phospholipids and DDM molecules were added manually, and water molecules and ions were added using Coot Find Waters and manual addition, and checked manually for coordination geometry and density fit. This model was then rigid body fitted into each of the other maps, and refined in the same manner, checking and adding waters, phospholipids and detergent as necessary. Automated real-space refinements were performed using Phenix-1.18.2 against the composite maps using 5 macro cycles of minimization_global and local_grid_search, followed by Atomic Displacement Parameter refinement. Secondary structure restraints were not used, and Ramachandran restraints were set to Oldfield for Favored, and Emsley8k for Allowed and Outlier, to prevent genuine outliers being forcefully twisted. Statistics for all models are presented in Tables S2-14. In every case, the EMRinger score was superior for the composite map compared to the globally sharpened or locally sharpened consensus map. For the inhibitor-free active model, the model for Active-1092-ii was treated in the same way but in this case the locally sharpened map was used for refinement and secondary structure restraints were applied. Masked map-model FSC curves were calculated using Phenix-1.18.2.
Unbiased searches for biguanide densities
As well as manual identification of biguanide-like densities, the ligand search function from Coot 0.9-pre was used to search for additional unassigned densities. The most promising 20 sites were identified using 50 ligand conformers, with the fractions for scoring and correlation set to 0.7, and assessed manually. In addition, difference maps were generated using the phenix.real_space_diff_map command using a near-final model without IM1092 coordinates, and by subtracting maps generated from the models by molmap in Chimera (70) from the globally sharpened consensus maps using relion_image_handler (66) (see . The difference maps were inspected for unassigned densities. A putative density adjacent to ND5-Tyr35 in the Active-1092-iii state that resembles IM1092 was not modelled because no feasible bonding interactions could be identified for it. Fitting of IM1092 into density was assessed by a combination of subjective explanation of the density and cross-correlation (C-Cmask) of the IM1092 molecule by using Phenix-1.18.2. Where C-Cmask was ~0.5 or poorer and the density was subjectively not well explained, IM1092 was not included in the final model.

Statistical methods
For kinetic and scattering assays, all data were recorded with at least three technical replicates and number of replicates and measures of error (95% confidence interval or S.E.M) are indicated, and errors were propagated where appropriate. For cellular oxygen consumption rates, 8-12 technical replicates were used and data are shown as mean values with standard deviations.

Supplementary Text
Background: Characteristics of the active, deactive and slack classes
Global classification of cryo-EM datasets for bovine complex I typically resolves three major classes (22,25), and the same behavior is observed here (Fig. S4). The classes are differentiated by both large-scale global features and by the status of individual structural elements. The active classes exhibit the most acute (closed) angle between the hydrophilic and membrane domains (Fig. S25A), well-ordered structures for all elements of the Q-binding site (including the ND3 TMH1-2, ND1 TMH5-6 and NDUFS2 β1-β2 loops), a specific orientation of NDUFS7-Arg77 and its adjacent loop, and α-helical ND6-TMH3. This class has been described as the 'active' state of mouse, bovine and porcine complex I (22,23,25,73) and as the 'closed' state of ovine complex I (21). The deactive classes, identified previously for the mouse, bovine and porcine enzymes (22-24, 44, 73), exhibit a less acute (more open) angle (Fig. S25A), poorly ordered loops in ND3 (residues 31-46), ND1 (62-64 and 207-216) and NDUFS2 (46-64), a different orientation of NDUFS7-Arg77 and adjacent β-strand, plus a π-bulge in ND6-TMH3. The slack classes also exhibit the same disordered loops, β-strand in NDUFS7 and π-bulge in ND6-TMH3, but large portions of subunit NDUFA11 and the C-terminal lateral helix of ND5 (residues 562-606) are also not resolved (22,25). Classes of ovine complex I referred to as 'open' states resemble both the deactive and slack states of the bovine enzyme (21).

Local-first classification
To classify different sub-states from the final set of selected particles (Fig. S4), we tested different orders of image processing steps. First, the particles were classified globally into the active, deactive and slack states, and then subjected to focused classification with a soft mask around the areas of interest. This is the simplest strategy and was followed previously to classify a dataset of the bovine enzyme in nanodiscs (22). We found that the resultant classes here lacked clarity in the density shapes within the classified regions. Therefore, an alternative 'local-first' strategy was tested in which the order of the classification steps was reversed. The resulting maps exhibited enhanced separation of differently shaped densities in the classified regions, which we ascribe to better signal-to-noise during classification over small density volumes. The local-first strategy we applied (Figs. S7 & S8) started from a consensus refinement of all the (heterogeneous) particles followed by: 1) particle subtraction for the domain of interest; 2) focused refinement on the domain of interest to improve alignment; 3) focused classification with a small, soft mask over the area of interest; 4) particle reversion to include density for the entire protein; 5) global classification.

Relative motions of classes in the biguanide-inhibited dataset
To improve on qualitative descriptions of the apparent angle between the hydrophilic and membrane domains as 'open' or 'closed', we used the positions of three Cα atoms (NDUFB3-Ala75, NDUFA1-Val14 and NDUFS1-Glu165) to define the angle 'HD-MD flex' (Fig. S25A). Similarly, we used 'HD-MD twist', defined by the dihedral angle between four Cα atoms (NDUFB3-Ala75, NDUFA1-Val14, NDUFS1-Leu651 and NDUFV1-Ser38) (Fig. S25B), to describe how the hydrophilic domain rotates on the membrane domain, and 'HD-MD tilt', the dihedral angle between NDUFB8-Phe111, ND5-Tyr42, NDUFA13-Leu44 and NDUFV1-Asp292 (Fig. S25C), to describe how it tilts relative to it. These three parameters are most affected by structural elements at the domain interface, which also form the Q-channel, and, accordingly, local classifications of the Q-channel region (Fig. S7) separated different subclasses with both different degrees of Q-channel loop order and subtle differences in these global angles (see Deactive-1092-i, ii and iii and Slack-1092-i and ii in Fig. S23). Between the proximal and distal sections of the membrane domain we describe the two angles 'PMD-DMD flex' (orthogonal to the membrane plane) and 'PMD-DMD bend' (in the membrane plane) using the Cα atoms for NDUFB3-Ala75, ND2-Gln168 and NDUFA1-Val14 (Fig. S25D) and NDUFB3-Ala75, ND2-Met211 and NDUFA1-Val14 (Fig. S22E), respectively. Classification on the ND5 lateral helix (Fig. S6) reveals that the classes with the most ordered lateral helix (Active-1092-ii and Deactive-1092-iv) have the most closed flex and bend angles compared with their helix-disrupted counterparts Active-1092-iii and iv, and Deactive-1092-v and vi, respectively (Fig. S25). Overall, the angles between the domains clearly demonstrate the similar global conformations of the substates in each major class, including the clear demarcation of active substates with the smallest HD-MD flex angles.
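The three-atom angles and four-atom dihedrals defined above can be computed from Cα coordinates with elementary vector algebra, as in the sketch below. The coordinates are hypothetical placeholders rather than values from the deposited models, the helper names are ours, and treating the middle atom of each triplet as the vertex of the angle is an assumption, since the vertex is not specified in the text.

```python
import math

def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(_dot(u, u))

def angle_deg(a, b, c):
    """Angle at vertex b between the directions b->a and b->c, in degrees."""
    u, v = _sub(a, b), _sub(c, b)
    cos_theta = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 axis, in degrees (-180 to 180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical Calpha coordinates in angstroms, for illustration only.
ca = {
    "NDUFB3-Ala75": (0.0, 0.0, 0.0),
    "NDUFA1-Val14": (100.0, 0.0, 0.0),
    "NDUFS1-Glu165": (160.0, 90.0, 0.0),
    "NDUFS1-Leu651": (150.0, 80.0, 20.0),
    "NDUFV1-Ser38": (170.0, 120.0, 60.0),
}

flex = angle_deg(ca["NDUFB3-Ala75"], ca["NDUFA1-Val14"], ca["NDUFS1-Glu165"])
twist = dihedral_deg(ca["NDUFB3-Ala75"], ca["NDUFA1-Val14"],
                     ca["NDUFS1-Leu651"], ca["NDUFV1-Ser38"])
print(f"HD-MD flex = {flex:.1f} deg, HD-MD twist = {twist:.1f} deg")
```

Because the same reference atoms are used for every class, differences in these values between substates can be compared directly even though the absolute numbers depend on the choice of atoms.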

Q-site loop conformations from analysis of the biguanide-inhibited dataset
In all the active subclasses separated here, as well as in the inhibitor-free active state, the variable structural elements in subunits ND3, ND1, NDUFS2 and NDUFS7 that form the Q-channel are well ordered and essentially identical. In the deactive state these elements (ND3 TMH1-2 loop, ND1 TMH5-6 loop, and NDUFS2 β1-β2 loop) are usually considered disordered, but in the biguanide-bound substates Deactive-1092-i, ii and iii, local-first classification resolved them partially into different conformations. ND3 residues 31-45 and ND1 209-214 remain unresolved, but NDUFS2 46-64 are resolved in Deactive-1092-ii, and ND3 25-31 and ND1 215-219 are resolved in all three states with different degrees of ordering. The biguanide-bound Q-channel in Deactive-1092-ii thus differs from that in Deactive-1092-i and iii, as well as from the DDM-bound Q-channel in deactive bovine complex I in nanodiscs (PDB 7QSM) (22). We note that in the biguanide dataset no DDM is observed in the Q-channel, so it must be out-competed by the biguanide ligand, and the different ligand identity likely creates order in the local structures. By contrast, in the two biguanide-bound slack substates the NDUFS2 β1-β2 loop (46-64) is in the same 'inward collapsed' state as when a cholate molecule is bound in the slack Q-channel in nanodiscs (22). In this case, the related structures with different ligands bound (with the backbone carbonyl of NDUFS2-Gln54 hydrogen bonding to the backbone amide of NDUFS7-Arg77, blocking off the top of the Q-channel) indicate that the conformation is specific to the slack state and is not a biguanide-induced effect. Fig. S28 compares the three different ordered conformations of the NDUFS2 loop observed. ND1 residues 203-217 are resolved in only one of the two biguanide-bound slack states, whereas ND3 20-50 remain poorly resolved in both. All structures published so far are therefore consistent with exposure of ND3-Cys39 to derivatizing agents such as NEM in both the deactive and slack states. Finally, we note that a focus-revert-classify approach applied to the as-prepared (native) ovine enzyme (21) also did not completely resolve the Q-channel elements, and different positions for the loops in NDUFS2 or ND1 were not identified. The maps for the deactive substates that were not locally classified on the Q-channel (Deactive-1092-iv, v and vi, which were classified on the lateral helix region) display a degree of order at the Q-channel loops similar to that observed in the native ovine open classes, suggesting that a combination of classification strategies may be required in the future to better separate deactive/open substates.

The slack state in the biguanide-inhibited dataset
Our biguanide-bound slack classes exhibit a π-bulge in ND4 TMH6. The same feature was described previously for the slack class in bovine nanodiscs (PDB 7QSO) (22), and is also present in some of the open states of ovine complex I (PDB 6ZKV, 6ZKN and 6ZKM for the deactive-open4, and rotenone-open2 and -open3 classes) (21). A Q10 molecule is bound nearby in the nanodiscs structure, and a rotenone molecule is present in the same site in the rotenone-bound open2 and open3 states (21,22). Although the slack classes described here contain only a weak and unassigned density in this position, it is thus possible that ligand binding stabilizes the π-bulge. It was not previously noted that the ND4 TMH6 π-bulge rotates Tyr152 out of its position pointing towards the ND5 lateral helix (where it is found in the active and deactive states), into hydrogen bonding distance of the ND4 Glu123-Lys206 ion pair in the slack state (Fig. S17). These two residues are part of the central hydrophilic axis running along the center of the membrane domain and they are crucial for energy transduction. PropKa (74) calculations suggest that the pKa of Lys206 shifts from around 8.5 to 7.5 between the Active-1092-ii/Deactive-1092-iv and Slack-1092-ii states. Analogous residues in subunit ND5 (Tyr174 and Glu145-Lys223) are present in the same positions when the subunit models are overlaid. ND4-Tyr152, via a water molecule, forms part of a hydrogen-bonding network that supports a π-bulge structure in the ND5 lateral helix (as does ND4-Tyr148, in its interaction with the ND5-Glu559 sidechain), perhaps explaining why the lateral helix is not well ordered in the slack state. It is possible that the π-bulge forms in ND4 TMH6 and Tyr152 rotates between different interacting partners during catalysis.
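To give a feel for what the Lys206 pKa shift of about one unit quoted above could mean, the sketch below applies the Henderson-Hasselbalch relation to an isolated titratable group. This is a deliberately simplified illustration, not part of the PropKa analysis: it ignores coupling to neighbouring residues such as Glu123, and the pH value is an arbitrary assumption chosen for illustration rather than a value taken from this work.

```python
def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch fraction of the protonated form of one site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

ASSUMED_PH = 7.5  # illustrative value only, not taken from the source

states = (
    ("Active-1092-ii/Deactive-1092-iv", 8.5),
    ("Slack-1092-ii", 7.5),
)
for label, pka in states:
    fraction = protonated_fraction(pka, ASSUMED_PH)
    print(f"{label}: pKa {pka} -> {fraction:.2f} protonated at pH {ASSUMED_PH}")
```

Under these assumptions the protonated population falls from roughly nine-tenths to one-half, so a shift of this size near the ambient pH would noticeably change the charge state of the residue.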
Composite maps for the biguanide-bound slack states lack clear density for ND6-TMH4, but when a Gaussian filter (width 1) was applied in Chimera (70), a density was revealed that suggests the helix has moved laterally towards the distal hydrophobic domain in a position intermediate between that of the active/deactive bovine states, and that of the Yarrowia lipolytica and Thermus thermophilus enzymes (75, 76). The bovine enzyme, like its ovine counterpart, contains a β-hairpin between ND6-TMH4 and 5, whereas the mouse enzyme contains only a loop (23). In the biguanide-bound slack classes, movement of TMH4 and swiveling around of the β-hairpin have occurred (Fig. S17). The same changes are not observed in inhibitor-free bovine slack structures (here in DDM, or in nanodiscs (22)) in which the helix density appears only in its deactive-state position. Different rearrangements were observed in the heat-treated ovine open4 state, with the β-hairpin residues converted to an α-helical segment and ND6-TMH4 tilting backwards towards ND1, and in the ovine rotenone-open2 and open3 states (21), where the β-hairpin and the first half of ND6-TMH4 appear in the same position as observed here in the active and deactive states. The specific arrangement observed here may thus result from biguanide interactions (in the Q-channel or with nearby phospholipids). As we were unable to produce a final map that contained both this density information as well as high-resolution information, this feature was not built into the deposited PDB models. Instead, a crude approximation is presented in Fig. S17C. It is clear that ND6-TMH4 and the structures between TMH4 and 5 vary in different conditions and different species, but the relevance, if any, of the different structures for catalysis is unclear.

The effect of IM1092 concentration on the protein melting temperature for protein aggregation. E) The effects of IM1092 on aggregation at different pH values. Control (white) and 500 μM IM1092 (orchid) at 120 μg mL⁻¹ complex I. F) NADH:DQ activity of purified bovine complex I without (white) or with (orchid) 350 μM IM1092. IM1092 was either added directly to the assay (left), or preincubated for 15 min with the protein and diluted into the assay buffer (right). G-J) Cryo-EM images of purified bovine complex I at 2.5 mg mL⁻¹ in buffer containing 20 mM Tris-HCl, 150 mM NaCl and 0.05% DDM (pH 7.14 or 7.96 at 20 °C). Data show means and standard error of three independent technical replicates. Images are from a Talos Arctica operating at 200 kV at 73k x magnification, around -2.5 μm defocus with a 100 μm objective aperture and 50 μm C2 aperture. White scale bars show 50 nm. Aggregation is more severe at both higher pH and higher concentration of IM1092.

Fig. S3. Gel filtration traces for the purification of complex I used for cryo-EM and SDS PAGE analyses.
A) Complex I used for the uninhibited dataset, gel filtered using a Superose 6 Increase 5/150 column. B) Complex I used for the IM1092 dataset, gel filtered using a Superose 6 Increase 10/300 column. The dark grey bar indicates the fractions collected for cryo-EM, and the pale grey bars show fractions collected for SDS PAGE and mass spectrometric analyses. C) SDS PAGE analysis of the outer peak fractions from gel filtration of the protein used for the IM1092 dataset. Purity is estimated at ≥ 90%.

Fig. S4. Data processing scheme for the biguanide-inhibited dataset.
A typical micrograph and selected 2D classes are shown, along with 3D and 2D projections of orientations before and after particle filtering. The red arrow indicates an excluded junk class.

The maps discussed in this study are highlighted by lilac boxes. 'Empty' refers to a lack of connected density in the channel: we note that it is not certain whether the channels are genuinely empty, or if local resolution or classification were insufficient to elucidate ligand heterogeneity. The 'long density' observed overlays well with the Q10 molecule modelled in PDB-7QSK (22) and therefore likely represents endogenous Q10 retained from the native membrane. PDB-7QSK was fitted in this map, and density is shown in mesh for the difference map between this class and a molmap-generated density of PDB-7QSK, within 10 Å of the Q10 molecule. The density observed in Active-1092-i described as Unk (unknown) resembles that present in the active-apo class of bovine complex I in nanodiscs (EMD-14133) (22) and may represent a heterogeneous mixture of ligands or ligand-binding poses.

Fig. S18. Biguanide-induced distortion and disordering of the ND5 lateral helix and NDUFB4 loop in deactive states, and lateral-helix associated conformational changes in the slack state.
A-C) PDB models and densities for the ND5 lateral helix in A) Deactive-1092-iv with sidechain density shown for the composite map, B) Deactive-1092-v, C) Deactive-1092-vi. Density maps are colored using ChimeraX (77) by 3 Å proximity to the modelled structure: teal, ND5; orchid, NDUFB4; orange, NDUFA11; dark grey, NDUFB8. D-F) Hydrogen-bonding stabilization of π-bulges in the ND5 lateral helix for Deactive-1092-iv, Deactive-1092-v and Deactive-1092-vi. Water molecules are represented by red spheres, black dotted lines show hydrogen bonds: teal, ND5; orchid, NDUFB4; white, ND4. G) Overlay of Deactive-1092-iv (teal) and Slack-1092-i (orchid); rotation of ND4 TMH6 in the slack state causes loss of two stabilizing interactions (black dotted lines) with the ND5 lateral helix, but creates additional hydrogen bonds (red dotted lines) with ND4-Lys206 and Glu123.