[PDF] Deep manifold learning reveals hidden dynamics of proteasome autoregulation

Abstract

Full PDF

DDeep manifold learning reveals hidden dynamics of proteasome autoregulation

Zhaolong Wu , Shuwen Zhang , Wei Li Wang , Yinping Ma , Yuanchen Dong & Youdong Mao * State Key Laboratory for Mesoscopic Physics, School of Physics, Peking University, Beijing, China. Center for Quantitative Biology, Peking University, Beijing, China. Intel Parallel Computing Center for Structural Biology, Dana-Farber Cancer Institute, Boston, MA, USA. Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Department of Microbiology, Harvard Medical School, Boston, MA, USA. Institute of Chemistry, Chinese Academy of Sciences, Beijing, China. *To whom correspondence may be addressed. Abstract

The 2.5-MDa 26S proteasome maintains proteostasis and regulates myriad cellular processes . How polyubiquitylated substrate interactions regulate proteasome activity is not understood . Here we introduce a deep manifold learning framework, named AlphaCryo4D, which enables atomic-level cryogenic electron microscopy (cryo-EM) reconstructions of nonequilibrium conformational continuum and reconstitutes ‘hidden’ dynamics of proteasome autoregulation in the act of substrate degradation. AlphaCryo4D integrates 3D deep residual learning with manifold embedding of free-energy landscapes , which directs 3D clustering via an energy-based particle-voting algorithm. In blind assessments using simulated heterogeneous cryo-EM datasets, AlphaCryo4D achieved 3D classification accuracy three times that of conventional methods and reconstructed continuous conformational changes of a 130-kDa protein at sub-3-Å resolution. By using AlphaCryo4D to analyze a single experimental cryo-EM dataset , we identified 64 conformers of the substrate-bound human 26S proteasome, revealing conformational entanglement of two regulatory particles in the doubly capped holoenzymes and their energetic differences with singly capped ones. Novel ubiquitin-binding sites are discovered on the RPN2, RPN10 and α5 subunits to remodel polyubiquitin chains for deubiquitylation and recycle. Importantly, AlphaCryo4D choreographs single-nucleotide-exchange dynamics of proteasomal AAA-ATPase motor during translocation initiation, which upregulates proteolytic activity by allosterically promoting nucleophilic attack. Our systemic analysis illuminates a grand hierarchical allostery for proteasome autoregulation. The 26S proteasome is the known largest ATP-dependent protease machinery of ~2.5 MDa molecular weight ubiquitously found in all eukaryotic cells. It exists at the center of the ubiquitin-proteasome system (UPS) that regulates myriad cellular processes, such as protein quality control, cell cycle, gene expression, inflammation and immunity . Dysregulation of the UPS has been broadly associated with carcinogenesis, ageing, neurodegenerative and cardiovascular diseases . The 26S holoenzyme is assembled by a cylindrical 20S core particle (CP) capped with one or two 19S regulatory particles (RPs), each consisting of the lid and base subcomplexes . The ring-like heterohexameric motor of ATPases-associated-with-diverse-cellular-activities (AAA) ATPase in the base subcomplex mechanically unfold polyubiquitylated substrates recognized by ubiquitin receptors in the lid subcomplex. Previous structural data by cryogenic electron microscopy (cryo-EM) have suggested three modes of coordinated ATP hydrolysis in the AAA-ATPase motor that regulate distinct steps of ubiquitin recognition, deubiquitylation, initiation of translocation and processive substrate translocation in the proteasome . However, key intermediate conformations between these states were missing, greatly limiting our understanding of the intrinsic proteasome regulation. The mechanism of the proteasome targeting of substrate-conjugated ubiquitin signals is not fully understood . How polyubiquitylated substrate interactions regulate proteasome activity remains enigmatic . The 26S proteasome is one of the most complex, dynamic and conformationally heterogeneous holoenzyme machinery in cells , challenging the current approaches in cryo-EM structure determination . Visualizing atomic structures of transient, nonequilibrium intermediates connecting the major states of the proteasome has been unsuccessful by conventional cryo-EM analysis . To address this major challenge, we developed a novel deep manifold learning system codenamed AlphaCryo4D that can break such a limitation and enable atomic-level cryo-EM reconstructions of highly dynamic, lowly populated intermediates or transient states. By applying AlphaCryo4D to analyze a large cryo-EM dataset that previously solved the atomic structures of 7 proteasomal RP-CP states , we reconstructed 64 conformers of the human 26S proteasome in the act of substrate degradation up to 2.5-Å resolution. Systematic analyses of those transient or intermediate states in substrate processing pinpoint energetic differences between the doubly and singly capped proteasomes, discover several novel ubiquitin-binding sites in the proteasome, and choreograph single-nucleotide-exchange dynamics of the proteasomal AAA-ATPase motor during translocation initiation. Importantly, the dramatically enriched proteasomal states provide unprecedented insights into an integrative, hierarchical allosteric mechanism of proteasome autoregulation upon polyubiquitylated substrate interactions. Design of deep manifold learning

The conceptual framework of AlphaCryo4D integrates unsupervised 3D deep learning with manifold embedding to learn a free-energy landscape, which directs single-particle cryo-EM reconstructions of rare transient states via an energy-based particle-voting algorithm. In brevity, AlphaCryo4D involves four major steps of data processing (Fig. 1a, Extended Data Fig. 1; see Methods for details): (1) Hundreds to thousands of 3D volumes are bootstrapped with M -fold data augmentation by particle shuffling and Bayesian clustering , during which each particle is used M times in reconstructing M volumes; each of the M copies is called a ‘vote’ of the particle (Extended Data Fig. 1a). (2) 3D feature maps of all volumes are learnt by an autoencoder made of a 3D deep residual neural network and are used to compute a free-energy landscape through manifold embedding (Fig. 1b, d, e, Extended Data Table 1). (3) A string method is used to search the minimum energy path (MEP) on the free-energy landscape , which defines the local energy minima or transition states of interest as the centers of the clustering boundaries for particle voting (Fig. 1f). (4) Each particle is counted for the number of votes that are casted within the voting boundaries on the energy landscape. The particle is classified to the cluster that receives more than M /2 votes of this particle within its voting boundary (Fig. 1f, Extended Data Fig. 1b). Particles that cannot be reproducibly voted into any clustering boundaries for more than M /2 times are ‘voted out’ during this procedure. The resulting 3D classes are expected to be conformationally homogeneous enough for high-resolution cryo-EM refinement . Since many protein complexes exhibit profound conformational changes in different local regions, we also implemented a focused classification strategy of AlphaCryo4D that applies a local 3D mask throughout the entire procedure, often being executed as an iterative step after initial 3D classification by AlphaCryo4D in the absence of any 3D mask (Fig. 1a, Methods). Solving conformational continuum at atomic level

To assess the numerical performance of AlphaCryo4D, we generated three large synthetic heterogeneous cryo-EM datasets with signal-to-noise ratios (SNRs) of 0.05, 0.01 and 0.005. Each dataset includes 2 million randomly oriented single particles computationally simulated from 20 hypothetical conformer models of the ~130-kDa NLRP3 inflammasome protein . These conformers imitate continuous conformational changes of the NACHT domain rotating around the LRR domain over an angular range of 90 ° during inflammasome activation (Fig. 1c). We conducted blind assessments on 3D classification and heterogeneous reconstructions by AlphaCryo4D, without providing any information of particle orientation and conformational identities (Fig. 1d-f, Extended Data Fig. 2). The 3D classification precision of a retrieved conformer was computed as the ratio of the particle number of correct class assignment (based on the ground truth) versus the total particle number in the class. The results of AlphaCryo4D were then compared with several conventional methods, including maximum-likelihood-based 3D (ML3D) classification in RELION and 3D principal component analysis (PCA) in cryoSPARC . In all blind tests, AlphaCryo4D retrieved all 20 conformers and markedly outperformed the conventional methods, with an average of 3D classification precision at 0.83, 0.82 and 0.65 for datasets with SNRs of 0.05, 0.01 and 0.005, respectively (Fig. 2a, Extended Data Fig. 3a-c). By contrast, the conventional methods missed two to six conformers entirely and exhibited 3D classification precisions in the range of 0.2-0.5 in general (Fig. 2b, c, Extended Data Figs. 3d-l, 4). The 3D classification precision appears to be strongly correlated with the map quality and resolution homogeneity across the density map (Fig. 2). All 20 density maps from AlphaCryo4D consistently show homogeneous resolutions at 2.6-2.9 Å between the NACHT and LRR domains (Fig. 2d, g, Extended Data Figs. 3m, 4a). By contrast, all density maps by the conventional methods show lower average resolutions and notably heterogeneous local resolutions, with the NACHT domain exhibiting substantially lower resolution than that of the LRR domain, causing blurred features, broken loops and invisible sidechains in NACHT (Fig. 2e, f, h, i, Extended Data Figs. 3n, o, 4b, c). Thus, the significantly improved 3D classification accuracy by AlphaCryo4D enables 4D reconstructions of conformational continuum at the atomic level. To understand how the 3D classification accuracy is improved in AlphaCryo4D, we further studied the algorithmic mechanism by tracking the statistical distributions of classification precisions over intermediate steps of data processing (Extended Data Fig. 5). We found that the steps of particle shuffling, defining cluster boundaries on the manifold-embedded free-energy landscapes and energy-based particle-voting contribute to ~16%, ~20% and ~40% improvements in the distributions of high-precision 3D classes, respectively, suggesting that our design of the particle-voting algorithm considerably improves the performance of AlphaCryo4D, without which deep learning alone is insufficient to achieve the present level of 3D classification accuracy. Visualizing hidden dynamics of the proteasome

Having conducted the proof-of-principle study of AlphaCryo4D using simulated datasets, we then turned to several experimental datasets . We first applied this approach to analyze an existing cryo-EM dataset of the substrate-free ATPγS-bound 26S proteasome . The AlphaCryo4D-reconstructed energy landscape clearly shows six local energy minima corresponding to the known six conformational states S A , S B , S C , S D1 , S D2 and S D3 , verifying the applicability of AlphaCryo4D in processing real experimental data (Fig. 3a). Next, we used AlphaCryo4D and its focused classification procedure to analyze a larger cryo-EM dataset of the substrate-engaged 26S proteasome (Fig. 3b, c, Extended Data Fig. 6 and Methods) . This dataset was collected on the human 26S proteasome mixed with a Lys63-linked polyubiquitinated substrate Sic1 PY for 30 seconds before ATP was diluted with ATPγS, which was expected to maximally capture any intermediate states before the degradation reaction was completed . The reconstruction of the free energy landscape of the substrate-engaged proteasome suggests that the rate-limiting step of substrate degradation lies in the initiation of substrate unfolding after substrate deubiquitylation, reflected in the highest activation energy barrier between states E B and E C1 (Fig. 3b, c), consistent with a recent fluorescence microscopy study on the proteasome kinetics . Impressively, AlphaCryo4D extracted significantly more conformers of the proteasomes in various forms from the same cryo-EM dataset that previously yielded seven atomic structures of the proteasome . It discovered 20 conformational states of RP-CP subcomplex (Fig. 3b, c, Extended Data Fig. 7, Extended Data Table 2). Eight transient states (designated E A1.2 , E

A2.2 , E

A2.3 , E

B.2 , E

B.3 , E

D2.2 , E

D2.3 and E

D.α5 ) at 3.1-7.5 Å resolution exhibit previously unseen features of ubiquitin that map various ubiquitin-binding sites on the RPN1, RPN2, RPN10 and α5 subunits in the proteasome (Fig. 4). Six sequential intermediate states (designated E

D0.1 , E

D0.2 , E

D0.3 , E

D1.1 , E

D1.2 and E

D1.3 ) at 3.2-3.9 Å resolution bridge a major missing gap between states E C2 and E D1 , revealing how concerted structural transitions in the AAA-ATPase motor during single-nucleotide exchange drive the initiation of substrate unfolding (Fig. 5, Extended Data Fig. 7c, g-i). To understand whether the singly capped (SC) and doubly capped (DC) proteasomes behave differently, AlphaCryo4D reconstructed 8 and 36 conformational states for the pure SC and DC holoenzymes, respectively (Fig. 3d, e, Extended Data Figs. 8, 9). All 8 SC and 29 DC states were refined to 3.7-4.7 Å, with 7 DC states limited to 6-9 Å due to their transient nature and extremely low population (Extended Data Fig. 8b, e, f). Note that the CP gate in the RP-distal side of the 20 states of RP-CP subcomplex is in a heterogeneous conformation of various stoichiometric ratios between closed and open states. By contrast, both CP gates in the 44 states of pure SC or DC reconstructions are in a homogeneous conformation. Besides resolving significantly more conformers, AlphaCryo4D pushes the envelope of the achievable resolution of known states due to its advantage in keeping more particles without sacrificing conformational homogeneity. For example, the structure of state E D2 was improved from previous resolution at 3.2 Å to 2.5 Å, allowing us to investigate how the CP gating allosterically regulates the catalytic activity of the β-type subunits (Fig. 6, Extended Data Fig. 7c, g-i). The local resolution of RPN11-bound ubiquitin in state E A2 was also improved from 5-6 Å to 3.5 Å among many other improvements (Extended Data Fig. 10a). Conformational entanglement

Both the SC and DC proteasomes are abundant in cells . Their structural and functional difference, however, remains elusive. In the 36 DC reconstructions, each conformer is a combination of two RP states. Each CP gate in the DC proteasome appears to be rigorously controlled by their respective proximal RP and is only open when its proximal RP is in an E D -compatible state with five RPT C-terminal tails inserted into the surface pockets on the α-ring (Fig. 3e, Extended Data Fig. 8a-d). Consistently, the gate on the RP-free side of the CP in the SC proteasome is closed in all states, indicating the CP gate opening on the RP-controlled side does not remotely open the other CP gate on the RP-free side (Extended Data Fig. 8d). To understand whether the conformational states of two RPs in the DC proteasome allosterically influence each other, we compare the state distribution matrix obtained by AlphaCryo4D to the prediction of a control model assuming two completely uncoupled RP states in the DC proteasome (Fig. 3f-h). Although the overall pattern of the experimental distribution roughly agrees with the model prediction, a notable number of DC states exhibit substantial deviation, indicating that the conformational dynamics of the two RPs in the same proteasome are profoundly entangled together (Fig. 3g, h). Intriguingly, the most prominent positive deviations are seen in the state distributions of E B -E B , E B -E D2 , and E D2 -E D2 combinations (Fig. 3g), suggesting that the deubiquitylation state E B and the translocation state E D2 of one RP are allosterically coupled to and mutually upregulated by states E B and E D2 of the opposite RP. Coupled actions of deubiquitylation and translocation between two opposite RPs are expected to optimize the efficiency of substrate processing. Moreover, all the DC states along the diagonal of the state distribution matrix are more or less upregulated (indicated by red in Fig. 3g, h), suggesting that both RPs in the same proteasome synergistically promote each other to transit into downstream states in a symmetrical fashion. Notably, the largest discrepancy of the distribution of RP states between the SC and DC proteasomes also lies in state E B (Fig. 3b-d). The state population of E B is significantly lower in the SC than in the DC proteasome. This observation verifies the conclusion drawn from the above analysis on the DC proteasomes alone that state E B is upregulated by the conformational entanglement effect in the DC holoenzyme, suggesting that the SC proteasome lacks the functional synergy between two opposite RPs seen in the DC proteasome (Fig. 3h). Altogether, these results implicate that the DC proteasome could be energetically more efficient in substrate deubiquitylation and degradation than the SC proteasome. Novel ubiquitin-binding sites

Unexpectedly, in states E

A1.2 , E

A2.2 , and E

B.2 , a diubiquitin-like density is consistently found on RPN2, whereas a monoubiquitin-like density is observed at the apex of the von Willebrand factor type A (VWA) domain of RPN10 in states E

A2.3 , E

B.3 and E

D2.3 (Fig. 4a, Extended Data Fig. 10b-f). In state E

D2.2 , the diubiquitin-like density, however, appears to contact both RPN2 and RPN10, and can be modelled with a Lys63-linked diubiquitin (Extended Data Fig. 10b, f). The RPN10 ubiquitin-interacting motifs (UIMs), however, were not reliably observed presumably due to their highly flexible nature (Extended Data Fig. 10f). The populations of these states are very small and represented in only a few thousand particles or 0.1-0.3% of the entire dataset, indicating their transient nature. The local resolutions of the RPN2- and RPN10-bound ubiquitin-like densities are lower than the rest of the reconstructions and other ubiquitin densities on RPN1, RPN11 and α5 subunits. Rigid-body fitting of the ubiquitin structure into these ubiquitin-like densities confirms the agreement of the overall density shape with the ubiquitin structure (Fig. 4a, Extended Data Fig. 10b, c, f). The ubiquitin-binding site on RPN2 is centered on three helices harboring an acidic surface around residues Tyr502, Glu530, Asp531 and Ser569 (Fig. 4b-e). Structural comparison suggests that the ubiquitin-binding site on RPN2 is homologous to the T1 site of RPN1 (Fig. 4c, d) . We therefore name it ‘the RPN2 T1 site’. Although there is little conservation in amino acid sequences between the RPN1 and RPN2 T1 sites, their structural and electrostatic properties are highly conserved (Fig. 4d, Extended Data Fig. 10g-j). Comparison with the NMR structure of Lys48-linked diubiquitin-bound RPN1 T1 site helices further confirms the conservation and similarity of the ubiquitin-binding modes between the RPN1 and RPN2 T1 sites (Fig. 4c). Consistent with this finding, previous studies have observed weak interactions of ubiquitin-like (UBL) protein with RPN2 by cross-linking experiments . As there is also ubiquitin bound to RPN11 in states E A2.2 , and E

B.2 , a Lys63-linked triubiquitin chain can be fitted into the RPN2- and RPN11-bound ubiquitin densities by modeling the rotation of Lys63 linkages, suggesting that the RPN2 T1 site may function as a coreceptor site for ubiquitin transfer from RPN10 to RPN11 to facilitate deubiquitylation (Fig. 4g). Given that at least one RPN10 UIM-bound ubiquitin invisible in these cryo-EM maps would be expected to exist, the observation of triubiquitin on RPN2 and RPN11 structurally rationalizes why tetraubiquitin is an optimal degradation signal for the proteasome . By focused 3D classification via AlphaCryo4D, we refined the RPN1 with ubiquitin binding its T2 site in state E A1 to 3.7 Å, which was previously reconstructed at moderate resolution (~5 Å) (Extended Data Fig. 10d). This allows us to visualize the RPN1 T2 site in atomic detail, revealing that the ubiquitin-contacting surface is composed of residues Asp423, Leu426, Asp430, Glu458, Cys459, and Asp460 from two helices (Fig. 4c, d), representing a footprint that is slightly larger than the yeast Rpn1 T2 site for Ubp6 interaction . Previous data have suggested that RPN1 T2 site is a receptor site for UBL proteins like Ubp6 and USP14 with lack of evidence for ubiquitin binding . Our direct observation of ubiquitin binding at this site revises such a conception, suggesting that the RPN1 T2 may be also a primary ubiquitin receptor site. In state E D2.2 , we can observe a ubiquitin density bound to the RPN1 T2 site and another weaker ubiquitin density at the α5-subunit ~6 nm beneath the RPN1 T2 site (Fig. 4a, Extended Data Fig. 10b). To understand if the ubiquitin-binding site on the α5 subunit is used more broadly, we combined all the E D -compatible states and focused the 3D classification of AlphaCryo4D on the α5-bound ubiquitin. This yielded a much larger 3D class, named state E D.α5 (~4% of the entire dataset) that improved the ubiquitin-bound CP structure to 3.1 Å, and the α5-bound ubiquitin density to 4-4.5 Å (Extended Data Figs. 7e,). The ubiquitin-binding site in the α5 subunit is composed of acidic or hydrophobic residues Glu183, Val184, His186, Ser188 and Glu193 in a loop connecting two helices, highly resembling the mode of ubiquitin binding to an RPN13 loop (Fig. 4f, Extended Data Fig. 10e). The electrostatic property of this site is also similar to those of the RPN1 T1, T2 and RPN2 T1 sites that are all acidic, in charge complementarity with the basic ubiquitin surface around residues Ile44 and His68 (Extended Data Fig. 10g-n). We note that these new ubiquitin-binding sites all reside in the vicinity of primary receptor sites. The RPN2 T1 site is very close to the RPN10 UIMs, whereas the α5 site is near the RPN1 T2 site. Modelling of a Lys63-linked tetraubiquitin chain by fitting its terminal ubiquitin into the cryo-EM density suggests that a single tetraubiquitin can span across the α5 subunit and the RPN1 T2 site (Fig. 4h). Given that RPN1 appears to allosterically regulate the AAA-ATPase motor conformation (Fig. 5a, b), tetraubiquitin binding to both the RPN1 T2 site and α5 subunit could allosterically stabilize the open-CP states (E D0 , E D1 and E D2 ). It is unclear whether these low-affinity ubiquitin-binding sites would directly involve in primary targeting of substrate-conjugated polyubiquitin chains by the proteasome in cells. However, it is conceivable that the polyubiquitin chains could be first captured by the high-affinity sites and then moved to the nearby low-affinity sites. Working together, these ubiquitin-binding sites could remodel polyubiquitin chains and deliver the peptide-proximal ubiquitin to the deubiquitylation site (Fig. 4g) as well as assist ubiquitin recycle after isopeptide bond hydrolysis (Fig. 4h). We expect that our structural definition of these novel ubiquitin-binding sites would stimulate future functional tests required to understand their biological importance in cells. Single-nucleotide exchange dynamics

The six intermediate states (E

D0.1 , E

D0.2 , E

D0.3 , E

D1.1 , E

D1.2 and E

D1.3 ) during initiation of substrate unfolding and translocation exhibit highly coordinated movements in the RP relative to the CP (Fig. 5a-c, Extended Data Fig. 11), with prominent rotations in the RPN1 N-terminal that detaches from RPT6 in states E D0.1 , E

D0.2 , and E

D0.3 and re-binds RPT6 in state E

D1.1 , E

D1.2 and E

D1.3 . All six conformations share the same architecture in their substrate interactions with pore-1 loops from four RPT subunits (RPT2, RPT6, RPT3 and RPT4), which remain largely unchanged in overall conformation (Fig. 5d). These structures illustrate that the RPT2 re-association with the substrate proceeds that of RPT1 after both RPT1 and RPT2 are dissociated from the substrate in state E C . They also share the same pattern of nucleotide states, with ATP bound to RPT1, RPT2, RPT6 and RPT3 and ADP bound to RPT4. There are poor, partial nucleotide densities in the RPT5 nucleotide-binding pocket, which give rise to the highest B-factor (250-500) when fitted with ADP, suggesting that RPT5 undergoes ADP release in these states (Extended Data Fig. 11k). Markedly, the pore-1 loop of RPT1 gradually moves toward the substrate from a distance of ~16 Å to ~3 Å, whereas the pore-1 loop of RPT5 gradually moves away from the substrate from a distance of ~3 Å to ~16 Å (Fig. 5d, f). The small AAA subdomain of RPT5 moves inward together with the large AAA subdomain of RPT1, as the large AAA subdomain of RPT5 rotates away from the substate (Extended Data Fig. 11f-i). Consistently, the arginine fingers (R-finger) of RPT2 are progressively moved toward ATP bound in RPT1, closing up the nucleotide-binding pocket (Fig. 5e, g). In contrast, the nucleotide-binding pocket of RPT5 is gradually shrunk but is not fully closed by the R-fingers of RPT1, whereas the nucleotide-binding pocket of RPT4 is gradually opened up (Fig. 5g). As a result, the pore-1 loops of RPT1 and RPT5 are only in contact with the substate in the last and first states E D1.3 and E

D0.1 , respectively. Both are dissociated from the substate in states E

D0.2 , E

D0.3 , E

D1.1 , and E

D1.2 . These gradual movements allow us to assign their temporal sequences as illustrated (Fig. 5a-e), indicating that these states represent intermediate conformations accompanying single-nucleotide exchange that occurs in the RPT5 (Fig. 5h). Interestingly, the outward flipping of RPT5 appears to complete its major conformational change in state E

D0.3 , much faster than the inward motion of RPT1, which completes its conformational change in state E

D1.3 . This observation suggests that ATPase re-engagement with the substrate is the rate-limiting step in the single-nucleotide exchange kinetics of the AAA-ATPase motor. Our data clarify that the CP gate is open once the pore-1 loop of RPT2 binds the substrate in state E

D0.1 , as five RPT C-tails are inserted in the α-pockets in all six states (Extended Data Fig. 8c). Altogether, these structures vividly characterize at the atomic level how the “hand-over- hand” actions of RPT subunits are executed by coordinated conformational dynamics in the AAA-ATPase motor during single-nucleotide exchange and generates mechanical force unfolding the substrate. Although the sequential model of coordinated ATP hydrolysis around the AAA-ATPase ring has been hypothesized in numerous similar systems , our results provide the first direct evidence for sequential intermediate states of hand-over-hand actions within a single cycle of nucleotide exchange. Hierarchical allosteric regulation of proteolytic activity

Both atomic structures of the substrate-bound CPs in the open- and closed-gate states E A and E D2 show a segment of substrate polypeptide binding the catalytically active residue Thr1 in the β2 subunit, whereas no substrate density is observed in the other two catalytic subunits β1 and β5. In both states, the substrate is bound in a β-sheet conformation between residues 20 to 22 and 46 to 48 in the β2 subunit (Fig. 6a, b, Extended Data Fig. 12), reminiscent of the conformations of CP inhibitors binding the β5 subunits in the crystal structures of the CP . The hydroxyl oxygen atom in the catalytic Thr1 is ~2-Å closer to the substrate mainchain in state E D2 compared to state E A . This structural difference is expected to result from the CP gate opening that exerts a profound allosteric impact on the entire CP structure. To understand how such a subtle structural change influences the proteolytic activity, we carried out first-principles calculations of the quantum-mechanical interactions involved in the nucleophilic attack of the catalytic residue Thr1 on the scissile peptide bond in the substrate, using the density functional theory (DFT) with the generalized gradient approximation . To initiate the nucleophilic attack, the carbonyl group needs to be bought to the proximity of the electrophile Thr1-Oγ to facilitate the quantum-mechanical interactions . We calculated the charge density difference induced around the interacting pair after proton transfer from Thr1-Oγ to Thr1-NH and plotted the contour maps (Fig. 6c, d, Extended Data Fig. 12e, f). It is seen that the substrate-interacting pair of atoms in state E D2 start to form a bonding region of slight positive charge difference linking the carbonyl group of the substrate and the Thr1-Oγ. By contrast, the pair are too far apart to interact and are well separated by a charge depletion non-bonding region in state E A . Indeed, a recent study has found that the nucleophilic attack is the rate-limiting step in the acylation barrier using the yeast CP structure in the gate-closed state . Our result suggests that CP gate opening allosterically promotes the nucleophilic attack by lowering the activation energy barrier of acylation and induces hybridization of the molecular orbitals, hence upregulating the reaction rate of proteolysis. In support of our structural finding, previous studies have observed indirect evidence for coupling of the CP gating with the proteolytic activity . Taken together, our systematic structural analyses point to a grand allostery model for the proteasomal autoregulation of proteolytic activity through a three-tier hierarchy of inter-molecular interaction networks (Fig. 6e), in which all subunit conformations are allosterically coupled. The first tier of autoregulation is between the CP gate in the α-ring and the catalytic sites of the β-subunits. Opening of the CP gate allosterically upregulates the proteolytic activity of the three β-subunits. The second tier of autoregulation lies between the substrate-bound AAA-ATPase motor and the rest of the proteasome. The AAA-ATPase motor bridges the ubiquitin-binding sites and the CP via direct interactions of the RPT coiled-coil (CC) domains with RPN1, RPN2 and RPN10 harboring ubiquitin-binding sites. As a result, three modes of coordinated ATP hydrolysis appear to regulate the intermediate steps of substrate processing , in which initiation of substrate unfolding is allosterically coupled with CP gating during E C2 to E D0.1 transition (Fig. 5). The third tier of autoregulation is via dynamic interactions between ubiquitin signals and a collection of ubiquitin receptor sites and deubiquitinases. This is manifested as the dual roles of ubiquitin-binding sites in recognizing ubiquitin signals and in transmitting such recognitions to allosteric regulation of AAA-ATPase conformations that control the CP gate. Ubiquitin binding and remodeling destabilizes the resting state of the proteasome and promotes its transition to states E B , E C and ultimately E D . On the other hand, a polyubiquitin chain binding multiple sites in the proteasome could have profound allosteric and conformational selection effects (Fig. 4g, h). Such three-tier, hierarchical allosteric regulations may occur mutually with both RPs and are expected to underlie the conformational entanglement of the two RPs in the DC proteasome (Fig. 3e), which was not clearly observed in the absence of substrates . This grand allostery mechanism might have been evolved to optimize the proteasome function in response to extremely complicated intracellular challenges and is manifested as various phenotypic effects, such as UBL-induced proteasome activation , long-range allosteric regulation of RP by CP inhibitors and allosteric coupling of two opposite α-ring gates in the archaeal CP . Concluding remarks In this work, motivated by solving the hidden dynamics of proteasome autoregulation, we developed AlphaCryo4D, a novel framework of deep manifold learning, that enables 3D reconstructions of nonequilibrium conformational continuum at the atomic level. In retrospective, finding novel ubiquitin-binding sites in the proteasome, choreographing sequential intermediates within a single cycle of nucleotide exchange, and detecting conformational entanglement of the two RPs in the DC holoenzyme as well as subtle allosteric effects of polypeptide hydrolysis by CP gating, each itself presents a formidable challenge. Being able to accomplish all these missions on a single experimental dataset by solving 64 proteasomal states demonstrates that AlphaCryo4D is a major advancement of deep learning in solving 3D heterogeneity problems in cryo-EM structure analysis and showcases tremendous potentials of deep learning in transforming cryo-EM and molecular imaging. More importantly, AlphaCryo4D revealed ‘previously invisible’ mechanistic details in nonequilibrium transient interactions of the proteasome with polyubiquitylated substrates and offered unprecedented insights into the grand hierarchical allostery underlying the proteasome activity and function. With recent improvements in cryo-EM instruments and time-resolved sample preparation methods , we expect that current development and future widespread applications of AlphaCryo4D or similar approaches will considerably boost the capability of cryo-EM imaging as a de novo discovery tool and transform modern research in physical and molecular biology.

References

1 Mao, Y. Structure, Dynamics and Function of the 26S Proteasome. In: Harris J.R., Marles-Wright J. (eds) Macromolecular Protein Complexes III: Structure and Function.

Subcell Biochem , 1-151, doi:10.1007/978-3-030-58971-4_1 (2021). 2 Dong, Y. et al. Cryo-EM structures and dynamics of substrate-engaged human 26S proteasome.

Nature , 49-55, doi:10.1038/s41586-018-0736-4 (2019). 3 He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. , 770-778, doi:10.1109/CVPR.2016.90 (2016). 4 van der Maaten, L. & Hinton, G. Visualizing Data using t-SNE.

J Mach Learn Res , 2579-2605 (2008). 5 Dashti, A. et al. Trajectories of the ribosome as a Brownian nanomachine.

Proc Natl Acad Sci U S A , 17492-17497, doi:10.1073/pnas.1419276111 (2014). 6 Frank, J.

Three-dimensional electron microscopy of macromolecular assemblies : visualization of biological molecules in their native state . 2nd edn, (Oxford University Press, 2006). 7 Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination.

Nat Methods , 290-296, doi:10.1038/nmeth.4169 (2017). 8 Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3.

Elife , doi:10.7554/eLife.42166 (2018). 9 Nakane, T., Kimanius, D., Lindahl, E. & Scheres, S. H. Characterisation of molecular motions in cryo-EM single-particle data by multi-body refinement in RELION. Elife , doi:10.7554/eLife.36861 (2018). 10 Galves, M., Rathi, R., Prag, G. & Ashkenazi, A. Ubiquitin Signaling and Degradation of Aggregate-Prone Proteins. Trends Biochem Sci , doi:10.1016/j.tibs.2019.04.007 (2019). 11 Husnjak, K. et al.

Proteasome subunit Rpn13 is a novel ubiquitin receptor.

Nature , 481-488, doi:10.1038/nature06926 (2008). 12 Shi, Y. et al.

Rpn1 provides adjacent receptor sites for substrate binding and deubiquitination by the proteasome.

Science , doi:10.1126/science.aad9421 (2016). 13 Zhang, S. & Mao, Y. AAA+ ATPases in Protein Degradation: Structures, Functions and Mechanisms.

Biomolecules , doi:10.3390/biom10040629 (2020). 14 Chen, S. et al. Structural basis for dynamic regulation of the human 26S proteasome.

Proc Natl Acad Sci U S A , 12991-12996, doi:10.1073/pnas.1614614113 (2016). 15 Zhu, Y. et al.

Structural mechanism for nucleotide-driven remodeling of the AAA-ATPase unfoldase in the activated human 26S proteasome.

Nat Commun , 1360, doi:10.1038/s41467-018-03785-w (2018). 16 de la Pena, A. H., Goodall, E. A., Gates, S. N., Lander, G. C. & Martin, A. Substrate-engaged 26S proteasome structures reveal mechanisms for ATP-hydrolysis-driven translocation. Science , doi:10.1126/science.aav0725 (2018). 17 Wu, J. et al.

Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.

PLoS One , e0182130, doi:10.1371/journal.pone.0182130 (2017). 18 Scheres, S. H. Processing of Structurally Heterogeneous Cryo-EM Data in RELION. Methods Enzymol , 125-157, doi:10.1016/bs.mie.2016.04.012 (2016). 19 Scheres, S. H. et al.

Disentangling conformational states of macromolecules in 3D-EM through likelihood optimization.

Nat Methods , 27-29, doi:10.1038/nmeth992 (2007). 20 Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science , 504-507, doi:10.1126/science.1127647 (2006). 21 E, W., Ren, W. & Vanden-Eijnden, E. String method for the study of rare events.

Phys Rev B , 052301 (2002). 22 E, W., Ren, W. & Vanden-Eijnden, E. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J Chem Phys , 164103, doi:10.1063/1.2720838 (2007). 23 Sharif, H. et al.

Structural mechanism for NEK7-licensed activation of NLRP3 inflammasome.

Nature , 338-343, doi:10.1038/s41586-019-1295-z (2019). 24 Bard, J. A. M., Bashore, C., Dong, K. C. & Martin, A. The 26S Proteasome Utilizes a Kinetic Gateway to Prioritize Substrate Degradation.

Cell , 286-298 e215, doi:10.1016/j.cell.2019.02.031 (2019). 25 Asano, S. et al.

Proteasomes. A molecular census of 26S proteasomes in intact neurons.

Science , 439-442, doi:10.1126/science.1261197 (2015). 26 Saeki, Y., Sone, T., Toh-e, A. & Yokosawa, H. Identification of ubiquitin-like protein-binding subunits of the 26S proteasome.

Biochem Biophys Res Commun , 813-819, doi:10.1016/s0006-291x(02)02002-8 (2002).

27 Schreiner, P. et al.

Ubiquitin docking at the proteasome through a novel pleckstrin-homology domain interaction.

Nature , 548-552, doi:10.1038/nature06924 (2008). 28 Puchades, C., Sandate, C. R. & Lander, G. C. The molecular principles governing the activity and functional diversity of AAA+ proteins.

Nat Rev Mol Cell Biol , 43-58, doi:10.1038/s41580-019-0183-6 (2020). 29 Lowe, J. et al. Crystal structure of the 20S proteasome from the archaeon T. acidophilum at 3.4 A resolution.

Science , 533-539, doi:10.1126/science.7725097 (1995). 30 Schrader, J. et al.

The inhibition mechanism of human 20S proteasomes enables next-generation inhibitor design.

Science , 594-598, doi:10.1126/science.aaf8993 (2016). 31 Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized Gradient Approximation Made Simple.

Phys Rev Lett , 3865-3868, doi:10.1103/PhysRevLett.77.3865 (1996). 32 Groll, M. et al. Structure of 20S proteasome from yeast at 2.4 A resolution.

Nature , 463-471, doi:10.1038/386463a0 (1997). 33 Saha, A., Oanca, G., Mondal, D. & Warshel, A. Exploring the Proteolysis Mechanism of the Proteasomes.

J Phys Chem B , 5626-5635, doi:10.1021/acs.jpcb.0c04435 (2020). 34 Osmulski, P. A., Hochstrasser, M. & Gaczynska, M. A tetrahedral transition state at the active sites of the 20S proteasome is coupled to opening of the alpha-ring channel.

Structure , 1137-1147, doi:10.1016/j.str.2009.06.011 (2009). 35 Haselbach, D. et al. Long-range allosteric regulation of the human 26S proteasome by 20S proteasome-targeting cancer drugs.

Nat Commun , 15578, doi:10.1038/ncomms15578 (2017). 36 Eisele, M. R. et al. Expanded Coverage of the 26S Proteasome Conformational Landscape Reveals Mechanisms of Peptidase Gating.

Cell Rep , 1301-1315 e1305, doi:10.1016/j.celrep.2018.07.004 (2018). 37 Collins, G. A. & Goldberg, A. L. Proteins containing ubiquitin-like (Ubl) domains not only bind to 26S proteasomes but also induce their activation. Proc Natl Acad Sci U S A , 4664-4674, doi:10.1073/pnas.1915534117 (2020). 38 Kim, H. T. & Goldberg, A. L. UBL domain of Usp14 and other proteins stimulates proteasome activities and protein degradation in cells.

Proc Natl Acad Sci U S A , E11642-E11650, doi:10.1073/pnas.1808731115 (2018). 39 Yu, Z. et al.

Allosteric coupling between alpha-rings of the 20S proteasome.

Nat Commun , 4580, doi:10.1038/s41467-020-18415-7 (2020). 40 Nakane, T. et al. Single-particle cryo-EM at atomic resolution.

Nature , 152-156, doi:10.1038/s41586-020-2829-0 (2020). 41 Yip, K. M., Fischer, N., Paknia, E., Chari, A. & Stark, H. Atomic-resolution protein structure determination by cryo-EM.

Nature , 157-161, doi:10.1038/s41586-020-2833-4 (2020). 42 Dandey, V. P. et al.

Time-resolved cryo-EM using Spotiton.

Nat Methods , 897-900, doi:10.1038/s41592-020-0925-6 (2020). 43 Kaledhonkar, S. et al. Late steps in bacterial translation initiation visualized using time-resolved cryo-EM.

Nature , 400-404, doi:10.1038/s41586-019-1249-5 (2019). 44 Zhu, Y., Ouyang, Q. & Mao, Y. A deep convolutional neural network approach to single-particle recognition in cryo-electron microscopy.

BMC Bioinformatics , 348, doi:10.1186/s12859-017-1757-y (2017). 45 Bepler, T. et al. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs.

Res Comput Mol Biol , 245-247 (2018).

46 Qian, N. On the momentum term in gradient descent learning algorithms.

Neural Netw , 145-151, doi:10.1016/s0893-6080(98)00116-6 (1999). 47 Unverdorben, P. et al. Deep classification of a large cryo-EM dataset defines the conformational landscape of the 26S proteasome.

Proc Natl Acad Sci U S A , 5544-5549, doi:10.1073/pnas.1403409111 (2014). 48 Wehmer, M. et al.

Structural insights into the functional cycle of the ATPase module of the 26S proteasome.

Proc Natl Acad Sci U S A , 1305-1310, doi:10.1073/pnas.1621129114 (2017). 49 Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics.

Acta Crystallogr D Biol Crystallogr , 2126-2132, doi:10.1107/S0907444904019158 (2004). 50 Pettersen, E. F. et al. UCSF Chimera--a visualization system for exploratory research and analysis.

J Comput Chem , 1605-1612, doi:10.1002/jcc.20084 (2004). 51 Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution.

Acta Crystallogr D Biol Crystallogr , 213-221, doi:10.1107/S0907444909052925 (2010). 52 The PyMOL Molecular Graphics System Version 2.0 (Schrödinger, LLC). 53 Goddard, T. D. et al. UCSF ChimeraX: Meeting modern challenges in visualization and analysis.

Protein Sci , 14-25, doi:10.1002/pro.3235 (2018). 54 Kucukelbir, A., Sigworth, F. J. & Tagare, H. D. Quantifying the local resolution of cryo-EM density maps. Nat Methods , 63-65, doi:10.1038/nmeth.2727 (2014). 55 Heymann, J. B. & Belnap, D. M. Bsoft: image processing and molecular modeling for electron microscopy. J Struct Biol , 3-18, doi:10.1016/j.jsb.2006.06.006 (2007). 56 Dolinsky, T. J., Nielsen, J. E., McCammon, J. A. & Baker, N. A. PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations.

Nucleic Acids Res , W665-667, doi:10.1093/nar/gkh381 (2004). 57 Soler, J. M. et al. The SIESTA method for ab initio order-N materials simulation.

J Phys Condens Matter , 2745-2779 (2002). Figure 1. Conceptual framework of AlphaCryo4D for 4D cryo-EM reconstruction. a , Schematic showing the major conceptual steps of single-particle cryo-EM data processing in AlphaCryo4D. b , Illustration of deep residual learning of 3D feature maps by an autoencoder conjugated to a decoder for unsupervised training. c , The 20 atomic models of hypothetical conformers of NLRP3 in cartoon representations simulate a 90 ° rotation of the NACHT domain relative to the LRR domain based on the structure of NLRP3 (PDB ID: 6NPY). d , Manifold learning of the bootstrapped 3D volumes and their feature maps learnt by the autoencoder. Each data point corresponds to a 3D volume. The color labels the conformer identity of ground truth for the purpose of verification. e , Free energy landscape computed from the manifold shown in d using the Boltzmann distribution. f , Minimum energy path (orange line) calculated by the string method is used to find the approximate cluster centers of 20 conformers. The cluster boundaries for energy-based particle voting are shown as dashed circles. Figure 2. Performance evaluation of AlphaCryo4D for sub-3-Å reconstructions of conformational continuum . a-c , Plots of 3D classification precisions of the 20 NLRP3 conformers from blind assessments on the simulated NLRP3 dataset with SNR of 0.01, using AlphaCyo4D ( a ), PCA-based clustering implemented in CryoSPARC ( b ) and 3D maximum-likelihood (ML3D) method implemented in RELION ( c ). All 3D conformers in panel (a) were reconstructed to 2.6-2.9 Å resolution (Extended Data Fig. 3m). Asterisks mark the missing conformers that were completely lost due to misclassification by PCA and ML3D. d-f , Typical side-by-side comparison of density map quality and local resolution of the same conformer (ID 6) reconstructed by AlphaCyo4D ( d ), PCA in CryoSPARC ( e ) and ML3D in RELION ( f ). The maps are colored by their local resolutions calculated by Bsoft blocres program. g - i , Closeup side-by-side comparison of the same two secondary structures, including a loop (upper row) and a helix (lower row), in the NACHT domain illustrates considerable improvements in local density quality and resolution by AlphaCryo4D ( g ) as opposed to PCA/CryoSPARC ( h ) and ML3D/RELION ( i ). The locations of the loop and helix in the NLRP3 structure are marked by dashed boxes in panel ( f ). The same ground-truth atomic model of Conformer 6 shown in stick representations is superimposed with the density maps shown in blue mesh representations, all from the same perspective. The atomic model was not further refined against each map for visual validation of the map accuracy. Figure 3 . Energic differences of the singly and doubly capped proteasomes . a-c , Free-energy landscape of the substrate-free ATPγS-bound 26S proteasome ( a ), the substrate-engaged singly capped (SC) proteasome ( b ), and the substrate-engaged doubly capped (DC) proteasomes ( c ) computed by AlphaCryo4D. 19 conformational states of the RP-CP subcomplex are marked on the energy landscape. d , Comparison of the state distribution of RP in the SC and DC proteasomes. e , State distribution matrix of the DC proteasome obtained using AlphaCryo4D on experimental data, colored by the particle numbers in the DC states, with the horizonal and vertical axes representing the states of two RPs bound to the same CP. f , State distribution matrix of the DC proteasome predicted by a control model assuming that the two RPs in the same DC proteasome are completely independent of each other or uncoupled. The model assumes the same total number of DC particles and the same probability of observing each RP state as experimentally measured. g , State distribution difference matrix of the DC proteasome by experimental results minus the model predictions assuming uncoupled two RPs, colored by the absolute particle number difference. h , Relative differences of the DC proteasome in unit of percentage. The relative difference is the ratio of absolute difference divided by the particle numbers from the model prediction in each DC state. Figure 4.

Mapping of novel ubiquitin-binding sites in the 26S proteasome . a , Seven lowly populated transient states visualize various ubiquitin-binding sites on the RPN1 T2 site , the RPN2 T1 site, the RPN10 VWA site and the α5-subunit site, in addition to the RPN11 deubiquitylation site and previously observed RPT5 coiled-coil site . For clarity of illustrating moderate-resolution ubiquitin (Ub) features, the cryo-EM maps of these states are low-pass filtered at 8 Å and shown in transparent surface representations superimposed with their corresponding atomic models in cartoon representations. The Ub densities are colored magenta, with the rest of the structures colored yellow. b , Structure of the RPN2 toroidal domain (orange) in complex with Lys63-linked diubiquitin (magenta) modelled from the cryo-EM maps of states E A1.2 , E

A2.2 and E

B.2 , in cartoon representation. c , Structure-based alignment of the human RPN1 and RPN2 toroidal domains shown in color yellow and orange, respectively. The RPN1 T2 site-bound ubiquitin is shown as light blue. The NMR structure (PDB ID 2N3V) of the Lys48-linked dibuiquitin-bound yeast Rpn1 T1 site segment , shown as light green, is aligned with the human RPN1 structure and is superimposed for comparison. d , The entire human RPN1 and RPN2 subunit structures are aligned together and viewed from a perspective that is rotated 90 ° relative to the orientation in panel ( c ), showing that the two subunits are structurally homologous. Insert, electrostatic surface of the RPN2 T1 site showing its acidic nature. e , Closeup view of the RPN2 T1 site composed of three helices in carton representation with the side chains shown in stick representation. The key residues contacting ubiquitin are highlighted with transparent sphere representation. f , Structure of ubiquitin-bound α5 subunit shown in cartoon representation. The side chains of residues involved in the intermolecular interactions are shown in stick representation and are zoomed in on the right. g , A Lys63-linked triubiquitin chain model fitted from the cryo-EM maps of states E A2.2 and E

B.2 is shown in purple cartoon representation and spans from the RPN2 T1 site to the RPN11 deubiquitylation site. The RPN10 VWA-bound monoubiquitin fitted in states E

A2.3 and E

B.3 is superimposed in blue cartoon representation. The proteasome is shown as space-filling sphere representations. h , A Lys63-linked tetraubiquitin chain model derived from the cryo-EM map of state E D2.2 . The two terminal ubiquitin molecules are fitted from their highest-resolution cryo-EM densities available at the RPN1 T2 and α5-subunit sites, whereas the middle two molecules are hypothetically modelled. Figure 5.

Single-nucleotide exchange dynamics of the AAA-ATPase motor . a , Side views of the substate-engaged RP-CP subcomplex in seven sequential conformers between states E C2 and E D1.3 , as shown in cartoon representation and with only half CP shown. b , Top views of the seven sequential conformers, showing the overall lid rotation and the relative rotation between RPN1 and the rest of the RP structure. c , The substrate-bound AAA-ATPase structures of the seven conformers in cartoon representations. The substrates and nucleotides are shown in stick representations. d , Close-up views of the pore-1 loop interactions with the substrate (in yellow stick representation), with the aromatic residues in the pore-1 loops highlighted by transparent sphere representations. The seven snapshots show that the RPT1 pore-1 loop is gradually moved to a position in direct contact with the substrate, whereas the RPT5 pore-1 loop is gradually moved away from the substrate and flipped up and out. e , Close-up views of the RPT1 nucleotide-binding sites in different states, showing its gradual closeup by the R-fingers from RPT2. f , Plots of distances of the pore-1 loops of all RPT subunits to the substrate measured in the atomic structures of 9 sequential conformational states. g , Plots of distances of R-finger to the nucleotides bound to three RPT subunits measured in the atomic structures of 8 sequential conformational states. h , Schematic illustrating the ATP hydrolysis reaction and nucleotide exchange associated with the intermediates during the transition from state E C2 to E D1.3 . Figure 6 . Hierarchical allosteric regulation of proteasome activity . a and b , Structure of the catalytic site with a substrate polypeptide bound at residue Thr1 in the β2 subunit of the CP in the closed-gate state E A at 2.7-Å resolution ( a ) and in the open-gate state E D2 at 2.5-Å resolution ( b ). c and d , Charge density difference contour maps at the proteolytic active site of the β2 subunit computed by the quantum mechanical DFT based on the atomic structures of states E A ( c ) and E D2 ( d ). The contours are plotted in the plane passing through the Thr1-Oγ atom and the nearest carbonyl carbon and oxygen atoms of the substrate. The interactions of the Thr1-Oγ atom and the carbonyl group lead to the nucleophilic attack as the first step of the proteolytic process. Due to the proximity shown in ( b ), a bonding interaction starts to form in the E D2 states, indicated by the slight positive charge difference region (red arrow), whereas there is no such region formed in the E A state and the pair is well separated by a non-bonding charge depletion region (blue arrow). e , An interactome network diagram of the proteasome illustrating the grand allosteric regulation model and highlighting the key inter-subunit interaction pathways for proteasome autoregulation. The base subunits hosting ubiquitin-binding sites and the deubiquitinase RPN11 are shown as the green nodes; the AAA-ATPase subunits as the orange nodes; the α-type subunits as the cyan nodes; the catalytically active β1, β2 and β5 subunits as the crimson nodes. Steal blue lines represent inter-subunit interactions; purple lines represent the interactions with ubiquitin; and red lines represent substrate polypeptide interactions. Bold lines highlight the shortest pathways connecting the three tiers of regulatory subunits that are expected to propagate the allosteric regulation from ubiquitin interactions down to the six proteolytic active sites in the CP. The network diagram is constructed according to all 64 structural models of the substrate-bound proteasome in this study. Methods Overall procedure of AlpahCryo4D

The design of AlphaCryo4D follows a few key ideas and objectives. First of all, we expect to maximally use all available particles and make particle selection via comprehensive, objective criteria such as the reproducibility and robustness in deep-learning-based classification. To this end, we minimize any substantial manual particle rejection before entering the steps of AlphaCryo4D and avoid any subjective particle selection within the AlphaCryo4D procedure. Next, we expect to devise an automated procedure to examine the robustness of 3D classification at single-particle level. We consider that any objective functions in image similarity measurement suffer from certain degree of errors in particle alignment that often lead to a greater degree of misalignment with decreasing the signal-to-noise ratio (SNR). Thus, we expect to devise an integrative procedure in which each particle is analyzed for at least three times to examine the robustness of the 3D classification algorithm with respect to individual particles. We wish to keep particle images that exhibits greater robustness during data processing. Thus, within the entire AlphaCryo4D procedure, the only way to reject particles is via the energy-based particle voting algorithm that is completely objective. In practice, AlphaCryo4D consists of four major steps (Fig. 1a). First, all single-particle images are randomly divided into M +1 groups of equal data sizes. In the step called ‘particle shuffling’, one group of particles is taken out of the dataset to form a shuffled dataset. This procedure is repeated for M + 1 times, each time with a different particle group being taken out, resulting in M + 1 shuffled datasets (Extended Data Fig. 1). Each shuffled dataset is subject to 3D volume bootstrapping separately. In each case, all particles in a shuffled dataset are aligned in a common frame of reference through a consensus 3D refinement, and then clustered into tens to hundreds of 3D reconstructions through Bayesian clustering, which ensures that only structurally similar particles are clustered into the same 3D volumes. In total, thousands of volumes from all shuffled datasets are expected to be bootstrapped through these steps. As a result, each particle image is used in 3D reconstructions of M different volumes after particle shuffling and volume bootstrapping. Second, all bootstrapped volumes are learnt by an autoencoder based on a deep residual convolutional neural network in an unsupervised fashion (Fig. 1b, Extended Data Table 1). A 3D feature map is extracted for each volume and is juxtaposed with the volume data for nonlinear dimensionality reduction by manifold embedding with t-distributed stochastic neighbor embedding (t-SNE) algorithm . Third, the learnt manifold is then used to compute a free-energy landscape via the Boltzmann distribution . A string method is used to search the minimum energy path (MEP) on the learnt free-energy landscape . The local energy minima or the transition states connecting can be defined as the centers of 3D clustering, with a circular range defined as cluster boundaries for subsequent particle voting. Forth, because each particle is used M times during volume bootstrapping, it is mapped to M locations on the free-energy landscape. The mapping of each copy of the particle is called a ‘vote’. By counting the number of votes of the same particle being mapped within the same cluster boundary on the energy landscape, we can evaluate the reproducibility of manifold learning at single-particle level. Each particle is classified to the 3D cluster that receives more than M /2 votes of this particle. If none of the 3D clusters on the energy landscape receives more than M /2 votes of a given particle, the corresponding particle is voted out in the procedure and is excluded for further 3D reconstruction. As a result, each of the final 3D reconstructions includes structurally homogeneous particles that can potentially achieve higher resolution. AlphaCryo4D exhibits several advantages as well as certain limitation. First, it allows a maximal number of particles to be assessed and classified in an integrative procedure based on uniform, objective criteria, such as their reproducibility and robustness in deep learning. Second, the energy-based particle-voting algorithm can potentially rescue certain particles that are prone to be misclassified if processed only once, thus boosting the efficiency of particle usage without necessarily sacrificing quality and homogeneity of selected particles. Third, the use of energy landscape to direct 3D clustering of conformational states has an intuitive physical meaning and allows users to examine kinetic relationships between the adjacent states at the same time of conducting 3D classification and to potentially discover new conformations. All these advantages are well demonstrated in our analysis of both simulated and experimental datasets. The limitation of AlphaCryo4D lies in that it requires generally larger datasets and more computational costs to fully exploit its advantages. We do not expect to achieve considerably better results for a small dataset. Moreover, its outcomes are dependent on the success of initial consensus alignment of all particles during the first step of particle shuffling and volume bootstrapping. Failure in obtaining accurate alignment parameters can lead to futile, erroneous, nonsensical classification in the subsequent steps. Particle shuffling for bootstrapping 3D volumes A key philosophy of the AlphaCryo4D design is to avoid subjective judgement on the particle quality and usability as long as it is not apparent junks like ice contaminants, carbon edge and other obvious impurities. A deep-learning-based particle picking in DeepEM , Torpaz or other similarly performed software is favored for data preprocessing prior to AlphaCryo4D. To prepare particle datasets for AlphaCryo4D, an initial unsupervised 2D image classification and particle selection, preferentially conducted by the statistical manifold-learning-based algorithm in ROME , is necessary to ensure that no apparent junks are selected for further analysis and that the data have been collected under an optimal microscope alignment condition, such as optimized coma-free alignment. No particles should be discarded based on their structural appearance during this step if they are not apparent junks. Any additional 3D classification should be avoided to pre-maturely reject particles prior to particle shuffling and volume bootstrapping in the first step of AlphaCryo4D processing. Pre-maturely rejecting true particles via any 2D and 3D classification is expected to introduce subjective bias and to impair the native conformational continuity and statistical integrity intrinsically existing in the dataset. In raw cryo-EM data, 2D transmission images of biological macromolecules suffer from extremely heavy background noise, due to the use of low electron dose to avoid radiation damage. To tackle the conformational heterogeneity of the protein sample of interest in the presence of heavy image noise, we devised the particle shuffling and volume bootstrapping procedure that incorporates the Bayesian or maximum-likelihood-based 3D clustering . To enable the particle-voting algorithm in the late stage of AlphaCryo4D, each particle is reused M times during particle shuffling to bootstrap a large number of 3D volumes (Extended Data Fig. 1a). First, all particle images are aligned to the same frame of reference in a consensus 3D reconstruction and refinement in RELION or ROME to obtain the initial alignment parameters of three Euler angles and two translational shifts. Optimization for high alignment accuracy in this step must be pursued to make the volume bootstrapping more efficient and to avoid the error propagation to the subsequent steps in AlphaCryo4D. Thus, coarsely classifying the dataset during image alignment, restricting the 3D alignment to a moderate resolution like 10 Å or 15 Å for global parameter search, progressing to small enough angular steps in the final stage of consensus refinement, must be appropriately practiced in order to optimize the initial 3D alignment of all particles. Failure of initial alignment of particles would lead to failure of all subsequent AlphaCryo4D analysis. Next, based on the results of consensus alignment, in the particle-shuffling step, all particles were divided into M + 1 groups, and M was set to an odd number of at least 3. Then the whole dataset was shuffled M + 1 times by removing a different group out of the dataset each time. Each shuffled dataset is classified into a large number of 3D volumes, often several hundreds, by the maximum-likelihood classification algorithm without further image alignment in RELION. This step is repeated M + 1 times, each time on a shuffled dataset missing a different group among the M + 1 groups. Due to the effect of M -fold data augmentation and enhancement by particle shuffling, the outcome of this entire process is expected to bootstrap up to thousands of 3D volumes in total. Each particle is used and contributed to the 3D reconstructions of M volumes, which prepare it for the particle-voting algorithm to evaluate the robustness and reproducibility of each particle with respect to the deep learning-based 3D classification later. For processing a large dataset including millions of single-particle images, it becomes infeasible even for a modern high-performance computing system to do the consensus alignment by including all particles once in a single run due to limitation of supercomputer memory and the scalability of the alignment software such as RELION or ROME, although ROME has been specifically optimized for its scalability on CPU-based supercomputing cluster and RELION has been optimized for GPU-based cluster . To tackle this issue, the whole dataset is randomly split into a number of sub-datasets for batch processing, with each sub-dataset including about one to two hundred thousand particles, depending on the scale of available supercomputing system. This strategy can substantially reduce the supercomputer memory pressure and requirement. In each sub-dataset, all particles were divided into M + 1 groups and subject to the particle shuffling and volume bootstrapping procedure as described above. To balance the computational cost and algorithmic efficiency, we used M = 3 in the entire data processing workflow involved in the current study, implying that every particle was used three times for subsequent computation before particle voting. But a higher M value might be beneficial for smaller dataset or lower signal-to-noise ratio, whose effects are not investigated in the present study due to the limit of computational resources and feasibility. Deep residual autoencoder for 3D feature extraction

To extract the deep 3D features of the bootstrapped volume data in an unsupervised manner, a 3D autoencoder was constructed using a deep Fully Convolutional Network (FCN) composed of residual blocks . The structure of the 3D autoencoder consists of the encoder and the decoder, which are denoted as ℰ and 𝒟 , respectively (Fig. 1b, Extended Data Table 1). The relation between the output 𝒚 and the input 𝒙 of the network can be expressed as: 𝒚 = 𝒟&ℰ(𝒙)), (1) in which 𝒙 is the input 3D density volume with the size of N , where N is the box size of the density map in a unit of pixels. For reconstruction of the 3D volumes and further optimization, the decoding output 𝒚 should be in the same size and range with the input data 𝒙 . In this way, the framework of FCN is established to restore the input volume, using the sigmoid function 𝑆(𝑥) = !!" as the activation function of the decoding layer to normalize the value of 𝒚 into the range (0, 1). Meanwhile, all 3D density maps 𝒙 should be preprocessed as the function (2) before inputted to the deep neural network: 𝑥 +,- ∶= 𝑥 +,- − 𝑥 .+/ 𝑥 .0) − 𝑥 .+/ , 𝑖, 𝑗, 𝑘 = 1,2, … , 𝑁. (2) where 𝑥 .+/ and 𝑥 .0) are, respectively, the minimum and minimax value in all 𝑥 +,- . The distance between the decoded maps and the input volumes can be used for constructing the loss function to train the 3D kernels and bias of the networks. The value distribution of the encoded 3D feature maps 𝒛 = ℰ(𝒙) is expected to be an abstract, numerical representation of the underlying structures in the volume data, which may not necessarily have any intuitive real-space physical meanings. The neural network is capable of extracting such abstract information in the prediction step, with no restriction on the feature maps 𝒛 in the expression of training loss. The loss function is then formulated as: 𝐿(𝜽; 𝒙, 𝒚) = 1𝑁 < =𝑥 +,- − 𝑦 +,- = + 𝜆‖𝜽‖ , (3) where 𝜽 denotes the weights and bias of the network, and 𝜆 is L2 norm regularization coefficient. As the feature of the complex structure is difficult to be learnt from the 3D volume data, the value of 𝜆 is generally set to 0 to focus on the first term of the expression (3) unless overfitting arises. To improve the learning capacity of the 3D autoencoder, residual learning blocks containing 3D convolution and transpose convolution layers are employed in the encoder and decoder, respectively. In each residual block, a convolution/transpose convolution layer followed by Batch Normalization (BN) layer and activation layer appears twice as a basic mapping, then adding with the input to generate the output. The mathematical expression of the 𝑙 th block can be shown as: D 𝒚 𝒍 = ℱ(𝒙 𝒍 ; 𝜽 ) + 𝒙 𝒍 ℱ(𝒙 𝒍 ; 𝜽 ) = 𝒞&𝒞(𝒙 𝒍 ))𝒞(𝒙 𝒍 ) = 𝑎(𝑏(𝑐(𝒙 𝒍 ))) , (4) where 𝒙 𝒍 and 𝒚 𝒍 represent the input and the output of this block, respectively. ℱ(𝑥 ; 𝜽 ) denotes the basic mapping of the 𝑙 th block parameterized by 𝜽 , and 𝒞(𝒙 𝒍 ) is the sequential operation of convolution/transpose convolution 𝑐 , BN 𝑏 and activation 𝑎 . The rectified linear unit (ReLU) function is used in all the activation layers but the last one. In addition, the function (4) demands that the output of mapping function ℱ(𝒙 𝒍 ; 𝜽 ) must have the same dimension as the input 𝒙 𝒍 . If this is not the case, the input 𝒙 𝒍 must be rescaled along with ℱ(𝒙 𝒍 ; 𝜽 ) using a convolution/transpose convolution transformation 𝑐 (𝒙 𝒍 ) with an appropriate stride value, the parameters of which can be updated in the training step. To analyze a large number of volumes, training of the autoencoder neural network should be treated elaborately to obtain suitable kernels and bias. First of all, parallel computation with multiple GPUs must be implemented to reduce the training time. Then the parameters of the network are optimized by the stochastic gradient descent Adam (Adaptive moment estimation) algorithm, in which the gradients of the objective function

𝐿(𝜽; 𝒙, 𝒚) with respect to the parameters 𝜽 can be calculated by the chain rule. Moreover, the learning rate will be reduced to one tenth if the loss function does not decrease in three epochs based on the initial value of 0.01. After trained about 50 epochs, the best model is picked to execute the task of structural feature extraction. Using the unsupervised 3D autoencoder, the feature maps 𝒛 = ℰ(𝒙) encoding the structural discrepancy among the 3D volume data can be extracted automatically without any human intervention. Manifold embedding of free energy landscape

To prepare for the energy landscape reconstitution, each bootstrapped 3D volume was juxtaposed with its 3D feature map learnt by the autoencoder to form an expanded higher-dimensional data point. All the expanded data points were then embedded onto a low-dimensional manifold via the t-SNE algorithm by preserving the geodesic relationships among all high-dimensional data . During manifold embedding, it is assumed that the pairwise similarities in the high dimensional data space and low dimensional latent space follow a Gaussian distribution and a Student’s t-distribution, respectively, which can be formulated as: ⎩⎪⎨⎪⎧𝑝 +, = exp (−=𝒙 + − 𝒙 , = /2𝜎 +2 )∑ exp (−=𝒙 + − 𝒙 , = /2𝜎 +2 ) +9, , 𝑖 ≠ 𝑗𝑞 +, = (1 + =𝒚 + − 𝒚 , = ) (! ∑ (1 + =𝒚 + − 𝒚 , = ) (!+9, , 𝑖 ≠ 𝑗 (5) where 𝑝 +, and 𝑞 +, represent the similarity distributions in the high and low dimensional spaces, 𝒙 - and 𝒚 - ( 𝑘 = 𝑖, 𝑗 ) are the data points of the high and low dimensional spaces, respectively. The parameter 𝜎 is the variance of the Gaussian distribution. In addition, 𝑝 ++ and 𝑞 ++ are both set to zero to satisfy the constraint of symmetry. To find the value 𝑦 - of each data point, an objective function measuring the distance between the similarity distribution 𝑝 +, and 𝑞 +, had to be well defined. Here the relative entropy, also called the Kullback-Leibler (KL) divergence, 𝐾𝐿(𝑃 ∥ 𝑄) = < 𝑝 +, log 𝑝 +, 𝑞 +, , +,, (6) was employed and minimized by the gradient descent algorithm with the momentum method . After the manifold embedding by t-SNR, each 3D volume is mapped to a low-dimensional data point in the learnt manifold. The coordinate system, in which the low-dimensional representation of the learnt manifold is embedded, is used for reconstructing and visualizing the free energy landscape. In accordance with the Boltzmann distribution, the free energy of each data point in the low-dimensional space can be estimated using the particle number included in the original 3D volume corresponding to each low-dimensional data point on the manifold: ΔΔ𝐺 + = −𝑘 : 𝑇ln 𝑁 + ∑ 𝑁 ++ , (7) where ΔΔ𝐺 + denotes the free energy difference of the data point with the particle number of 𝑁 + against a common reference energy level, 𝑘 : is the Boltzmann constant and 𝑇 is the temperature in Kelvin. The free energy landscape was plotted by interpolation of the free energy difference in areas with sparse data. Minimum energy path finding using the string method

The string method is an effective algorithm to find the minimum energy path (MEP) on the potential energy surface . To extract the dynamic information implicated in the experimental free energy landscape, an improved and simplified version of the string method was developed to search the possible state-transition pathway or MEP . Along the MEP on the learnt energy landscape, the local minima of interest could be defined as 3D clustering centers to guide the particle-voting algorithm for 3D classification, on the basis of which particles of the same conformation could be clustered to generate high-resolution cryo-EM density maps (Extended Data Fig. 1b). The objective of the MEP identification in barrier-crossing events lies in finding a curve 𝛾 having the same tangent direction as the gradient of free energy surface ∇𝐺 . It can be expressed as: (∇𝐺) ; (𝛾) = 0, (8) where (∇𝐺) ; denotes the component of ∇𝐺 perpendicular to the path 𝛾 . To approach the objective function (4), two computational steps, named evolution of the images and reparameterization of the string, are iterated until convergence within a given precision threshold. Evolution of the images . After initialization with the starting and ending points, the positions of interval images were updated according to gradient of the free energy at the 𝑡 th iteration: 𝜑 +∗(=) = 𝜑 +(=(!) − ℎ∇𝐺m𝜑 +(=(!) n, (9) with 𝜑 +(=) (𝑖 = 0,1, … , 𝑁) being the 𝑖 th intermediate image at the 𝑡 th iteration ( ∗ denoting the temporary values), and ℎ the learning rate. Reparameterization of the string . The values of positions 𝜑 +(=) (𝑖 = 0,1, … , 𝑁) were interpolated onto a uniform mesh with the constant number of points. Prior to interpolation, the normalized length 𝛼 +∗(=) (𝑖 = 0,1, … , 𝑁) along the path was calculated as: 𝛼 >∗(=) = 0, 𝛼 +∗(=) = 𝛼 +(!∗(=) + r𝜑 +∗(=) − 𝜑 +(!∗(=) r∑ r𝜑 +∗(=) − 𝜑 +(!∗(=) r , 𝑖 = 1,2, … , 𝑁. (10) Given a set of data points (𝛼 +∗(=) , 𝜑 +∗(=) ) , the linear interpolation function was next used to generate the new values of positions 𝜑 +(=) (𝑖 = 0,1, … , 𝑁) at the uniform grid points 𝛼 +(=) (𝑖 =0,1, … , 𝑁) . The iteration was terminated when the relative difference ∑ r𝜑 +(=) − 𝜑 +(=(!) r /𝑁 became small enough. Energy-based particle voting algorithm

The particle-voting algorithm was designed to conduct 3D classification, particle quality control, reproducibility test and particle selection in an integrative manner. The particle-voting algorithm mainly involves two steps (Extended Data Fig. 1b). First, we count the number of votes for each particle mapped within the voting boundaries of all 3D clusters on the free-energy landscape. One vote is rigorously mapped to one copy of the particle used in reconstructing a 3D volume and to no more than one 3D cluster on the energy landscape where the corresponding volume is located. Thus, each particle can have M votes casted for no more than M

3D clusters. If the vote is mapped outside of any 3D cluster boundary, it becomes an ‘empty vote’ with no cluster label. Each non-empty vote is thus labeled for both its particle identify and corresponding cluster identify. For each pair of particle and cluster, we compute the total number ( K ) of votes that the cluster receives from the same particle. Each particle is then assigned and classified to the 3D cluster that receives K > M /2 votes from this particle (Extended Data Fig. 1b). Note that after particle voting, each particle is assigned no more than once to a 3D class, with its redundant particle copies removed from this class. This strategy only retains the particles that can reproducibly vote for a 3D cluster corresponding to a homogeneous conformation, while abandoning those non-reproducible particles with divergent, inconsistent votes. Because the particle-voting algorithm imposes strong constraints on the numeric performance of particles in deep learning, it could lead to particle number insufficiency in the cases of smaller but potentially interested 3D classes. To remedy this limitation, an alternative, distance-based classification algorithm was devised to replace the particle-voting algorithm when there are not enough particles to gain the advantage of particle voting (Extended Data Fig. 1c). In this method, the distances of all M copies of each particle to all 3D cluster centers on the energy landscape are measured and ranked. Then, the particle is classified to the 3D cluster of the shortest distance. A threshold could also be manually preset to remove particles that are too far away from any of the cluster centers. The distance-based classification method can keep more particles, but it ignores the potential issue of irreproducibility of low-quality particles. Thus, it is proven to be less accurate in 3D classification (Extended Data Fig. 5d, i, n). In other words, it trades off the classification accuracy and class homogeneity to gain more particles, which is expected to be potentially useful for small datasets or small classes. By contrast, the energy-based particle-voting algorithm imposes a more stringent constraint to select particles of high reproducibility during classification, resulting in higher quality and homogeneity in the classified particles, which is superior to the distance-based classification method (Extended Data Fig. 5e, j, o). Data-processing workflow of AlphaCryo4D Input: single-particle cryo-EM dataset after initial particle rejection of apparent junks. Output: free energy landscape, MEP, 3D class assignment of each particle and high-resolution density maps. 1.

Bootstrap a large number of 3D volumes through particle shuffling, consensus alignment and Bayesian classification. a.

Split the particles dataset randomly to many sub-datasets, if necessary, for batch processing of particle shuffling and volume bootstrapping. b.

For each sub-dataset, conduct consensus alignment to generate initial parameters of Euler angles and translations. c.

Divide each sub-dataset into M + 1 groups, shuffle the sub-dataset M + 1 times and each time take a different group out of the shuffled sub-dataset, giving rise to M + 1 shuffled sub-datasets all with different collection of particles. d. Conduct 3D Bayesian classification on all the M + 1 shuffled sub-datasets to generate hundreds of 3D volumes, making each particle to contribute to M different volumes. e. Repeat steps (b) to (d) for all sub-datasets. 2.

Extract deep features of all volume data with the 3D deep residual autoencoder. a.

Initialize the hyper-parameters of the 3D autoencoder in Table 1. b.

Train this neural network with the 3D volume data to minimize mean square error between the decoding layer and the input by the Adam algorithm of initial learning rate 0.01. c.

Extract the feature maps of all volumes from the encoding layer. 3.

Embed the volume data to low-dimensional manifolds through the t-SNE algorithm and compute the free-energy landscape. a.

Calculate the pairwise similarities between volumes using their feature-map-expanded volume vectors, and randomly initialize the low-dimensional points. b.

Minimize the KL divergence by the Momentum algorithm to generate the low-dimensional embeddings. c.

Compute the free energy landscape with the low-dimensional coordinates. 4.

Apply the string method to find the MEP on the free energy landscape, along which 3D density maps of the same conformation are clustered together. a. Initialize the transition path with a straight line between the given starting and ending points. b.

Update the transition path according to Equations (9) and (10) until it converges to the expected MEP. c.

Sample the clustering centers along the MEP and calculate the recommended clustering radius. 5.

Classify all particles through the energy-based particle-voting algorithm. a.

Define the local energy minima as the 3D clustering centers and their corresponding cluster boundary for particle voting. b.

For each particle, cast a labelled vote for a 3D cluster when a volume containing one of the M particle copies is located in the voting boundary of the 3D cluster. c. Count the number of votes of each particle with respect to each 3D cluster and assign the particle to the 3D cluster that receives more than M /2 votes from this particle. 6. Refine the 3D density maps to high resolution using particles classified into the same 3D classes.

Blind assessments using simulated datasets

Three simulated datasets with the SNRs of 0.05, 0.01 and 0.005 were employed to benchmark AlphaCryo4D and to compare its performance with conventional methods. Particles were computationally simulated by projecting the 20 3D density maps calculated from 20 hypothetical atomic models emulating continuous rotation of the NLRP3 inflammasome protein. The 20 atomic models were interpolated between the inactive NLRP3 structure and its hypothetical active state, which was generated through homology modeling using the activated NLRC4 structure . The 20 atomic models represent sequential intermediate conformations during a continuous rotation in its NATCH domain against its LRR domain over an angular range of 90 ° . Each conformation is thus rotated 4.5 ° over its immediate predecessor in the conformational continuum sequence. 100,000 simulated particle images per conformational state were generated with random defocus values in range of -0.5 to -3.0 μm, resulting in 2 million particles for each dataset of a given SNR. The pixel size of the simulated image was set to the same as the pixel size (0.84 Å) of the real experimental dataset of NLRP3-NEK7 complex. To emulate realistic circumstances in cryo-EM imaging, Gaussian noises, random Euler angles covering half a sphere and random in-plane translational shifts from -5.0 to 5.0 pixels were then added to every particle image. Each of the three simulated heterogeneous NLRP3 datasets of three different SNRs were analyzed separately by AlphaCryo4D and used to characterize the performance and robustness of AlphaCryo4D against the variation of SNR. In the step of particle shuffling and volume bootstrapping, 2,000,000 particles in the dataset of any given SNR were divided randomly into 10 sub-datasets for batch processing. The orientation of each particle was determined in the initial 3D consensus alignment in RELION, which did not change in the subsequent 3D classification. In this step, the maximum number of iterations of the 3D alignment was set up as 30, with the initial reference low-pass filtered to 60 Å. 3-fold particle shuffling (indicated as × 3 below) was conducted on each sub-dataset for volume bootstrapping. The first round of maximum-likelihood 3D classification divided the input shuffled particle sub-dataset into 5 classes, each of these classes were then further classified into 8 classes. This procedure was repeated on all shuffled particle sub-datasets. The particle shuffling and volume bootstrapping generated 1,372, 1,489, and 1,587 volumes by the datasets with SNRs of 0.05, 0.01 and 0.005, respectively. These volume data were used as inputs for deep residual autoencoder to compute low-dimensional manifolds and free-energy landscapes (Fig. 1d-f, Extended Data Fig. 2a, b). After searching the MEP on the energy landscapes by the string method, 20 cluster centers along the MEP were defined by the local energy minimum along the MEP that represent potentially different conformations of the molecule (Extended Data Fig. 2a-c). The particle-voting algorithm was applied in every cluster to determine the final particle sets for all 3D classes. For the purpose of validation of the methodology and investigation of 3D classification improvement, we labeled each bootstrapped 3D volume with the conformational state that held the maximum proportion of particles in the class and computed its 3D classification precision as the ratio of the particle number belonging to the labelled class versus the total particle number in the volume (Fig. 1d, Extended Data Fig. 2a, b). Moreover, 3D classification precision and its statistical distribution among the bootstrapped data calculated from the ground truths were recorded in the intermediate steps of AlphaCryo4D, including the particle shuffling and volume bootstrapping, setup of voting boundary on the free energy landscape, distance-based 3D clustering and energy-based particle voting (Extended Data Fig. 5). To compare the classification performance of AlphaCryo4D with other methods, the ab-initio classification by 3D PCA analysis in CryoSPARC and the maximum-likelihood-based 3D (ML3D) classification in RELION had been tested on the simulated datasets. For the tests using RELION, we classified all particles directly into 20 classes and hierarchically into 4 × 5 classes, which first divided the dataset into 4 classes, with each class further classified into five sub-classes (Extended Data Fig. 3). In the 3D PCA analysis using CryoSPARC, we first did the consensus alignment of the entire dataset to find the orientation of each particle. Then the alignment and the mask generated from the consensus alignment were used as inputs into the 3D PCA calculation, with the number of orthogonal principal modes being set to 2 in 3D classification, in consistent with the deep manifold learning algorithm. Then the 3D variability display module in the cluster mode was used to analyze the result of 3D classification. The metadata of 3D classification precisions as well as the 3D density maps from all the algorithms applied on the three simulated datasets were collected to conduct statistical analysis, as shown in Fig. 2 and Extended Data Figs. 2-5. Computational costs of AlphaCryo4D

Although the computational cost of AlpahCryo4D is generally higher than the conventional approach, it does not appear to increase drastically and likely falls in an affordable range, while reducing the average cost of computation per conformational state. In a nutshell, we can have a brief comparison of the computational efficiency on the simulated dataset with an SNR of 0.01. In the step of 3D data bootstrapping, we split the dataset into 20 subsets, which contained 100,000 particles each. The 3D consensus alignment of all 2,000,000 particles cost about 75 hours using 8 V100 GPUs interconnected with the high-speed NVLink data bridge in a NVIDIA DGX-1 supercomputing system. Within each subset, the 3D Bayesian classification for one leave-one-group-out dataset cost about 2.5 hours using 320 CPUs, so the total time spent in one subset was about 10 hours using 320 CPUs in an Intel processor-based HPC cluster. In addition, it spent about 3 hours to extract feature via deep neural network using 8 V100 GPUs of the NVIDIA DGX-1 system. In contrast to about 160 hours cost in traditional classification methods, this approach cost a little more than 200 hours using 8 V100 GPUs and 320 CPUs. Taken together, these observations suggest that the computational cost and efficiency of AlphaCryo4D are within the acceptable range considering the output of more higher-resolution conformers reconstructed by these procedures.

Applications to experimental cryo-EM datasets Three experimental cryo-EM datasets were processed using AlphaCryo4D to examine its applicability in processing real experimental cryo-EM data: (1) A small dataset of the ATPγS-bound human proteasome contains only 455,680 particles, with the super-resolution pixel size of 0.685 Å. The size of ATPγS-bound proteasome dataset was enhanced to 455,680 × 3 particles and was separated into 4 sub-datasets for volume bootstrapping. 160 volumes were bootstrapped to compute the free-energy landscape the substrate-free ATPγS-bound proteasome (Fig. 3a). (2) The substrate-engaged human proteasome dataset includes 3,254,352 RP-CP particles (mixed with particle images from both the DC and SC proteasomes) in total, with the physical pixel size of 1.37 Å and the box size of 300 pixels. Sample preparation, cryo-EM imaging and data collection condition as well as data pre-processing prior to 3D classification by AlphaCryo4D have been previously described in detail . To proceed with AlphaCryo4D, the whole dataset was randomly split into dozens of sub-datasets, each with 100,000 particles. Particle shuffling and volume bootstrapping procedures were repeated on each sub-dataset, which enhanced the total data size to 3,254,352 × 3 particles, and yielded 1280 volumes in total that were used to compute the free-energy landscape of the substrate-engaged proteasome (Fig. 3b, c). Particle-voting on this energy landscape detected previously missing states (such as E D0.2, E D0.3 , E

D1.1 and E

D1.2 , etc.) and yielded previously determined states with improved map quality such as E A2 and E D2 . Guided by this initial discovery of new conformational states, extensive focused 3D classifications in AlphaCryo4D were performed, as described in detail in the next section, to discover novel ubiquitin-binding sites and hidden dynamics of proteasomal AAA-ATPase motor during single-nucleotide exchange. (3) The dataset of human NLRP3-NEK7 complex includes 622,562 particles . All particles were divided into 6 sub-datasets for particle shuffling, with the total data size enhanced to 622,562 × 3 images for volume bootstrapping. Given the small molecular size of the complex, the super-resolution pixel size of 0.42 Å and the box size of 240 pixels were employed in image analysis. 480 volumes were bootstrapped to compute the free-energy landscape of the NLRP3-NEK7 complex (Extended Data Fig. 4k, l). The NLRP3-NEK7 dataset exhibits certain degree of orientation preference. While AlphaCryo4D was able to retrieve 3 distinct NLRP3 conformations, the resulting map resolution is limited to ~4 Å. Although this may be already an improvement over the previous analysis of this dataset by the conventional approach, the result suggests that AlphaCryo4D does not necessarily overcome the potential issues of the orientation preference. Focused 3D classification by AlphaCryo4D

To apply AlphaCryo4D to process the experimental dataset of the substrate-engaged human 26S proteasome , a focused 3D classification protocol using AlphaCryo4D was implemented to enhance its capability in discovering transient states of extremely low particle populations and in solving highly dynamic components that may be too small to be analyzed by conventional methods . To study the local features of ubiquitin binding and substrate interactions, 3D masks corresponding to the local densities of interest were applied to all bootstrapped volume data, and the subset of interest on the free energy landscape was extracted to conduct a zoomed-in analysis. In this case, the free energy landscape calculated via masked volumes is expected to reflect the local structural variations of interest (Extended Data Fig. 6). To detect novel ubiquitin-binding sites in the proteasome, local masks were applied to focus on the features around the putative locations of ubiquitin chains. After particle shuffling for data augmentation, 732,666 × 3 particle images of the state E A and E B were utilized for computing the zoomed-in free energy landscape with 280 volumes. These particles were first aligned in a consensus alignment without any mask or resolution limit to restrict alignment parameter calculations. On the focused free energy landscape, only the region within the soft mask of the ubiquitin and its binding site was applied in the volume bootstrapping step, with the signals of the CP of the proteasome subtracted from each raw particle when dealing with the state E A and E B . Accordingly, the particle diameter applied in the CP subtracted proteasome was shrunk to 274 Å, instead of 411 Å in the complete 26S proteasome. When the region around RPN2 masked for focused 3D classification to search for potential ubiquitin-like features, the total particle numbers were enhanced to 208,700 × 3 (E A1 ), 240,475 × 3 (E A2 ), 260,021 × 3 (E B ) and 129,492 × 3 (E D2 ). The free energy landscapes were computed with 80, 80, 80 and 40 masked volumes bootstrapped by upper resolution limitations of 15, 15, 15 and 20 Å, respectively. Several 3D classes with the particle numbers of 192,219, 2,539, 3,982 and 609 were obtained by the particle-voting procedure on the zoomed-in free energy landscape of state E A1 , while the case of state E A2 gave four classes containing 147,108, 5,842, 5,500 and 33,083 particles (Extended Data Fig. 6e, f). Likewise, the particle numbers of 3D classes on the zoomed-in free energy landscape of state E B were 173,931, 5,117 and 9,754 (Extended Data Fig. 6g); and the zoomed-in free energy landscape of state E D2 resulted in four classes having 61,580, 6,192, 24,674 and 1,673 particles, respectively (Extended Data Fig. 6h). To detect and improve RPN1-bound ubiquitin densities, a local mask around RPN1 was used, with 15 Å resolution limit applied in the volume-bootstrapping step. The zoomed-in free energy landscape around state E A1 computed from 120 volumes resulted in five classes with the particle numbers of 128,161, 7,754, 6,557, 6,826 and 3,937. Five 3D classes on the zoomed-in free energy landscape of E A2 , also computed with 120 volumes, including 81,612, 27,483, 12,247, 26,029 and 12,777 particles, were analyzed (Extended Data Fig. 6c, d). In the case of the ubiquitin-binding site on the α5 subunit of the CP, five 3D classes with the particle numbers of 136,071, 19,351, 4,100, 131,570 and 4,932 were generated by particle voting on the zoomed-in free energy landscape of E D computed with 120 volumes, which were bootstrapped from 326,409 × 3 particles via the mask containing this ubiquitin-binding site and no high-resolution restriction (Extended Data Fig. 6i). During these analyses, all masked volume-bootstrapping steps were performed for 50 iterations, whereas unmasked 3D classifications were performed for 30 iterations with the initial reference low-pass filtered to 60 Å. At the last step, 3D refinement and reconstruction of each class was done to finalize the high-resolution structure determination. To exploit intermediate conformations during the substrate translocation initiation, a zoomed-in free-energy landscape was computed for the focused 3D classification procedure. First, the free energy landscape of all 3,254,352 × 3 particles was computed and analyzed. Then the zoomed-in free energy landscape of the states E C and E D containing 2,521,686 × 3 particle images were computed from 1,000 volumes that were bootstrapped without masking or resolution restriction (Extended Data Fig. 6b). For further zoomed-in processes, the particles around the translocation initiation states (from E C to E D0 ) on this free energy landscape were extracted by the distance-based 3D classification strategy. Subsequently, these particles were augmented to the number of 410,098 × 3 to generate 160 volumes for the computation of the zoomed-in free energy landscape with the RP mask and no resolution restriction imposed in the volume bootstrapping, which led to the 3D classes E C1 , E C2 , and E D0.1 with the particle number of 68,506, 77,545, and 25,755, respectively (Extended Data Fig. 6j). Moreover, state E

D0.3 came from two clusters with 86,415 and 83,107 particles merged together on the unmasked free energy landscape, and the classes E

D0.2 and E

D1.1 including 105,081 and 51,808 particles, respectively, were also obtained after particle voting on this free energy landscape. In the local areas of the state E D1 on the energy landscape, two clusters with 80,841 and 94,273 particles were merged to reconstruct the high-resolution structure of state E D1.2 , while states E

D1.1 and E

D1.3 with 40,696 and 145,506 particles, respectively, were separated out from the clusters on this unmasked free energy landscape. In these studies, the maximum number of iterations in the 3D classifications of volume bootstrapping was 30, with the initial reference low-pass filtered to 60 Å routinely. Beyond resolving significantly more conformers from the same dataset, AlphaCryo4D also allows us to push the envelope of the achievable resolution of dynamic components and key metastable states due to its advantage in keeping more particles without sacrificing conformational homogeneity. To improve the resolution of state E D2 , a rigorous particle voting procedure was performed in each cluster on the unmasked free energy landscape (Fig. 3b, c), which improved the overall quality of particle dataset while classifying significantly more particles to this E D2 conformational class. Next, we conducted a focused 3D classification by using AlphaCryo4D to calculate the zoomed-in free energy landscape of the entire E D2 dataset to detect any potential conformational changes within this class. Five clusters near the energy wells of state E D2 , whose particle numbers were 154,661, 243,799, 198,221, 179,257 and 106,047, were investigated by separate high-resolution 3D refinement and reconstruction. The resulting refined density maps at 3.0-3.5 Å were found to be all in an identical conformation as previously reported . These five clusters were then merged together to refine a final 3D density map of state E D2 . Thus, AlphaCryo4D allowed us to obtain a new E D2 dataset that appears to retain high conformational homogeneity while including three-fold more particles, which improves its 3D reconstruction from the previously published resolution at 3.2 Å to 2.5 Å, measured by gold-standard Fourier shell correlation (FSC) at 0.143-cutoff. By contrast, in our previous work , we had attempted to include more E D2 -compatible 3D classes from 3D maximum-likelihood classification in RELION, which resulted in lower resolution at 3.3 Å by doubling the E D2 dataset, indicating that including more particles by conventional methods gave rise to higher heterogeneity in the class presumably due to increased misclassification. Analysis of doubly and singly capped proteasomes

Previous cryo-EM analyses of the proteasomal states were nearly all focused on the RP-CP subcomplex, by converting double-cap (DC) proteasome particles to pseudo single-cap (SC) proteasome, which was always mixed with true SC particle images, in order to improve the resolution with conventional methods . As a result, the cryo-EM density of the CP gate in the other side distal to the RP in the RP-CP subcomplex has been an average of closed and open states with unknown stoichiometry. This has made it impossible to appreciate the difference between the SC and DC proteasomes in the previous studies. To remove this ambiguity, we reconstructed 8 density maps of the true SC proteasome and 36 density maps of the true DC proteasome including all possible combination of the 8 major RP-CP states. To reconstruct the density maps of the DC proteasome, pseudo-SC particles corresponding to 8 different RP states were voted out from the free energy landscape, with the sub-states belonging to the same RP state combined together. Then the particles with one RP-CP were extracted from the particle stacks to refine the RP and CP density maps with RP and CP contour masks applied in the local search step in RELION, respectively. For each reconstruction of the DC proteasome, two RP density maps on the opposite sides of the CP were expanded from 600 × 600 × 600 pixels to 800 × 800 × 800 pixels using RELION and aligned by the density of CP in Chimera before being merged together. Counting the particle number of the 36 states of the DC proteasome, we could derive the distribution of the DC states, 𝑝 +, , and that of RP 𝑝 + only, where i and j denote the states of two RPs in the same DC proteasome. Based on the value of 𝑝 +, and total particle number, the experimental state distribution of the DC proteasome was plotted for quantitative analysis in two dimensions with respect to the states of two RPs in the same DC proteasome (Fig. 3e). If the states of two RPs of the same DC proteasome were independent of each other, the predicted state distribution 𝑝s +, of the DC proteasome can be calculated as: 𝑝s +, = 𝑝 + · 𝑝 , By using the total particle number and experimentally measured 𝑝 + , the 2D state distribution of the DC proteasome with two uncoupled RPs was calculated for further comparison to investigate the conformational entanglement effect of the two RP states (Fig. 3f-h). Atomic model building and refinement

To build the initial atomic model of the newly discovered states, we use previously published proteasome structures as a starting model and then manually improved in Coot . For the conformational states at resolutions lower than 5 Å, pseudo-atomic modelling was conducted in the following steps. First, each subunit was fitted as a rigid body, often in UCSF Chimera as well as in Coot . Then, the local structural domains and secondary structure elements were fitted as a rigid body, with linking loops flexibly fitted and modelled. After the mainchain structures were well fitted, no further fitting of the sidechain rotamers were pursued due to insufficient resolution except for energy minimization to remove unrealistic side-chain clashes. For the conformational states at resolutions higher than 5 Å, the mainchain traces were first fit in Coot. Then, extensive fitting, adjustment and optimization of sidechain rotamers were conducted through local real-space refinement and manual rectification in Coot to the degree commensurate to the map resolution and quality. Atomic model refinement was conducted in Phenix with its real-space refinement program. We used both simulated annealing and global minimization with NCS, rotamer and Ramachandran constraints. Partial rebuilding, model correction and density-fitting improvement in Coot were iterated after each round of atomic model refinement in Phenix . The improved atomic models were then refined again in Phenix, followed by rebuilding in Coot . The refinement and rebuilding cycle were often repeated for three rounds or until the model quality reached expectation (Extended Data Table 2). All figures of structures were plotted in Chimera , PyMOL , or ChimeraX . Local resolutions of cryo-EM density maps were evaluated using ResMap or Bsoft Blocres program . Structural alignment and comparison were performed in both PyMOL and Chimera. Electrostatic surfaces were calculated by APBS plugin in PyMOL. Given a large range of variations of local resolution in an expanded number of proteasome states and substates, as well as reconstructions with different stoichiometric ratio of RP versus CP, additional caveats and cautions were practiced in order to avoid misinterpretation and overall-fitting of atomic and pseudo-atomic models. First, when fitting the ubiquitin structure to the low-resolution local density, we consider both the existing homologous structural models as a modelling reference. Specifically, the fitting of diubiquitin to the density on RPN2 in states E A2.1 , E

A2.2 , and E

B.2 , we took the NMR structure of the yeast di-ubiquitin-bound Rpn1 T1 site as a reference, because of the high structural homology between RPN1 and RPN2. For the fitting of RPN10-bound density, we also took into account the electrostatic complementarity (Extended Data Fig. 10). Second, even within an overall high-resolution cryo-EM map, it often presents certain local densities at a lower local resolution due to poor occupancy of ligands or flexibility of less well-folded segments. For example, for the nucleotide in RPT5 in states E

D0.1 to E

D1.3 , the high-resolution atomic model of ADP refined from other states like E C1 were used to fit these lower-local-resolution densities as a rigid body. However, the B-factors of the fitting atoms of ADP in RPT5 were reported to be more than twice higher than those of other nucleotides, indicating its partial occupancy or unstable binding. Thus, ADP fitting in RPT5 should be regarded tentative for the most or for the purpose of evaluation of nucleotide states rather than a reliable high-resolution atomic modeling. Similarly, the atomic models of these locally low-resolution features were only further adjusted with strict stereochemical constraints when the map resolution and features permit. Third, for the CP gate at the other side opposite to the RP in the RP-CP subcomplex reconstructed from a mixture of true and pseudo-SC proteasomes (converted from the DC proteasomes), their local densities in states E D0.1 to E D2 appears to be an average of the open and closed states, and thus are weaker in amplitude than the rest of the CP. In the atomic model building of these reconstructions, we chose to model them with the closed CP gate state as long as their density resolution allows atomic modeling. This issue was completely solved when we separated the 3D classes of the SC and DC proteasomes and reconstructed their states respectively, where the atomic modelling of any CP gate no longer suffers from such ambiguities. Quantum mechanical calculation

First-principles calculations on the electronic structures of the substrate-proteolytic site complex were conducted as described. In brevity, local coordinates of the substrate-bound CP near the catalytic site at residue Thr1 were taken out from the experimentally fitted atomic models of CP and the substrate, in both E A and E D2 conformations. The proteolytic site is the Thr1 residue at the β2 subunit but extended along the chain for another two residues to make the system less finite. Hydrogen atoms are added to complete the residues, and to saturate the system boundaries. The structural relaxations and electronic structure calculation were carried out using the density functional theory (DFT) method with norm-conserving pseudopotentials as implemented in the SIESTA code . To set up the initial conditions of the finite system, the hydrogen atom in the hydroxy group of Thr1 is manually relocated to the nearby nitrogen atom in the same residue, corresponding to the initial proton transfer in the catalytic mechanism. All hydrogen atoms were subsequently optimized without symmetric restrictions using the conjugate gradient algorithm and a 0.04 eV/Å maximum force convergence criterion while keeping the rest of the system fixed. The generalized gradient approximation (GGA) exchange-correlation density functional PBE was employed together with a double-zeta plus polarization basis set, and a mesh cutoff of 200 Ry (corresponds to 0.23 Å smallest grid size). The charge density and charge density difference contour maps were plotted with the Siesta Utility programs denchar and Python, respectively. Data availability Acknowledgments.

The authors thank Q. Ouyang, A. Goldberg, D. Finley, S. Elsasser, M. Kirschner, Y. Lu, and for constructive discussions. This work was funded in part by the Natural Science Foundation of Beijing Municipality (grant No. Z18J008/Z180016), the National Natural Science Foundation of China (grant No. 11774012) and a gift academic grant from Intel Corporation. The cryo-EM data were collected at the Cryo-EM Core Facility Platform and Laboratory of Electron Microscopy at Peking University. The data processing was performed in the High-Performance Computing Platform at Peking University.

Author contributions . Y.M. and Z.W. conceived this study, devised the methodology and wrote the paper. Z.W. developed the deep learning system and conducted numerical studies of the system using the synthetic datasets. Z.W. and S.Z analyzed the experimental cryo-EM datasets and refined the density maps. W.L.W. conducted the quantum mechanical simulation. Y.M. supervised this study, verified the density maps, built and refined the atomic models, interpreted the data and drafted the manuscript with inputs from all authors.

Competing interests . The authors declare no competing financial interests.

Correspondence and requests for materials should be addressed to Y.M. Extended Data Fig. 1. Detailed algorithmic design of particle shuffling and voting in AlphaCryo4D. a , Schematic showing the method of particle shuffling for data augmentation and bootstrapping of 3D volumes. In the step 1, all particles are split randomly into M + 1 groups equally. Then the step 2 carries out the Bayesian clustering for reconstructions of 3D density maps within each of the M + 1 particle sets that are shuffled via the leave-one-group-out approach. After these two steps, thousands of 3D volumes are generated for the subsequent 3D deep learning, with each particle contributing to M volumes. b , Schematic showing the algorithmic concept of energy-based particle voting for 3D classification. The left panel shows the free energy landscape obtained by deep manifold learning. After clustering along the minimum energy path, all M locations of each particle on the free energy landscape can be tracked to cast M votes. A vote of the particle is only counted for the cluster when it is mapped within the voting boundary of the cluster, as indicated by the circles marked on the free-energy landscape. Eventually, this particle is classified to the 3D cluster with over M /2 votes from this particle. c , Alternative distance-based 3D classification method as a control in the analysis of algorithmic performance of particle voting. Instead of particle voting, each particle is directly classified to the cluster with the nearest clustering center among the M volume data points in the distance-based 3D classification strategy. Extended Data Fig. 2. Blind assessments of AlphaCryo4D and its comparison with the 3D PCA method using the simulated heterogeneous NLRP3 datasets of different SNRs. a and b , Reconstruction of the free energy landscape of the simulated NLRP3 datasets at SNRs of 0.05 ( a ) and 0.005 ( b ) by the t-SNE algorithm using the bootstrapped volumes and their corresponding feature maps. Colors in the left panels indicate the ground truth of 3D volume data points. c , Free energy profiles along the MEP calculated by the string method in the 2D energy landscapes of the simulated NLRP3 datasets at SNRs of 0.05 (left), 0.01 (middle) and 0.005 (right). d , Dimensionality reduction and 3D classification of the simulated datasets at three distinct SNRs by the 3D PCA method as the control of our algorithm evaluation. Colors of data points indicate the ground truth of their corresponding 3D volumes. Extended Data Fig. 3. Performance comparison of AlphaCryo4D with existing methods using the simulated heterogeneous NLRP3 datasets of different SNRs. a - l , 3D classification precision of the simulated datasets by AlphaCryo4D ( a - c ), PCA in CryoSPARC ( d - f ) and maximum-likelihood classification in RELION ( g - l ). The results of SNRs of 0.05 ( a, d, g, j ), 0.01 ( b, e, h, k ) and 0.005 ( c, f, i, l ) are shown on three rows for side-by-side comparison. On the top of each panel, the symbols of µ and 𝜎 denote the mean and variance of precision, respectively, with the values of missing classes treated as zeros. In the maximum-likelihood classification of RELION, both direct and hierarchical strategies are compared in the study. m-o , The gold-standard FSC of the 20 maps resulting from 3D classification by AlphaCryo4D ( m ), of the 18 maps resulting from the 3D classification by PCA in CryoSPARC ( n ), and of the 14 maps resulting from the maximum-likelihood 3D classification in RELION ( o ) on the simulated data of 0.01 SNR. They correspond to the precision results presented in panels ( b ), ( e ) and ( h ), respectively. Extended Data Fig. 4. Map assessment of AlphaCryo4D in reconstructions of conformational continuum in comparison with conventional methods on the simulated data of 0.01 SNR. a , The 20 maps of NLRP3 resulting from the 3D classification by AlphaCryo4D. b , The 18 maps of NLRP3 resulting from the 3D classification by PCA in CryoSPARC. c , The

14 maps of NLRP3 resulting from the 3D classification by ML3D in RELION. All maps are shown in transparent surface representations superimposed with their corresponding atomic models of the ground truth in cartoon representations, which are fitted to the maps as rigid bodies without further atomic modelling. The conformer ID numbers are marked on the upper left of each map panel. The results shown in panels ( a ), ( b ) and ( c ) correspond to the FSC results shown in panels ( m ), ( n ) and ( o ) of Extended Data Fig. 3, respectively. Extended Data Fig. 5. Mechanistic characterizations of the improvement of 3D classification precision in the intermediate steps of AlphaCryo4D using the simulated NLRP3 datasets of three typical SNRs. a-e , Precision analysis on the simulated data of SNR of 0.05. f-j , Precision analysis on the simulated data of SNR of 0.01. k-o , Precision analysis on the simulated data of SNR of 0.005. a , f , k , Statistical distribution of 3D classification precision in bootstrapped 3D volumes without particle shuffling. b , g , l , Distribution of 3D classification precision in bootstrapped 3D volumes after 3-fold particle shuffling. c , h , m , Distribution of classification precision in bootstrapped 3D volumes screened by voting boundary on the free energy landscape. d , i , n , Distribution of classification precision in bootstrapped 3D volumes after distance-based 3D clustering in the absence of particle voting as a control. e , j , o , Distribution of classification precision in bootstrapped 3D volumes after energy-based particle voting. Extended Data Fig. 6. The zoomed-in free-energy landscape used for focused 3D classifications of AlphaCryo4D on the experimental cryo-EM datasets for finding new states of the substrate-bound proteasome. a-j,

The zoomed-in local energy landscapes of the substrate-bound human proteasome by AlphaCryo4D are shown for state E A and E B in ( a ), for states E C and E D in ( b ), for states E A1 and E A2 with 3D mask focusing on RPN1 in ( c ) and ( d ), for states E A1 , E A2 , E B and E D2 with 3D masking around RPN2 in ( e ), ( f ), ( g ) and ( h ), respectively, for combined states of E D with masking around CP ( i ), and for combined states of E C ad E D0 with 3D mask focusing on the RP ( j ). k and l , The energy landscape computed from the experimental cryo-EM dataset of the NLRP3-NEK7 complex without ( k ) and with a global contour mask ( l ). Extended Data Fig. 7. Cryo-EM reconstructions and resolution assessment of novel proteasomal states determined by AlphaCryo4D. a , A typical raw micrograph of the substrate-bound human proteasome recorded under the super-resolution counting mode of Gatan K2 Summit direct electron detector. b , The power spectrum of the micrograph shown in panel ( a ), with the 3 Å resolution ring marked by the dashed circle. c , Local resolution measurement of the seven sequential states corresponding to the open CP gate states E D0.1 to E D2 . d , The local resolution measurement of the density maps of ubiquitin-bound RPN2 structure in state E A1 at two different contour levels. e , The local resolution measurement of the CP in state E D.α5 , with a ubiquitin-moiety bound to the α5 subunit. The right panel shows the CP density map at higher contour level. The color bar for local resolutions is shown for all panels in c - e , which were all measured by ResMap . f and g , The gold-standard FSC plots of the 20 conformers of the RP-CP subcomplex. h and i , The FSC curves calculated between the experimental cryo-EM maps and their corresponding atomic models for newly discovered states or improved states without ( h ) and with ( i ) masking in the calculations. Extended Data Fig. 8. Cryo-EM analysis of the doubly capped (DC) and singly capped (SC) proteasomes. a , The CP gate states of the 36 density maps of distinct conformational states of the DC proteasome. All density maps shown are low-pass filtered to their measured resolutions by the gold-standard FSC-0.143 cutoff without amplitude correction of B-factors. b , The FSC-0.143 resolution matrix of the 36 DC proteasomal states. c , The RP-CP interface of state E D0 , showing five C-terminal tails inserted into the inter-subunit surface pockets of α-ring. d , The CP gate states of the 8 density maps of distinct conformational states of the SC proteasome, with the upper right image in each panel corresponding to the RP-proximal side of the CP gate, and the lower left image to the RP-distal side of the CP gate. e , The gold-standard FSC plots of the 36 conformers of the DC proteasome. f , The gold-standard FSC plots of the 8 conformers of the SC proteasome. Extended Data Fig. 9. 44 cryo-EM reconstructions of the SC and DC proteasomes classified from the same experimental dataset by AlphaCryo4D. a , The density maps of 8 distinct conformational states of the SC proteasome. b , The density maps of 36 distinct conformational states of the DC proteasome. All maps are filtered to their respective gold-standard FSC-0.143 resolutions without amplitude correction of B-factors and are differentially colored by their subunits in ChimeraX . Extended Data Fig. 10. Structural analysis of ubiquitin-binding sites at high resolution. a , Cryo-EM density of state E

A2.1 shows the improved resolution localized around RPN11-bound ubiquitin and the RPT5 N-loop as compared to previously determined state E A2 . Ub, ubiquitin. b , High-resolution cryo-EM density of state E D2 superimposed with ubiquitin densities in state E D2.2 highlighted as magenta, showing that a diubiquitin chain in contact with both the RPN2 T1 site and the RPN10 VWA domain. The atomic model of state E

D2.2 in cartoon representation is superimposed with the density map. c , Closeup view of the RPN2-bound diubiquitin density in states E A2.2 (left column) and E

B.2 (right column) in two different perspectives, with the lower row showing the coexistence of the RPN11-bound ubiquitin and the RPN2-bound diubiquitin. d , The refined 3.7-Å RPN1 density map from state E A1.1 in transparent surface representation superimposed with its atomic model with ubiquitin bound to the RPN1 T2 site. e , Closeup view of the α5-subunit bound with ubiquitin in the refined 3.1-Å density map of state E D.α5 . The cryo-EM density of the α5-subunit is shown in transparent surface representation superimposed with its atomic model. Inset shows the ubiquitin density in transparent surface representation superimposed with its atomic model. f , Closeup views of the RPN10-bound ubiquitin densities in states E A2.3 , E

B.3 , and E

D2.3 . Right panel shows the pseudo-atomic model of ubiquitin interaction with the RPN10 VWA domain. g-l , The electrostatic surfaces of RPN1, RPN2, RPN10 and α5-subunit show the charge complementarity of the RPN1 T1 and T2 (panels g , h ), RPN2 T1 (panels i , j ) and α5-binding (panel k ) sites of ubiquitin are all acidic. By contrast, the RPN10 VWA site is basic (panel l ), suggesting that it may bind the acidic side of the ubiquitin molecule. m and n , The electrostatic surface of the ubiquitin shows one side that is basic (panel m ) and the other side acidic (panel n ). Extended Data Fig. 11. Cryo-EM structures of six sequential intermediate states between E C2 and E D1.3 . a , Cryo-EM density maps of the AAA-ATPase motor in six sequential states from E

D0.1 to E

D1.3 from three different viewing angles. The substrate densities are highlighted in yellow. b , Superposition of the seven structures in cartoon representations from a lateral perspective, with all structures aligned together based on their CP structures, showing the RP movements relative to the CP. c , Superposition of the six structures in cartoon representations from a top-view perspective, with all structures aligned together based on their CP structures. d , Measurement of substrate movement between states E D0.1 and E

D1.3 relative to the CP, suggesting that the substrate is translated ~5-6 Å during the process of ADP release in RPT5 and of RPT1 re-association with the substrate. e , The closeup view of the pore-1 loop interaction with the substrate with all six states superimposed after aligned together based on the structures of RPT3, RPT4 and RPT6. f - i , Superposition of the six sequential states with their RPT3, RPT4 and RPT6 aligned together. Red arrows show the direction of subunit movements in RPT1 and RPT5. j , A diagram illustrating the axial stepping of the substrate-interacting pore-1 loops that is coupled with ATP hydrolysis in the RPT subunits, revised from the previously published results based on the present study . k , The average B-factor of the nucleotides in six ATPase subunits fitted to the cryo-EM densities of states E D0.1 to E

D1.3 and computed by Phenix in real-space refinement procedure . The high B-factor of ADP fitted in RPT5 indicates its low occupancy or unstable association and suggests that the ADP in RPT5 undergoes dissociation in this process. Extended Data Fig. 12. Detailed structural features in substrate interactions in the CP of different gate states. a , The atomic model of substrate-bound catalytic active site of the β2- subunit superimposed with the cryo-EM density map of state E A at 2.7 Å. b , The atomic model of substrate-bound catalytic active site of the β2-subunit superimposed with the cryo-EM density map of state E D2 at 2.5 Å. c , Closeup view of the cryo-EM density of the β2-subunit in state E D2 shown in blue mesh representation superimposed with the atomic model. d , Cryo-EM densities of the typical secondary structure elements in the β2-subunit of state E D2 shown as blue mesh representation superimposed with their corresponding atomic models. The residues are labelled. The densities are shown at 4 s level and exhibit structural features consistent with 2.5-Å resolution. e and f,

3D Charge density difference map showing the interactions between the substrate polypeptide and the residue Thr1 in the β2-subunit of the CP. The red color labels an iso-surface of charge increase while the blue color labels charge decrease, both at a level of 0.04 e/Å . The charge increase region at the Thr1-Oγ atom is elongated sideways, and the substrate position in the E D2 state provides both a proximity and orientation alignment for the pair to interact. The dashed line triangle shows the plane in which the detailed charge difference contours are plotted as shown in Fig. 6c, d. Extended Data Table 1. Hyperparameters of the deep residual networks in the 3D autoencoder.

Optimizer: Adam. Epochs: 50. Initial learning rate: 0.01. A factor of 0.1 and patience of 3 means that the learning rate will times 0.1 (factor) if the loss function does not improve in 3 (patience) epochs. Batch size: 32 (for the three simulated datasets), 8 (for the substrate-bound proteasome dataset).

Layer name Output size (simulation) Output size (proteasome) Network structure Conv1

200 × 200 × 200

300 × 300 × 300

Conv2_x

100 × 100 × 100

150 × 150 × 150 stride Conv3_x

Conv4_x

50 × 50 × 50

75 × 75 × 75 stride Conv5_x

Conv6_x (encoding)

TransConv7_x

100 × 100 × 100

150 × 150 × 150 stride TransConv8_x

TransConv9_x

200 × 200 × 200

300 × 300 × 300 stride TransConv10_x

TransConv11 Extended Data Table 2. Cryo-EM data collection, refinement and validation statistics E A E A1.1 E A1.2 E A2.1 E A2.2 E A2.3 E B.1 E B.2

Data collection and processing

Magnification 105,000 105,000 105,000 105,000 105,000 105,000 105,000 105,000 Voltage (kV) 300 300 300 300 300 300 300 300 Electron exposure (e–/Å ) 44 44 44 44 44 44 44 44 Defocus range (μm) -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 Pixel size (Å) 0.685 0.685 0.685 0.685 0.685 0.685 0.685 0.685 Symmetry imposed C1 C1 C1 C1 C1 C1 C1 C1 Initial particle images (no.) 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 Final particle images (no.) 339,327 192,219 2,539 147,108 5,500 5,842 173,931 9,754 Map resolution (Å) FSC threshold 2.7 0.143 2.9 0.143 7.5 0.143 3.2 0.143 6.0 0.143 4.8 0.143 3.1 0.143 4.7 0.143 Map resolution range (Å) 2.5-8.0 2.5-8.0 7.0-20.0 2.5-8.0 5.5-20.0 3.0-8.0 2.5-8.0 4.5-20.0 Refinement

Initial model used 6MSB 6MSB 6MSD 6MSE Model resolution (Å) FSC threshold 3.2 0.5 3.4 0.5 3.5 0.5 3.4 0.5 Model resolution range (Å) 2.5-8.0 2.7-8.0 3.0-8.0 2.9-8.0 Map sharpening B factor (Å ) -25 -28 -30 -30 Model composition

Non-hydrogen atoms Protein residues Ligands 63654 8203 11 104938 13391 12 105088 13543 12 105426 13418 12 106630 13570 12 106028 13494 12 105147 13400 9 106351 13552 9 B factors (Å ) Protein Ligand 131.13 125.87 145.9 110.13 145.9 110.13 151.26 108.00 182.77 123.85 R.m.s. deviations

Bond lengths (Å) Bond angles (°) 0.008 0.908 0.007 1.013 0.007 1.013 0.006 1.005 0.006 1.005 0.006 1.005 0.006 1.017 0.006 1.017

Validation

MolProbity score Clashscore Poor rotamers (%) 1.49 2.72 0.47 1.76 4.86 0.54 1.76 4.86 0.54 1.76 4.91 0.5 1.76 4.91 0.5 1.76 4.91 0.5 1.83 5.57 0.46 1.83 5.57 0.46

Ramachandran plot

Favored (%) Allowed (%) Disallowed (%) 93.3 6.52 0.18 91.34 8.36 0.3 91.34 8.36 0.3 91.42 8.33 0.25 91.42 8.33 0.25 91.42 8.33 0.25 90.89 8.77 0.34 90.89 8.77 0.34 Cryo-EM data collection, refinement and validation statistics (continued) E B.3 E C1 E C2 E D0.1 E D0.2 E D0.3 E D1.1 E D1.2

Data collection and processing

Magnification 105,000 105,000 105,000 105,000 105,000 105,000 105,000 105,000 Voltage (kV) 300 300 300 300 300 300 300 300 Electron exposure (e–/Å ) 44 44 44 44 44 44 44 44 Defocus range (μm) -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 Pixel size (Å) 0.685 0.685 0.685 0.685 0.685 0.685 0.685 0.685 Symmetry imposed C1 C1 C1 C1 C1 C1 C1 C1 Initial particle images (no.) 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 Final particle images (no.) 5,117 68,506 77,545 25,755 105,081 169,522 92,777 174,114 Map resolution (Å) FSC threshold 6.1 0.143 3.4 0.143 3.3 0.143 3.9 0.143 3.3 0.143 3.2 0.143 3.3 0.143 3.2 0.143 Map resolution range (Å) 5.5-20.0 2.5-8.0 2.5-8.0 3.0-8.0 2.5-8.0 2.5-8.0 2.5-8.0 2.5-8.0 Refinement

Initial model used 6MSG 6MSH 6MSH 6MSH 6MSH 6MSJ 6MSJ Model resolution (Å) FSC threshold 3.4 0.5 3.8 0.5 4.3 0.5 4.1 0.5 4.0 0.5 3.9 0.5 3.8 0.5 Model resolution range (Å) 3.2-8.0 3.1-8.0 3.7-8.0 3.1-8.0 3.0-8.0 3.1-8.0 3.0-8.0 Map sharpening B factor (Å ) -35 -35 -40 -35 -30 -35 -30 Model composition

Non-hydrogen atoms Protein residues Ligands 105749 13476 9 104415 13314 9 103729 13236 5 105343 13488 11 105343 13488 11 105343 13488 11 105343 13488 11 105343 13488 11 B factors (Å ) Protein Ligand 174.88 143.96 178.15 106.39 167.88 268.45 213.61 208.40 211.63 183.93 148.64 154.06 165.65 175.01 R.m.s. deviations

Bond lengths (Å) Bond angles (°) 0.006 1.017 0.005 0.888 0.007 0.980 0.006 1.079 0.008 1.115 0.006 1.027 0.008 0.987 0.005 0.893

Validation

MolProbity score Clashscore Poor rotamers (%) 1.83 5.57 0.46 1.65 4.17 0.4 1.73 4.39 0.62 2.44 18.24 0.36 2.47 18.78 1.47 2.28 16.38 0.96 2.45 18.97 1.37 2.45 17.86 1.68

Ramachandran plot

Favored (%) Allowed (%) Disallowed (%) 90.89 8.77 0.34 91.88 7.91 0.21 91.97 7.82 0.21 89.47 10.17 0.36 89.31 10.32 0.37 89.58 10.1 0.32 89.09 10.78 0.13 90.79 8.99 0.22 Cryo-EM data collection, refinement and validation statistics (continued) E D1.3 E D2 E D2.2 E D2.3 E A1.RPN1 E D.α5

Data collection and processing

Magnification 105,000 105,000 105,000 105,000 105,000 105,000 Voltage (kV) 300 300 300 300 300 300 Electron exposure (e–/Å ) 44 44 44 44 44 44 Defocus range (μm) -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 -0.6 to -3.5 Pixel size (Å) 0.685 0.685 0.685 0.685 1.37 0.685 Symmetry imposed C1 C1 C1 C1 C1 C1 Initial particle images (no.) 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 3,254,352 Final particle images (no.) 146,506 856,683 24,674 6,192 128,161 131,570 Map resolution (Å) FSC threshold 3.2 0.143 2.5 0.143 3.7 0.143 7.0 0.143 3.7 0.143 3.1 0.143 Map resolution range (Å) 2.5-8.0 2.5-4.8 3.0-8.0 6.0-20.0 3.0-8.0 2.5-10.0 Refinement

Initial model used 6MSJ 6MSK 6MSK 6MSK 6MSK Model resolution (Å) FSC threshold 3.9 0.5 3.2 0.5 3.5 0.5 3.8 0.5 3.6 0.5 Model resolution range (Å) 3.0-8.0 2.5-5.8 3.5-8.0 3.5-8.0 2.9-5.8 Map sharpening B factor (Å ) -30 -20 -40 -100 -25 Model composition

Non-hydrogen atoms Protein residues Ligands 63654 8203 11 105218 13420 11 107626 13724 11 106422 13572 11 7717 994 107626 13724 11 B factors (Å ) Protein Ligand 178.66 187.12 102.99 116.74 182.3 162.17 141.87 162.57 135.00 R.m.s. deviations

Bond lengths (Å) Bond angles (°) 0.005 0.913 0.004 0.810 0.005 1.033 0.005 1.033 0.005 0.837 0.007 1.088

Validation

MolProbity score Clashscore Poor rotamers (%) 2.32 17.69 0.47 1.84 5.48 0.69 1.84 5.48 0.69 1.84 5.48 0.69 3.41 24.57 12.01 1.88 6.09 0.73

Ramachandran plot