Adaptive Ensemble Refinement of Protein Structures in High Resolution Electron Microscopy Density Maps with Radical Augmented Molecular Dynamics Flexible Fitting

Daipayan Sarkar, Hyungro Lee, John Vant, Matteo Turilli, Josh Vermaas, Shantenu Jha, Abhishek Singharoy

Abstract. Recent advances in cryo-electron microscopy (cryo-EM) have enabled modeling macromolecular complexes that are essential components of the cellular machinery. The density maps derived from cryo-EM experiments are often integrated with manual, knowledge-driven, and physics-guided computational methods to build, fit, and refine molecular structures. Going beyond a single stationary-structure determination scheme, it is becoming more common to interpret the experimental data with an ensemble of models, which contributes to an average observation. Hence, there is a need to decide on the quality of an ensemble of protein structures on-the-fly, while refining them against the density maps. We introduce such an adaptive decision making scheme during the molecular dynamics flexible fitting (MDFF) of biomolecules. Using RADICAL-Cybertools, and the new RADICAL augmented MDFF implementation (R-MDFF) is examined in high-performance computing environments for refinement of two protein systems, Adenylate Kinase and Carbon Monoxide Dehydrogenase. The use of multiple replicas in flexible fitting with adaptive decision making in R-MDFF improves the overall quality of the fit and model by 40% relative to the refinements of the brute-force MDFF. The improvements are particularly significant at high, 2 - 3 Å map resolutions. More importantly, the ensemble model captures key features of biologically relevant molecular dynamics that is inaccessible to a single-model interpretation. Finally, this pipeline is applicable to systems of growing sizes, with the overhead for decision making remaining low and robust to computing environments.