ARC23 Speaker Olexandr Isayev



Active learning guided drug design lead optimization based on relative binding free energy modeling

Olexandr Isayev, assistant professor, CMU Department of Chemistry

Abstract
In silico identification of potent protein inhibitors commonly requires prediction of a ligand binding free energy (BFE). Thermodynamic integration (TI) based on molecular dynamics (MD) simulations is a BFE calculation method capable of yielding accurate BFEs, but it is computationally expensive and time-consuming. In this work, we have developed an efficient automated workflow for identifying compounds with the lowest BFE among thousands of congeneric ligands, which requires only hundreds of TI calculations. Automated machine learning (AutoML) orchestrated by active learning (AL) in an AL–AutoML workflow allows unbiased and efficient search for a small set of best-performing molecules. We have applied this workflow to select inhibitors of the SARS-CoV-2 papain-like protease and were able to find 133 compounds with improved binding affinity, including 16 compounds with better than 100-fold binding affinity improvement. We obtained a hit rate that outperforms that expected from traditional expert medicinal chemist-guided campaigns. Thus, we demonstrate that the combination of AL and AutoML with free energy simulations provides at least 20× speedup relative to naïve brute-force approaches.
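The general shape of such an active learning loop can be illustrated with a minimal sketch. It is not the workflow from the talk: a ridge regression stands in for the AutoML surrogate, a synthetic linear function (`expensive_oracle`, hypothetical) stands in for a TI calculation, and a simple greedy rule (evaluate the candidates with the lowest predicted energy) is assumed as the selection strategy. The library size, batch size, and number of rounds are illustrative values only.

```python
import numpy as np

def expensive_oracle(x, w_true):
    # Hypothetical stand-in for one TI calculation: a synthetic "binding free energy".
    return float(x @ w_true)

def fit_ridge(X, y, alpha=1.0):
    # Closed-form ridge regression; a simple placeholder for an AutoML-selected model.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def active_learning(features, oracle, n_init=20, batch=20, rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(features)
    # Start from a random subset of the library.
    labeled = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    energies = {i: oracle(features[i]) for i in labeled}
    for _ in range(rounds):
        idx = np.array(labeled)
        y = np.array([energies[i] for i in labeled])
        w = fit_ridge(features[idx], y)
        pred = features @ w
        pred[idx] = np.inf  # never re-select compounds that were already evaluated
        # Assumed greedy acquisition: take the lowest predicted energies.
        for i in np.argsort(pred)[:batch]:
            i = int(i)
            energies[i] = oracle(features[i])
            labeled.append(i)
    return energies

# Synthetic library: 2000 candidates described by 8 made-up descriptors.
rng = np.random.default_rng(1)
features = rng.normal(size=(2000, 8))
w_true = rng.normal(size=8)
energies = active_learning(features, lambda x: expensive_oracle(x, w_true))
best = min(energies, key=energies.get)
print(f"evaluated {len(energies)} of {len(features)} candidates; best index {best}")
```

In this toy setting only the compounds chosen by the loop are passed to the expensive function, which is the source of the savings over evaluating the whole library.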


Biography
Olexandr Isayev is an assistant professor in the Department of Chemistry at Carnegie Mellon University. His research focuses on theoretical and computational chemistry, machine learning, cheminformatics, drug discovery, computer-aided molecular design, and materials informatics. He directs the Isayev lab, which works at the interface of theoretical chemistry, pharmaceutical sciences, and computer science. In particular, the lab uses molecular simulations and artificial intelligence to solve difficult problems in chemistry. Isayev's lab is working towards the acceleration of molecular discovery by combining AI, informatics, and high-throughput quantum chemistry. The lab also develops both generative and predictive ML models for chemical and biological data.