Imagine having access to accurate drug candidate properties instantly without always needing to perform assay testing. Because of advances in the field of artificial intelligence and in-silico molecular testing, you no longer have to imagine. Using only the structure of a molecule, it is now possible to generate high-accuracy and near-instantaneous property predictions for a wide variety of small molecule drug candidates.
At Cognistx, we specialize in building advanced applied machine learning products that provide business value in real-world scenarios. Given the tremendous potential for artificial intelligence in drug discovery, we have developed MoleculeAI - a centralized platform for small molecule lead generation and optimization with AI tools for molecular property prediction, compound clustering, and automatic compound generation.
In this article, we will first introduce the drug development pipeline and identify the primary bottlenecks that can be addressed using AI. Next, we will consider a case study in which the MoleculeAI platform was used with Enveric to predict the binding affinity of small molecule drug candidates to serotonin receptors. Finally, we will highlight the business return-on-investment opportunities created by using MoleculeAI.
Before a drug is available on the market, it must first pass through several key stages of pre-clinical drug development, including target identification, compound screening, lead identification, and lead optimization. Overall, the drug development process can take 12-15 years and cost over $1 billion [1]. AI can help you save valuable time and help reduce the risk for lead candidates, as illustrated in Figure 1.
Figure 1: How in-silico testing reduces the number of compounds to evaluate and shortens the time it takes to develop a new drug.
Lead optimization is the primary bottleneck in the drug discovery phase before preclinical trials can begin. It requires scientists to rank their candidates by how they are predicted to perform in preclinical trials, a judgment that draws on cheminformatics and biomolecular expertise. Much of cheminformatics and biology has now been translated into computational form, and pharmacological and biomolecular datasets are widely available, so inferences that scientists have made by hand for years can be heavily streamlined through computational calculations and predictive tasks using AI. Scientists can now estimate the potential success of their candidates in this manner, an approach we call “in-silico testing”.
Traditional psychedelics such as LSD and psilocybin have been shown to have high binding affinity to the 5-HT receptor. With this in mind, Enveric was interested in investigating how well the binding affinity of a given compound to the 5-HT receptor could be predicted using available datasets and machine learning methods.
Our work was primarily focused on a subset of the BindingDB database (1-3), an online public data source containing information curated from patents, journals, and binding measurements made by scientists. Specifically, we extracted a subset of the BindingDB database containing binding affinity measurements for small molecule drug candidates against various serotonin receptors.
To perform model training, we used Cognistx’s AutoMol high-performance training pipeline. AutoMol is a centralized resource for fully automated model training, model architecture and hyperparameter tuning, ensembling, and deployment for molecular property prediction. Machine learning in industry settings is a slow and expensive process. Data scientists often spend much of their time running different experiments and analyzing results before passing their models on to software engineers, who create the infrastructure for model deployment. Using the AutoMol training pipeline, we can focus more of our time on working directly with the client to ensure that model predictions align with their goals.
Once training is complete, AutoMol automatically deploys the highest-performing models to our MoleculeAI dashboard. This allows clients to view model predictions on their current molecular library and generate accurate predictions for new compounds in real time. MoleculeAI also gives clients access to historical modeling predictions so they can easily measure progress over time and compare results.
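To make the data-preparation step more concrete, here is a minimal Python sketch of pulling a serotonin receptor subset out of a BindingDB-style tab-separated export and converting Ki values to pKi. It is not the code used in this engagement. The column names, the target-name substring, the placeholder SMILES strings, and every numeric value in the sample are assumptions invented for illustration.

```python
import csv
import io
import math

# Toy stand-in for a BindingDB TSV export. Column names and all values are
# illustrative assumptions, not real measurements.
SAMPLE_TSV = (
    "Ligand SMILES\tTarget Name\tKi (nM)\n"
    "CCO\t5-hydroxytryptamine receptor 2A\t12.5\n"
    "CCN\tDopamine D2 receptor\t40\n"
    "CCC\t5-hydroxytryptamine receptor 1A\t>10000\n"
    "CCCl\t5-hydroxytryptamine receptor 2A\t\n"
)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert Ki in nanomolar to pKi = -log10(Ki in molar)."""
    return 9.0 - math.log10(ki_nm)


def load_serotonin_subset(tsv_text: str) -> list[dict]:
    """Keep rows for serotonin (5-HT) receptors that have an exact Ki value."""
    rows = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Assumed naming convention for serotonin receptor targets.
        if "5-hydroxytryptamine receptor" not in row["Target Name"]:
            continue
        raw = row["Ki (nM)"].strip()
        # Skip missing values and censored ones such as ">10000".
        if not raw or raw[0] in "<>":
            continue
        rows.append({
            "smiles": row["Ligand SMILES"],
            "target": row["Target Name"],
            "pKi": ki_nm_to_pki(float(raw)),
        })
    return rows


subset = load_serotonin_subset(SAMPLE_TSV)
```

Working on the log scale (pKi) rather than raw nanomolar values is a common choice for regression targets because affinities span several orders of magnitude. Dropping censored measurements is only one possible policy; a real pipeline might treat them differently.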
Figure 2: AutoMol’s structured AI approach to developing custom models to meet customer needs.
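The article does not describe how AutoMol selects and ensembles models internally, so the following Python sketch shows only the general idea under simple assumptions: score each candidate model by RMSE on a validation set, keep the best few, and average their predictions. The function names, model names, and all numbers are hypothetical.

```python
import math


def rmse(predicted: list[float], actual: list[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def select_and_ensemble(val_preds: dict[str, list[float]],
                        val_true: list[float],
                        top_k: int = 2) -> tuple[list[str], list[float]]:
    """Pick the top_k models by validation RMSE and average their predictions."""
    ranked = sorted(val_preds, key=lambda name: rmse(val_preds[name], val_true))
    chosen = ranked[:top_k]
    n = len(val_true)
    ensemble = [sum(val_preds[m][i] for m in chosen) / len(chosen)
                for i in range(n)]
    return chosen, ensemble


# Hypothetical validation-set pKi values and model outputs (invented numbers).
val_true = [6.0, 7.0, 8.0]
val_preds = {
    "model_a": [6.5, 7.5, 8.5],  # consistently 0.5 too high
    "model_b": [5.5, 6.5, 7.5],  # consistently 0.5 too low
    "model_c": [8.0, 5.0, 6.0],  # poor fit
}

chosen, ensemble = select_and_ensemble(val_preds, val_true, top_k=2)
```

In this toy case the two selected models err in opposite directions, so their average matches the validation values exactly. Real ensembles rarely cancel errors this cleanly, but the same mechanism is why averaging diverse models often helps.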
Enveric’s goal for this engagement was to quickly produce accurate predictions for small molecule drug candidates against serotonin receptors of interest.
From a business standpoint, automatic prediction of serotonin receptor binding affinity, given the appropriate dataset configuration for each model, allows Enveric scientists to rank new candidates almost instantaneously. Manual review, by contrast, requires tediously comparing each candidate against compounds whose binding affinity to the serotonin receptor has already been validated.
The time saved on compound validation, plus the cost saved by not running assays on compounds predicted to have low binding affinity, reinforces the business value of these automatic model-based metrics. Beyond the time and cost saved in this process, the advanced analytic capabilities of MoleculeAI give the user access to significantly more data than before, supporting more informed decisions and reducing the risk of moving certain compounds forward.
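The ranking and triage logic described above can be sketched in a few lines of Python. This is an illustration rather than MoleculeAI's implementation: the `Candidate` class, the compound names, the predicted pKi values, the cutoff, and the per-assay cost are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_pki: float  # model output; higher means tighter predicted binding


def triage(candidates: list[Candidate], min_pki: float):
    """Rank by predicted pKi (descending) and split at a cutoff."""
    ranked = sorted(candidates, key=lambda c: c.predicted_pki, reverse=True)
    to_assay = [c for c in ranked if c.predicted_pki >= min_pki]
    deprioritized = [c for c in ranked if c.predicted_pki < min_pki]
    return to_assay, deprioritized


# Hypothetical library with invented predictions.
library = [
    Candidate("cmpd-001", 5.2),
    Candidate("cmpd-002", 8.1),
    Candidate("cmpd-003", 6.9),
    Candidate("cmpd-004", 4.3),
]

COST_PER_ASSAY = 500.0  # placeholder figure, not from the article

to_assay, deprioritized = triage(library, min_pki=6.0)
assay_cost_avoided = COST_PER_ASSAY * len(deprioritized)
```

The cutoff is a business decision as much as a scientific one: a lower value sends more compounds to the bench and reduces the chance of discarding a true binder, while a higher value saves more assay cost.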
Over time, as more assay data is collected, the training set for each predictive model grows, which yields predictions that are better tuned to the client’s proprietary data. In the future, our models may include features from docking utilities such as AutoDock Vina, which characterize the binding profile between the serotonin receptor protein itself and a given ligand.
AutoMol is highly scalable and customizable from a featurization and modeling architecture perspective, meaning that the complexity of predictions will only increase as the knowledge space of computational chemistry and biology continues to expand. Because this process is much faster than (but retains the same performance as) traditional machine learning approaches, we can spend more time focusing on data analysis and exploring new, creative techniques to incorporate AI models into your workflow.
For more information on MoleculeAI, please contact Jagriti@Cognistx.com
References