AI/ML tools, startups and companies transforming drug screening
Using AI to improve efficiency and effectiveness during drug screening
Drug discovery starts with screening large libraries of small molecules to identify compounds (called hits) with activity against a biological target (common classes of biological targets include proteins and nucleic acids). This often means searching a vast chemical space estimated to comprise more than 10^60 molecules, each waiting to be investigated by scientists (for comparison, there are something like 10^22 to 10^24 stars in our known Universe…).
Accordingly, screening is a compromise: we can only focus on the known chemical space, that is, chemicals found in smaller chemical libraries, public databases (for example: PubChem, ChemSpider, ZINC, NCI, ChemDB, BindingDB, ChemBank, ChEMBL, CTD, HMDB, SMPDB, DrugBank) and corporate collections, which together contain something like 10^8 molecules (100 million).
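To put these orders of magnitude side by side, here is a minimal Python sketch. It uses only the estimates quoted above (10^60, 10^8 and the upper star count of 10^24); the variable names and the `power_of_ten` helper are illustrative assumptions, not part of any tool mentioned in this newsletter.

```python
from fractions import Fraction

# Order-of-magnitude estimates quoted in the text above.
CHEMICAL_SPACE = 10**60   # estimated size of chemical space
KNOWN_MOLECULES = 10**8   # molecules in libraries, databases and corporate collections
STARS_UPPER = 10**24      # upper estimate of stars in the known Universe


def power_of_ten(n):
    """Return the exponent k for an exact power of ten n == 10**k."""
    k = len(str(n)) - 1
    if n != 10**k:
        raise ValueError("not an exact power of ten")
    return k


# Exact fraction of chemical space that known collections cover.
coverage = Fraction(KNOWN_MOLECULES, CHEMICAL_SPACE)

# Molecules in chemical space per star, using the upper star estimate.
molecules_per_star = CHEMICAL_SPACE // STARS_UPPER

print(f"Known collections cover 1 in 10^{power_of_ten(coverage.denominator)} molecules")
print(f"Molecules per star: 10^{power_of_ten(molecules_per_star)}")
```

In other words, under these estimates the known chemical space is a vanishingly small sample of what could exist, which is why smarter ways of choosing what to screen matter so much.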
In modern drug discovery, large libraries can be screened experimentally, with in vitro high-throughput screening (HTS) of chemical libraries, or in silico, with methods that virtually screen compounds to identify novel drug candidates. So, today’s newsletter is an overview of the different AI/ML tools, startups and companies transforming drug screening during hit identification (identifying compounds, or hits, that have the potential to bind to a specific target protein or nucleic acid and modulate its activity in a desired way).
➡️ High-Throughput Screening (testing large libraries of chemicals on proteins, cells or animal embryos)
Synsight is a deep tech company developing a screening technology for effective first-in-class drug candidates (for RNA targeting). It is based on an AI discovery platform and cell imaging, with phenotypic assays, high-content screening and a high-content imager that can acquire more than 60,000 images per day. In particular, Synsight developed the Microtubule Bench technology (MT bench®), an industrialized cell-based test that screens molecules by microscopy to identify and quantify how small molecules modulate protein-protein interactions or interactions between proteins and nucleic acids. In 2022, Synsight secured a 1.5M funding round from a Chinese investor.
Ailynix is an AI drug design and discovery company that predicts the biological activity of chemical compounds. For example, Zunomics, a subsidiary of Ailynix, has developed the Computational Antiviral Drug Discovery (CAViDD) Platform to discover novel oral antiviral drugs by mining unexplored chemical space to deliver innovative medicines.
Pangea Bio utilizes AI to uncover promising molecules from nature’s diverse chemical space, enhanced by traditional human evidence of safety and efficacy. In particular, the PangeAI discovery platform (Knowledge Graph, Computational Metabolomics, Compound Activity Profiling) accelerates the discovery and development of novel therapeutics from plants and other kingdoms of life, translating nature’s metabolome into medicine for neurological and neuropsychiatric diseases.
Vevo Therapeutics, a biotechnology company using its Mosaic in vivo drug discovery platform and AI models to uncover better drugs, was launched last year with an oversubscribed and upsized $12M seed financing round. The Mosaic platform is the first platform to make in vivo data generation scalable, with single-cell precision, while capturing patient diversity in drug response. In a single in vivo experiment, Mosaic can measure how a drug impacts cells from tens to hundreds of patients, generating millions of datapoints on drug-induced changes in gene expression.
Another AI drug discovery company utilizing data at single-cell resolution to identify cell-state transitions that drive disease is Cellarity, a life sciences company founded by Flagship Pioneering. Until now, the dominant approach in drug discovery has been to reduce disease biology to a single molecular target and then leverage high-throughput screening to identify molecules that bind to that target. Cellarity instead focuses on the whole cell because, most often, a disease isn’t driven by one mechanism or protein. Accordingly, they use single-cell technologies to identify the cellular drivers of the transition from health to disease and then apply DL models to create drugs that reverse disease at the cellular level. On July 27, 2023, Cellarity announced that The Galien Foundation, the premier global institution dedicated to honoring innovators in life sciences, named Cellarity a 2023 Prix Galien USA Award nominee for "Best Startup". And in October 2023, Cellarity announced a partnership with the Chan Zuckerberg Initiative to drive innovation in ML algorithms for single-cell analysis via support of the Open Problems in Single-Cell Analysis initiative.
Quote:
“There are more than 1,500 algorithms developed for single-cell data, and understanding the deep complexity of cells captured by single-cell technologies requires robustly evaluating the performance of these methods.” Diogo Camacho, Ph.D., Vice President of Computational Biology at Cellarity
Quris has an AI Chip-on-Chip platform that allows automated testing of thousands of drugs on miniaturized Patients-on-a-Chip, while next-generation nano-sensors allow for continuous monitoring of the responses from each miniaturized organ to these drugs. Then, their ML classification algorithm is trained with the data continuously generated in this high-throughput system. On September 28, 2023, Quris announced the extension of its collaboration with Merck to leverage Quris-AI platform’s ability to effectively identify liver toxicity risks in a selection of drug candidates.
Quote:
“Based on the results of our initial collaboration, we are looking forward to exploring how its BioAI platform can advance our drug development and testing programs, and working towards an AI-enabled IND process that reduces the reliance on animal testing,” said Danny Bar-Zohar, Global Head of Research & Development at Merck
The Quris BioAI platform (29 granted and pending patents) delivers best-in-class drug safety through its advanced ML and generative AI models, which are trained on Quris-AI’s highly predictive, proprietary data generated from its AI-based patient-on-chip platform. This dramatically accelerates drug development, cuts its costs, and avoids the potentially disastrous pitfalls of traditional animal testing.
A novel deep-learning algorithm called CeCILE (Cell Classification and In-vitro Lifecycle Evaluation) is used to detect and analyze cells in videos obtained from phase-contrast microscopy, up to a sample size of 100 cells per image frame, in order to gain information about cell numbers, cell divisions and cell deaths over time during drug screening.
In a GEN webinar, Dr. Shelby Wyatt, VP of Global Pharma Strategy at Flywheel.io (a cloud-based company with a medical imaging AI platform), will discuss how AI-based medical imaging can help imaging and data scientists accelerate drug development initiatives, and how Flywheel can build reliable, scalable medical imaging solutions that seamlessly integrate advanced AI technology from NVIDIA and other technology partners to enable robust imaging data management and analysis. During drug screening, Flywheel can offer imaging research labs the following solutions:
Metadata management with search,
Automated pre-processing & pipelines,
Machine learning workflow,
Customisation via APIs, Python, & MATLAB,
Provenance,
BIDS support, and
Secure collaboration.
On June 27, 2023, Flywheel announced it has raised $54M in Series D funding co-led by Novalis LifeSciences LLC and NVentures, NVIDIA’s venture capital arm. Microsoft also participated in the round, along with insiders Invenshure, 8VC, Beringea, Hewlett Packard Enterprise, Intuitive Ventures, iSelect, Gundersen Health System, Seraph, and Great North Ventures. Faegre Drinker Biddle & Reath LLP served as counsel to Flywheel in connection with the financing.
Molecular Devices, one of the leading providers of high-performance bioanalytical measurement solutions for life science research, pharmaceutical and biotherapeutic development, has just introduced the CellXpress.ai™, an automated cell culture system for screening. It is an ML-assisted solution that standardizes the entire cell culture journey to deliver consistent, unbiased, and biologically relevant results at scale. The CellXpress.ai™ is an AI-driven cell culture innovation hub that gives your team total control over demanding cell culture feeding and passaging schedules (eliminating time in the lab), while maintaining a 24/7 schedule for growing and scaling multiple stem cell lines, spheroids or organoids. All of it is backed by a full event log that confirms on-time feedings and critical task execution, with complete digital microscopy records.
Moreover, Molecular Devices is offering an AI-based software that provides Photoshop-like tools for image annotation, and the ImageXpress® Confocal HT.ai High-Content Imaging System, designed to help researchers advance phenotypic screening of 3D organoid models. The ImageXpress utilizes a seven-channel laser light source with eight imaging channels to enable highly multiplexed assays while maintaining high throughput by using shortened exposure times. Water immersion objectives improve image resolution and minimize aberrations so scientists can see deeper into thick samples. Moreover, the combination of MetaXpress® software and IN Carta® software simplifies workflows for advanced phenotypic classification and 3D image analysis with ML capabilities and an intuitive user interface.
➡️ Assay Development and assay interference during screening
A different problem during screening is assay interference caused by small molecules during in vitro experiments. Several approaches have been developed that allow scientists to flag potentially “badly behaving compounds”. These compounds are typically aggregators, reactive compounds and/or pan-assay interference compounds (PAINS), and many are frequent hitters, meaning they show up as active in many unrelated assays. One solution to this problem comes from Hit Dexter, a recently introduced ML approach that predicts how likely a small molecule is to trigger a false positive response in biochemical assays. In particular, the new Hit Dexter 2.0 web service covers both primary and secondary screening assays, providing user-friendly access to similarity-based methods for the prediction of aggregators and dark chemical matter (a set of drug-like compounds that have never shown bioactivity despite being extensively assayed), as well as a comprehensive collection of available rule sets for flagging frequent “bad” hitters and compounds that include undesired substructures.
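To make the idea of a frequent hitter concrete, here is a minimal Python sketch that flags compounds by the fraction of assays in which they were active. This is not Hit Dexter's model (which predicts from chemical structure); it only illustrates the underlying notion. The function names, the toy data, the minimum of 5 assays and the 0.3 threshold are all assumptions chosen for illustration.

```python
def active_to_tested_ratio(results):
    """Fraction of assays in which a compound was active.

    results: dict mapping assay id -> bool (True means active).
    """
    if not results:
        raise ValueError("compound has not been tested in any assay")
    return sum(results.values()) / len(results)


def flag_frequent_hitters(screening_data, min_assays=5, atr_threshold=0.3):
    """Return {compound: ratio} for compounds active in a suspiciously high share of assays.

    min_assays and atr_threshold are illustrative values, not published cutoffs.
    """
    flagged = {}
    for compound, results in screening_data.items():
        if len(results) < min_assays:
            continue  # too few measurements to judge
        ratio = active_to_tested_ratio(results)
        if ratio >= atr_threshold:
            flagged[compound] = ratio
    return flagged


# Hypothetical screening history: compound -> {assay: active?}
screening_data = {
    "cmpd_A": {"a1": True, "a2": True, "a3": True, "a4": False, "a5": True},
    "cmpd_B": {"a1": False, "a2": True, "a3": False, "a4": False, "a5": False},
    "cmpd_C": {"a1": True, "a2": True},  # active, but tested too rarely
    "cmpd_D": {"a1": False, "a2": False, "a3": False, "a4": False, "a5": False, "a6": False},
}

flagged = flag_frequent_hitters(screening_data)
print(flagged)
```

Here only cmpd_A is flagged, while cmpd_D, which was never active despite repeated testing, resembles what the text calls dark chemical matter.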
➡️ Protein prediction tools (AI can assist in structure-based drug discovery by predicting the 3D protein structure, which determines the biological function of the target)
AlphaFold is considered the gold standard for AI protein structure prediction, with high-accuracy predictions available for something like 200 million proteins. Other 3D protein prediction tools are Meta’s ESM Metagenomic Atlas, ColabFold, RoseTTAFold, IntFOLD and OmegaFold, launched by Helixon.
In particular, RoseTTAFold is a “three-track” neural network that simultaneously considers patterns in the protein sequence, how a protein’s amino acids interact with one another and a protein’s possible 3D structure. In this neural network, sequence, distance and the 3D information flow back and forth, allowing the network to collectively reason about the relationship between a protein’s chemical parts and its folded structure.
RoseTTAFold can solve challenging x-ray crystallography and cryo–electron microscopy modeling problems and is among the most renowned tools for ab initio protein structure prediction. These predictions are the most challenging to conduct, as they involve predicting protein structures from first principles only, without using existing templates.
Both AlphaFold2 and RoseTTAFold rely on Multiple Sequence Alignments (MSAs) as inputs to their models. MSAs map the evolutionary relationship between corresponding residues of genetically related sequences, and are derived from large, public, genome-wide gene sequencing databases that have grown exponentially since the emergence of next-generation sequencing. Finally, the Rosetta molecular modeling software package is used right now by OutpaceBio to customize drug design, creating next-generation smart cell therapies.
On July 20, 2022, the Chinese biotech firm Helixon launched OmegaFold, which combines 1) a protein language model (PLM) that makes predictions from single sequences, eliminating the need for MSAs, and 2) the Geoformer module, a new geometry-inspired transformer neural network that further distills the structural and physical pairwise relationships between amino acids (study).
So far, OmegaFold is reported to outperform RoseTTAFold and to achieve prediction accuracy similar to AlphaFold. Other MSA-free models include HelixFold-Single and ESMFold. OmegaFold is touted to have a higher potential in predicting the structure of orphan proteins and antibodies, for which informative MSAs are often not available.
NVIDIA and Evozyne created a generative AI model for proteins, to generate predictions for proteins whose structure is unknown. Evozyne used NVIDIA’s implementation of ProtT5, a transformer model that’s part of NVIDIA’s BioNeMo, a software framework and service for creating AI models for healthcare, and created two proteins with significant potential in healthcare and clean energy. On September 27, 2023, Evozyne announced the closing of an $81M Series B investment round. Fidelity Management & Research Company and OrbiMed led the funding with participation from NVentures, NVIDIA’s venture capital arm. Previous investors Paragon Biosciences and Valor Equity Partners expanded their support in the round.
➡️ AI-Driven Virtual Screening (Virtual screening, VS, is the computational counterpart of the experimental HTS, in which compounds from chemical libraries are tested for their activities against a biomolecular target that might have therapeutic relevance toward a specific disease)
The Deep Docking (DD) platform enables up to 100-fold acceleration of structure-based virtual screening by docking only a subset of a chemical library, while a ligand-based model trained on those results predicts the docking scores of the remaining compounds. This method enriches virtual hits by hundreds- to thousands-fold (without significant loss of potential drug candidates) and hence enables the screening of billion molecule–sized chemical libraries without using extraordinary computational resources.
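The "dock a subset, predict the rest" idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Deep Docking implementation: molecules are hypothetical 2D descriptor points, `dock` is a synthetic stand-in for an expensive docking run, and a simple nearest-neighbour average stands in for the ligand-based model. All names and parameter values are made up for the example.

```python
import random


def dock(mol):
    """Synthetic stand-in for an expensive docking run (lower score = better)."""
    x, y = mol
    return (x - 0.3) ** 2 + (y - 0.7) ** 2


def knn_predict(mol, scored, k=5):
    """Predict a score as the mean score of the k nearest already-docked molecules."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - mol[0]) ** 2 + (y - mol[1]) ** 2

    nearest = sorted(scored, key=sq_dist)[:k]
    return sum(score for _, score in nearest) / len(nearest)


def iterative_screen(library, sample_size=100, keep_fraction=0.2, iterations=3, seed=0):
    """Dock a small sample each round, then keep only the best-predicted remainder."""
    rng = random.Random(seed)
    candidates = list(library)
    scored = []  # (molecule, score) pairs that were actually docked
    for _ in range(iterations):
        sample = rng.sample(candidates, min(sample_size, len(candidates)))
        sampled = set(sample)
        scored.extend((mol, dock(mol)) for mol in sample)
        remaining = [mol for mol in candidates if mol not in sampled]
        # Rank undocked molecules by predicted score and discard the unpromising ones.
        remaining.sort(key=lambda mol: knn_predict(mol, scored))
        candidates = remaining[: int(len(remaining) * keep_fraction)]
    return candidates, scored


# Hypothetical library of 5,000 molecules described by two descriptors each.
lib_rng = random.Random(42)
library = [(lib_rng.random(), lib_rng.random()) for _ in range(5000)]

survivors, scored = iterative_screen(library)
mean_library = sum(dock(m) for m in library) / len(library)
mean_survivors = sum(dock(m) for m in survivors) / len(survivors)
print(f"docked {len(scored)} of {len(library)} molecules, kept {len(survivors)}")
print(f"mean score: library {mean_library:.3f}, survivors {mean_survivors:.3f}")
```

Only a small share of the library is ever docked, yet the surviving candidates have much better scores on average than the library as a whole, which is the enrichment effect described above.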
Codexis is a leading enzyme engineering company with a proprietary platform, CodeEvolver, that provides in silico, high-throughput assay screening with AI and has the power to transform the performance of an enzyme, tailoring it for a specific application and process. By using powerful ML tools and sophisticated molecular, cellular and bioanalytical workflows, Codexis can design and screen libraries of thousands of enzyme variants in high throughput, then sequence every variant and correlate its sequence with its performance in a highly application-relevant screen. Among Codexis' partners you can find Merck, GSK, Novartis, Nestlé, Takeda and many more.
Adimab is the industry leader in translating target hypotheses into therapeutically relevant antibody drugs. Their discovery process starts with AI mining of their large synthetic human IgG repertoire, which ensures every antibody delivered is unique.
On February 9, 2023, Ablexis, a biopharmaceutical company focused on licensing its AlivaMab Mouse technology for antibody drug discovery, announced a license agreement with Adimab to implement Ablexis’s AlivaMab Mouse into Adimab’s proprietary yeast-based technology for antibody drug discovery. Financial terms of the license were not disclosed. And just a month ago, Ono Pharmaceutical (one of the largest pharmaceutical companies in Japan) and Adimab signed a drug discovery collaboration agreement for the development of antibody drugs in the oncology segment.
Innophore offers a cutting-edge drug and enzyme discovery platform that uses AI-guided point-cloud technology. It not only analyzes a protein’s 3D structure but also includes extended surface properties (HALOS) and volumetric cavities (catalophores) to predict a target's characteristics and reactivity in AI virtual screening simulations. Their AI-driven strategy to design novel therapeutic enzymes combines the Catalophore technology, a mix of prepared protein structural data (CATALObase), and search algorithms and patterns tailored to specific needs.
Among their products you can also find CavitOmiX, a plugin for Schrödinger’s PyMOL that allows you to analyze protein cavities from any input structure. You can dive deep into proteins, Catalophore cavities and binding sites using crystal structures and state-of-the-art AI models from OpenFold (powered by NVIDIA’s BioNeMo service), DeepMind's AlphaFold and ESMFold by Meta. And by just entering any protein sequence, users can get the structure predicted by OpenFold or ESMFold loaded into their PyMOL within seconds.
Gandeeva’s technology includes three proprietary platform modules working in concert: SPOTLIGHT™, a Target Selection Engine, an AI-based approach to identify a continuous stream of validated targets; HYPERFOCUS™, a Cryo-EM Engine, a state-of-the-art atomic resolution imaging to map druggable sites; and CRYO-CADD™, a Drug Discovery Engine, a rapid iterative cycle to generate structural insights at the speed of chemistry. On March 30, 2023, Gandeeva Therapeutics announced the initiation of a research collaboration with Moderna Inc. to explore applications of Gandeeva’s technology platform for a Moderna program.
DiaDEM is an EU project developing a platform for organic electronics that links the virtual screening of small molecule candidates with the chemical supply chain, covering the path from digital discovery to experimental verification. It offers a one-stop solution for searching, refining and supplying chemicals, with the power of cheminformatics and molecular modeling and without users having to worry about physical computing infrastructure, while offering the following benefits:
Completely novel chemical solutions,
Robust property prediction,
Candidates available from the chemical supply chain, and
One-stop integrated platform for fast service from discovery to lab.
DiaDEM has three participants:
The Materials Innovation Factory, located at the University of Liverpool, aiming to accelerate product development and gain competitive advantage through smarter, faster and more precise ways of working, using world class automated lab equipment.
Nanomatch, an SME based in Germany, developing predictive, adjustable virtual design tools for organic electronic applications. Using a multiscale simulation approach, they translate molecular properties to the device scale, and thereby bridge the gap between fundamental chemistry and device design.
Mcule, an online drug discovery platform, providing the highest quality database integrated with molecular modeling searching tools and cloud services to help biotech and pharma companies find new drug candidates quickly and efficiently. Mcule’s services include:
Compound sourcing: based on a high-quality compound database, advanced compound selection, automated price optimization and professional delivery service,
Hit identification tools: their Workflow Builder is a cost-effective, cloud-based solution for identifying new chemical starting points by structure- and ligand-based virtual screening and screening library design, and
Lead optimisation tools: intuitive, easy-to-use modeling applications specifically designed for bench scientists to evaluate and generate ideas in the lead optimisation process.
The consortium members just held their 4th progress meeting in Heidelberg, Germany, on October 12 and 13, 2023. The meeting mainly focused on the development of the alpha platform version and the refined and extended DiaDEM chemical database.
Thank you for reading,
📄 References: