AI/ML tools and startups for preclinical drug discovery
An overview of AI/ML startups transforming preclinical drug discovery
Data-Driven Preclinical Drug Development
During drug development, once a target of interest (protein or nucleic acid) has been identified and validated in the target identification phase (target prioritization, disease-target association and ligandability assessment), the next stage is hit identification: finding high-quality chemical hits that bind to and modulate the specific target. Hit identification provides the starting point for all downstream activities, which makes it the most critical step for finding compounds able to interact with the fully validated target.
Once hits have been identified, the lead identification and optimization phase follows. It is the most expensive and time-consuming phase in drug discovery, and its goal is to identify compounds with an optimal balance of drug-like properties while maintaining sufficient potency. A lead compound is a chemical compound with pharmacological or biological activity that is likely to be therapeutically useful, but it may have a suboptimal structure that requires modification to fit the target better.
This is an overview of AI in Target Identification and Validation Analysis with:
- Statistical analysis-driven approaches: omics data, including genome-wide association studies (GWAS) and summary data-based Mendelian randomisation (SMR).
A curation of ML studies applied to post-GWAS prioritization of variants and genes can be found here.
- Network-based approaches: gene co-expression and miRNA-disease networks.
- ML techniques: including classifiers (random forest, support vector machine, Neural Net) and regression models.
- A key figure of the workflow of AI-driven target discovery can be found here.
- The Key Databases for Target Identification are:
  - LinkedOmics: a comprehensive database of cancer clinical and molecular data that gathers TCGA cancer-related multi-omics, clinical and mass spectrometry proteomics data.
  - DepMap portal: a website portal offering analytical and visualization tools for cancer that includes cancer cell line sensitivity and genetic data, and
  - Therapeutic target database: a database of linked medications and recognised therapeutic proteins, nucleic acids and diseases.
- A novel target prioritization approach is GuiltyTargets, which relies on attributed network representation learning of a genome-wide protein-protein interaction network annotated with disease-specific differential gene expression and uses positive-unlabeled (PU) machine learning for candidate ranking. An application of GuiltyTargets to Alzheimer's disease resulted in a number of highly ranked candidates that are currently discussed as targets in the literature. Interestingly, one of them is also the target of an approved drug (Tolcapone) for Parkinson's disease, highlighting the potential for target repositioning with this method.
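To make the positive-unlabeled idea concrete, here is a minimal sketch of PU-style candidate ranking. It is not the GuiltyTargets implementation: the 2-D "embeddings", the placeholder gene names and the naive baseline (treat every unlabeled gene as a negative, fit a logistic regression, rank the unlabeled genes by score) are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, epochs=2000, lr=0.5):
    """Batch gradient descent for logistic regression (pure Python)."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Hypothetical 2-D node embeddings; in practice these would come from
# representation learning on an annotated protein-protein interaction network.
known_targets = [(1.0, 0.9), (0.8, 1.1), (1.2, 1.0), (0.9, 1.2)]
unlabeled = {
    "GENE_A": (1.1, 0.9),    # lies close to the known targets
    "GENE_B": (-1.0, -0.9),
    "GENE_C": (-0.8, -1.2),
    "GENE_D": (0.9, 1.0),    # lies close to the known targets
    "GENE_E": (-1.1, -1.0),
}

# Naive PU baseline: known targets are labeled 1, all unlabeled genes 0.
X = known_targets + list(unlabeled.values())
y = [1] * len(known_targets) + [0] * len(unlabeled)
w, b = fit_logistic(X, y)

# Rank the unlabeled genes by predicted score, highest first.
ranked = sorted(
    ((name, sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b))
     for name, x in unlabeled.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, s in ranked:
    print(f"{name}: {s:.3f}")
```

The scores are not calibrated probabilities, because some "negatives" are really unknown positives; only the ordering is used, which is why PU methods are typically framed as ranking problems.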
This is an overview of AI in Hit Identification and Validation Analysis with:
- Reverse pharmacology or drug design is based on the hypothesis that a "designed" molecule can induce a specific modulation of a biological target: AI/ML tools and startups for drug design.
- In modern drug discovery, large chemical libraries can be screened in vitro with high-throughput screening (HTS) experiments and in silico with virtual screening methods in order to identify novel drug candidates: AI/ML tools, startups and companies transforming drug screening.
- Drug repositioning or drug repurposing is an approach to accelerate the drug discovery process through the identification of a novel clinical use for an existing drug approved for a different indication: AI drug repurposing ♻️ startups: should you date your ex?
- Knowledge Discovery in Databases refers to the nontrivial extraction of implicit, previously unknown and potentially useful information from data stored in databases: Biomedical data mining: AI/ML tools and startups.
- During drug discovery, once a potential drug has been identified, a very important step is to elucidate the molecular target and mechanism of action (MoA) of the new drug: Researching the mechanism of action of a drug with AI.
This is an overview of Lead Identification and Optimization Analysis:
In general, lead identification and optimization involves optimizing multiple objectives, such as
- chemical synthesis, safety, specificity, efficacy and pharmacokinetic properties,
- DMPK (drug metabolism and pharmacokinetics) modeling,
- ADME (absorption, distribution, metabolism and excretion) modeling and
- PBPK (physiologically based pharmacokinetic) modeling, which uses a series of differential equations parameterized with known physiological variables,
while maintaining drug potency, and can be assisted by AI:
  - AI/ML tools for planning and execution of chemical synthesis, and
  - AI-guided absorption, distribution, metabolism, excretion and toxicity (ADMET) prediction.
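As an illustration of what "differential equations parameterized with physiological variables" means for PBPK modeling, the sketch below simulates a deliberately tiny two-compartment, perfusion-limited model (blood and liver) after an IV bolus. The compartment structure, the parameter values and the function name `simulate_pbpk` are assumptions chosen for readability, not values for any real drug; production PBPK models use many more tissues and a proper ODE solver.

```python
def simulate_pbpk(dose_mg, hours, dt=0.001):
    """Forward-Euler simulation of a minimal blood + liver PBPK model.

    Returns the drug amounts (mg) in blood, in liver, and eliminated.
    """
    # Illustrative (assumed) physiological parameters, not fitted to data.
    q_liver = 90.0    # hepatic blood flow, L/h
    v_blood = 5.0     # blood volume, L
    v_liver = 1.8     # liver volume, L
    kp_liver = 2.0    # liver:blood partition coefficient
    cl_int = 30.0     # intrinsic hepatic clearance, L/h

    a_blood, a_liver, a_elim = dose_mg, 0.0, 0.0  # IV bolus into blood
    for _ in range(int(round(hours / dt))):
        c_blood = a_blood / v_blood
        c_liver_out = a_liver / v_liver / kp_liver   # conc. leaving the liver
        uptake = q_liver * (c_blood - c_liver_out)   # net blood -> liver, mg/h
        metabolism = cl_int * c_liver_out            # hepatic elimination, mg/h
        a_blood -= uptake * dt
        a_liver += (uptake - metabolism) * dt
        a_elim += metabolism * dt
    return a_blood, a_liver, a_elim

blood, liver, eliminated = simulate_pbpk(dose_mg=100.0, hours=24.0)
print(f"blood={blood:.4f} mg, liver={liver:.4f} mg, eliminated={eliminated:.4f} mg")
```

Because every flux leaving one compartment enters another, the three amounts always sum to the administered dose, which is a convenient sanity check on any model of this kind.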
Preclinical drug development starts once the target and hit identification and validation phases and the lead identification and validation phase of drug discovery are complete.
The preclinical phase of drug development, also referred to as preclinical studies or nonclinical studies, is a stage of research that begins after the drug discovery phase (target selection, target validation, compound screening and lead optimisation) and before the clinical trials phase, and during which important feasibility, iterative testing and drug safety data are collected.
The preclinical studies (in vitro, in vivo, ex vivo and in silico analysis) are mainly performed in compliance with GLP/GSP guidelines (good laboratory practice and good scientific practices) to ensure reliability and reproducibility of results. In general, the most common challenges during preclinical experiments have to do with:
- reducing time, money and uncertainty in planning preclinical experiments,
- automating sample analysis with robotic cloud laboratory,
- automating the selection, manipulation and analysis of cells, mice, antibodies, reagents etc,
- using secondary measures such as patient-derived xenograft mouse tumor models for clinical efficacy in oncology programs,
- using HepG2 cells as a surrogate for genotoxicity,
- using Caco-2 cells permeability assay as a surrogate for estimating human intestinal permeability,
- translational biomarker discovery, and
- estimation of a first-in-human (FIH) dose.
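The article does not say how a first-in-human dose is estimated. One widely used starting point is body-surface-area scaling as described in the FDA's 2005 guidance on maximum safe starting doses: the animal NOAEL is converted to a human equivalent dose (HED) with species Km factors and then divided by a safety factor (10 by default). The sketch below assumes that approach; the function names and the example NOAEL are hypothetical.

```python
# Km factors (body weight / body surface area) from the FDA 2005 guidance.
KM = {"mouse": 3, "rat": 6, "rabbit": 12, "monkey": 12, "dog": 20, "human": 37}

def human_equivalent_dose(noael_mg_per_kg, species):
    """Convert an animal NOAEL (mg/kg) to a human equivalent dose (mg/kg)."""
    return noael_mg_per_kg * KM[species] / KM["human"]

def max_recommended_starting_dose(noael_mg_per_kg, species, safety_factor=10.0):
    """Apply a safety factor to the HED to get a starting dose (mg/kg)."""
    return human_equivalent_dose(noael_mg_per_kg, species) / safety_factor

# Hypothetical example: a NOAEL of 50 mg/kg observed in rats.
hed = human_equivalent_dose(50.0, "rat")
mrsd = max_recommended_starting_dose(50.0, "rat")
print(f"HED = {hed:.2f} mg/kg, starting dose = {mrsd:.2f} mg/kg")
```

In practice the most sensitive relevant species is used, and the safety factor is adjusted to the compound's risk profile; pharmacologically guided approaches (such as MABEL) are alternatives for higher-risk modalities.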
This is an overview of AI solutions for Preclinical Development:
- The Toronto-based biotech company BenchSci, founded in 2015, uses AI to fight the data reproducibility crisis caused by unreliable antibodies (Reproducibility crisis: Blame it on the antibodies), an important issue during the preclinical phase of drug development.
Backed by Google's AI fund Gradient Ventures, ASCEND, BenchSci's end-to-end SaaS platform, enables the discovery of biological connections, reduces unnecessary experiments and uncovers risks at the preclinical phase. ASCEND is the industry standard for antibody selection: over 50,000 scientists in 16 of the top 20 pharmaceutical companies and more than 4,500 academic institutions use it to navigate antibody data and plan more successful experiments, with proven savings of millions per year in hard costs alone, since antibodies account for 40-50% of reagent failures. ASCEND also helps with exploring biological connections between proteins, pathways, diseases and other entities, with prioritizing targets and with validating hypotheses.
On May 25, 2023, BenchSci announced a US $70M Series D funding round (for a total of $170M) led by Generation Investment Management, with participation from existing investors iNovia Capital, TCV, Golden Ventures and F-Prime Capital.
- Berkeley Lights, now part of Bruker Cellular Analysis, provides AI solutions for antibody discovery, cell line development, cell therapy development and synthetic biology for life science research.
Back in March 2023, Berkeley Lights and IsoPlexis merged to form PhenomeX, which brought together the Beacon Optofluidic System (for moving cells without physical contact) by Berkeley Lights and the Protein Barcoding Suite (for evaluating the functional phenotypes of individual cells) by IsoPlexis. Then, on August 17, 2023, Bruker entered into a merger agreement to acquire the functional cell biology company PhenomeX for a total equity value of $108M. The acquisition will serve as Bruker's entry point into single-cell research and boost its spatial biology programs (Bruker is also acquiring Canopy Bioscience and investing in Acuity Spatial Genomics).
Bruker Cellular Analysis, in order to understand the phenome, is now offering:
- Bruker Cellular Analysis' Optofluidic technology to accelerate your product development by analyzing the phenotype and genotype of single cells or clones and generating insights by rapidly screening 1000s of cells at once.
- Proteomic Barcoding Product Suite to capture highly multiplexed cytokine, chemokine, and phosphoprotein signatures from single cells to predict disease progression, treatment resistance and therapeutic efficacy.
- Optofluidic Workflows: Beacon for antibody discovery, cell line development, T-cell profiling and synthetic biology.
- Proteomic Barcoding Solutions: IsoSpark for IsoCode immune cell behavior, IsoCode intracellular Signaling Insights, CodePlex for high impact protein screening and Meteor for quantitative bulk proteomics.
Bruker (founded on September 7, 1960, in Karlsruhe, Germany), which is advancing life sciences with Green-Tech and AI, had $1.367 billion in revenue in Q1-Q2 2023 and $2.531 billion in 2022.
- Evaxion Biotech is a preclinical-clinical AI immunology platform company aiming to develop novel immunotherapies to treat cancer and infectious diseases. PIONEER™ is their proprietary AI-Immunology model for the rapid discovery of patient-specific neoantigens used for personalized cancer immunotherapies. Built on proprietary computational AI models developed at Evaxion, PIONEER™ selects the neoantigens that are most likely to generate a strong T-cell response and anti-tumor effect in each patient. This platform holds the potential for generating one new target every 24 hours. Evaxion Biotech (Denmark, 2008) has raised a total of $77M; more about their 2023 business progress can be found here.
- Selvita is one of the largest preclinical contract research organizations in Europe, providing multidisciplinary support in resolving the unique challenges of research within drug discovery, regulatory studies, as well as research and development. The Selvita Group owns Ardigen (Poland, 2015), a bioinformatics company harnessing advanced AI methods for novel precision medicine by offering specialized services for immunology, microbiome, biomedical imaging and digital CRO. Ardigen's Biomedical Imaging Platform integrates the most recent advancements in AI (computer vision and cheminformatics) with molecular biology and medicinal chemistry, bridging the gap between cell imaging and small molecule design by combining molecular structures, high content screening images and omics data.
- Cellarity, based in Cambridge, Massachusetts, was created by Flagship Pioneering and launched in December 2019 to develop medicines by studying and altering the cellular signatures of disease. Rather than looking at individual molecular targets, it focuses on the whole cell using single-cell technologies. Cellarity's platform links biology and chemistry with high-dimensional transcriptomic data to design medicines against cellular signatures of disease instead of a single target, an approach designed to drive higher clinical translatability and success. Cellarity has a partnership with the Chan Zuckerberg Initiative to drive innovation in ML algorithms for single-cell analysis via support of the Open Problems in Single-Cell Analysis initiative.
On January 4, 2024, Novo Nordisk and Cellarity announced the expansion of their research collaboration, which aims to unravel novel biological drivers of nonalcoholic steatohepatitis (NASH) and will leverage Cellarity's platform to develop a small molecule therapy against this disease. Under the terms of the agreement, Novo Nordisk will reimburse R&D costs. Additionally, the agreement may pay up to $532M in upfront, development and commercial milestone payments, as well as tiered royalties on annual net sales of a licensed product, to be shared between the respective companies and Flagship's Pioneering Medicines. Cellarity has raised a total of $294M.
- Genome Biologics (Belgium, 2016), a resident company of Johnson & Johnson Innovation - JLABS, develops solutions for preclinical drug discovery.
One of the startup's products is GENIMAPS, a cloud-based, in silico accelerated drug repositioning and development platform that uses pattern recognition and ML to match compound databases (transcriptomics, GWAS, CVS and protein interaction) and drug discovery pipelines with profiles of disease-relevant genes. GENIMAPS uses customized pattern recognition and real-time ML algorithms to perform matches with ever increasing accuracy and precision, and is directly coupled to the GENISYST platform for accelerated preclinical drug testing.
GENISYST is a patented technology for in vitro and in vivo multiplexed disease modeling, applied to modulate a combination of coding genes, microRNAs and non-coding RNAs in specific tissues or cell lineages of the adult, newborn or embryo. GENISYST is also applicable for in vitro 2D and 3D (human tissue organoid) cell culture based drug screens. GENISYST is applicable in small (mice, rats) and large animals (pig, sheep, NHP) preclinical studies. When coupled to GENIMAPS, GENISYST can potentially bring a repurposed/orphan drug into Phase II or III Clinical studies in 6-18 months.
Genome Biologics has raised a total of €17.5M and in 2021 secured up to €9.8M in equity funding from the EIC Accelerator to accelerate development of its clinical programs for the prevention of cardiotoxicity and heart damage, an area of huge unmet need within the EU and globally.
- Quris (Boston and Tel Aviv, 2019) is led by a stellar team of pioneers in machine learning, big data, genomics, technology and medical device development. It is bringing its first drug tested preclinically using AI, rather than animal models, into human trials, using simulated organs trained by AI, for the autism-associated rare disease fragile X syndrome. The company uses a unique ML approach that generates data for classification algorithms by testing drugs on miniaturized Patients-on-a-Chip; the ML model is then trained on this automatically tagged data.
The company already has a collaboration with Merck (on a preclinical study evaluating Quris-AI's BioAI Drug Safety Platform). On January 4, 2024, Quris announced the addition of Michel Vounatsos, former CEO of Biogen, to its Board of Directors, and the appointment of Yossi Ben Amram, former Merck & Co., Inc. (MSD) Regional President, as President of Quris-AI. Quris has raised a total of $37M over 3 rounds from 7 investors.
- Emerald Cloud Lab (ECL) is a warehouse with million-dollar robots and the only remotely operated research facility that can handle all aspects of daily lab work, from designing experiments to data acquisition and analysis. Users can log on to the website from anywhere in the world and enter the experimental protocol that they'd like to perform, and the robots take care of the rest. In this way, entire experiments that would normally take weeks or months are performed in hours with higher precision, since ECL (California, 2010) operates 24 hours a day, 365 days a year and is increasing productivity by 300%.
In particular, ECL offers access to 200 different kinds of high-throughput scientific instruments (for a monthly fee that costs less than a single piece of lab gear), combined with the facility's nonstop operations. Its automated cloud system also gives the ECL staff the same benefits it provides customers.
The two ECL co-founders, Brian Frezza and DJ Kleinbaum, are now sharing their lessons by starting a new ECL with academics, in a collaboration with Carnegie Mellon University (CMU) to build the world's first cloud lab in an academic setting. On August 22, 2023, the world's largest cloud lab provided open access to its proprietary Symbolic Lab Language (SLL), considered the most developed and widely used language for remotely controlling experiments in a cloud lab. ECL had a post-money valuation in the range of $100M to $500M as of June 1, 2015.
- Synthace (UK) offers Antha, a cloud-based software platform for automating and improving the success rate of biological processes by sharing biological information in a repeatable way and linking lab equipment, protocols and processes, thereby allowing fast development at scale and enhancing productivity. The Synthace digital experiment platform lets you design powerful experiments, run them in your lab and then automatically build structured data, with no code necessary.
On May 24, 2023, Synthace announced the successful integration of their platform with OpenAI's ChatGPT to let scientists use a natural language interface to describe their intent, use AI to design complex experiments and then run those experiments on lab equipment. On September 20, 2023, Synthace released a comprehensive analyst report titled Lab automation & experimentation in life science R&D 2023-2024, shedding light on the industry challenges faced by researchers and highlighting opportunities for improvement.
Synthace has raised a total of $81M in funding over 13 rounds.
- Arctoris (US/UK, 2016) is a service for academic research groups and biotech companies that allows scientists to directly configure a wide range of research experiments online and receive their results and detailed methods reports immediately. Arctoris is not only capable of screening but is actually designed for fundamental research, encompassing a wide range of techniques: AI model training and validation, cell and molecular biology assays, biotech and target-based assays. In other words, it is a fully automated laboratory that generates robust, reliable, reproducible data at scale, and a globally operating platform company with an internal drug discovery pipeline.
In 2021, IBM Research and Arctoris announced they are investigating the application of AI and automation to accelerate closed-loop molecule discovery by combining IBM's RXN for Chemistry (an AI online platform leveraging state-of-the-art NLP architectures to automate synthetic chemistry) with Arctoris' Ulysses platform, which leverages robotic experiment execution and digital data capture technologies across cell and molecular biology and biochemistry/biophysics.
On March 23, 2022, Cyclica, a biotech with the vision to advance the most robust and sustainable drug discovery pipeline, and Arctoris agreed to expand their partnership to progress drug discovery programs for novel neurodegenerative targets with a focus on Alzheimer's disease. On October 4, 2022, Arctoris appointed three globally recognized experts in Alzheimer's disease, ML applied to closed-loop discovery and automated chemistry as members of its Scientific Advisory Board: Professor John Davis (University of Oxford), Professor Rafael Gómez Bombarelli (MIT), and Dr Teodoro Laino (IBM Research). Arctoris has raised a total of $10M in funding over 3 rounds.
- Strateos (California, 2012) started as Transcriptic, a SaaS-based biotechnology company providing robotic solutions for biology labs. In June 2019, Transcriptic announced a merger with 3Scan, a digital 3D tissue model specialist, to form Strateos, which combines the engineering capabilities of the two previous companies to help automate chemical, biological and 3D image analysis in a closed-loop robotic laboratory. Strateos smart labs are located in Menlo Park and San Diego, California, which together represent more than 14,000 square feet of laboratory space and 200-plus state-of-the-art research instruments tailored to the application needs of small-molecule and biologics drug discovery, cell and gene therapies and synthetic biology.
In 2022, Strateos, which is considered a pioneer in the development of remote access laboratories and lab automation software for life science research, announced the availability of an integrated solution for small molecule discovery programs seeking a faster, automated way to perform Design, Make, Test and Analyze (DMTA) cycles: "Closing the Loop from Idea to Data and Accelerating Cycle Times for Faster Drug Discovery". The DMTA cycle is a common workflow to optimize a company's identified hits towards clinical candidates.
In 2023, the company appointed Alexander K. Arrow, MD, CFA, as Chief Financial Officer to play a key role in leading the company's business and finance strategy focused on growth and new investments. The company has also named Juliet M. Moritz, MPH, Vice President, Project Management Office, reflecting the significant demand by large pharmaceutical companies and CROs for deployment of the Strateos Cloud Lab Automation-as-a-Service Platform. Strateos has raised a total of $105.8M in funding over 12 rounds.
- Celeris Therapeutics, which uses AI-driven technologies to design molecules that degrade proteins causing diseases like Parkinson's and various cancers, announced in 2022 a research collaboration agreement with Merck KGaA in the field of early drug discovery, using the CelerisTx graph-based AI platform for discovering and designing novel small molecule binders and bifunctional degraders. Their discovery engine combines cutting-edge DL methods such as geometric DL and active learning with a unique database of processed degrader information, and the company currently has a robotic wet lab facility for closed-loop drug discovery under construction. On September 15, 2023, Celeris Therapeutics appointed James Field (the founder of LabGenius) as a Board Director with expertise in TechBio scaling, robotics and AI. Celeris Therapeutics, founded in Graz, Austria in 2021 and headquartered in Silicon Valley, California, has raised a total of €16.5M.
- Aiforia (Helsinki, Finland, 2013) analyzes images uploaded to its cloud, allowing researchers to detect any visible feature or pattern at scale, including in tissues and cells, in order to understand pathophysiology. Aiforia's platform brings together AI and high-performance cloud computing and assists image-based diagnostics by providing efficient and scalable solutions, enabling new discoveries and clinical support with highly accurate and consistent data. For example, with Aiforia Create, Lundbeck pharmaceuticals performed neuroscientific histopathological analysis of alpha-synuclein (a neuronal protein linked to Parkinson's disease) during preclinical studies at high speed, demonstrating that the AI-based quantification of α-synuclein was as accurate as the manual method.
On June 21, 2021, Epredia, a global precision cancer diagnostics company, announced that it had entered into a distribution agreement for Aiforia's AI-based pathology software, in order to distribute Aiforia's portfolio of preclinical and clinical pathology tools globally. In 2022, Aiforia added its fourth CE-IVD marked clinical AI model, for breast cancer, to its rapidly expanding portfolio of novel tools for cancer diagnostics (Prostate Cancer Gleason, Lung Cancer PD-L1, Breast Cancer PR, ER, Ki-67). In January 2024, Aiforia entered into an exclusive licensing agreement with Mayo Clinic to globally commercialize a specialized AI model for predicting the recurrence of colorectal cancer. Finally, Aiforia has raised a total of €21.2M.
For more: AI/ML tools, startups and investors for preclinical drug discovery (2nd part).
Until next time,