AI Startups for Data Aggregation and Analysis (part 1)
AI meets Drug Development
Data Aggregation and Analysis during Drug Development
Even though the public reputation of pharmaceutical companies appears to be eroding, and a significant decline in the public’s perception of the transparency, openness and authenticity of drug makers “is in the air, everywhere I look around”, pharmaceutical companies do play a positive role in society. Corruption and lack of transparency aside, the biggest problem of pharma is its conservative approach to a process, drug development, that simply doesn’t work anymore because of a lack of innovation amid digital disruption, rapid technological advances and other issues such as poor data reproducibility. Accordingly, “an army” of AI pharma startups is being set up to deal with pharma’s problems.
Let’s now look at some of these data-driven startups that employ ML and AI for Data Aggregation and Analysis during Drug Development (petabytes of data).
AccutarBio (New York US, 2015) is a Shanghai- and Brooklyn-based AI pharma company that has raised millions of dollars in funding to develop anti-cancer drugs using 3D projections of chemical structures.
The company employs AI for drug discovery, where it claims to outperform traditional methods for drug pocket prediction, drug-target complex conformation prediction, drug-target binding affinity prediction and drug property (ADME) prediction. So far it offers:
a data-driven atom-based scoring function trained with 100,000 protein crystal structures containing information of >100 million amino acid side chains,
a dynamic deep neural network specifically designed for chemical informatics and
drug pocket side chain conformation prediction and drug docking.
AccutarBio researchers have published a number of papers on the arXiv pre-print server. The company recently received $15 million in funding (including money from Chinese AI/facial recognition company YITU) and now partners with Amgen.
Ardigen (Kraków Poland, 2015) is a Polish bioinformatics company, part of the Selvita Group, that is trying to accelerate drug development by decoding the microbiome, designing immunity and providing digital drug discovery services.
Ardigen’s neoantigen prediction platform called “ArdImmune Vax” employs AI to identify an optimal set of neoantigens as targets for cancer vaccines or adoptive cell therapies. This technology is also very well suited for the design of vaccines for infectious diseases.
In April 2018 Ardigen demonstrated its AI expertise by taking 2nd place in the prestigious NCI-CPTAC DREAM-Proteogenomics Challenge (a community-based collaborative competition to answer key questions in cancer proteomics). In April 2020 Selvita S.A., one of the largest preclinical contract research organisations in Europe, and Ardigen signed a grant agreement to create the HiScAI (High Content Screening Artificial Intelligence) Technology Platform, dedicated to the study of phenotypic changes in cells treated with a drug candidate, using ML and AI to analyse the data from high content screening.
Biorelate - Curating Truths with AI (Manchester UK, 2014) helps scientists solve the most difficult biomedical challenges of today by curating truths from existing knowledge, enabling smarter and faster research and development.
Biorelate’s cognitive computing platform Galactic (Galactic-AI™) can speed up the research process by collecting and curating more than 30 million biomedical research text sources. With up to 80% of biomedical data thought to be unstructured, the platform helps researchers to generate a clearer view of the current state of research and gain invaluable insights. Dr Daniel Jamieson, CEO and founder of Biorelate, announced in May 2020 that Biorelate had made its cloud-based web tool Galactic available to researchers to help drug discovery efforts while lab access was restricted due to the global pandemic.
BioSymetrics (New York US) is a biomedical AI company that helps researchers develop drugs with greater speed and precision. It offers (Augusta Architect):
a prediction platform for molecular mechanisms,
in vivo barcoding,
data set de-noising and
identification of lead compounds from gene/disease prediction.
Moreover, they offer a clinical insights engine that can be used to improve value-based care initiatives. Combined with the four solutions above, it can enrich pharmaceutical research and discovery efforts with clinical data.
In August 2020 BioSymetrics announced it had joined Accenture’s partner ecosystem, an integral part of Accenture’s INTIENT cloud-based R&D platform, which has been designed to help life sciences organisations improve end-to-end productivity, efficiency and innovation from drug discovery through clinical and patient services.
Biotx.ai (Berlin Germany, 2017) offers an AI tool that helps to reliably find complex patterns in high-dimensional biomedical data. Biomedical data is difficult to analyse because of small patient cohorts, small sample sizes and many other factors; the platform is built to find complex interactions within such data and retrieve highly accurate predictive biomarkers.
Traditionally, big data analysis has involved millions of subjects and few features, a shape that is easy to train AI on. Clinical trials and genetic testing produce the opposite shape: few subjects (few patients) and millions of features (an incredibly large number of data points, like the entire genome), which is nearly impossible to train AI on. This is what biotx.ai calls “wide data”. Wide data sets are incredibly hard to analyse, so most pharma companies avoid this approach entirely, opting instead to sequence entire populations.
So, biotx.ai has designed its AI algorithms to make wide data manageable by separating meaningful findings from the noise, leading to the discovery of previously undetectable complex genetic patterns and allowing for a more accurate prediction of disease status and drug response (predictive biomarkers) and for patient stratification in clinical trials.
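To see why wide data is so hard to work with, a small simulation helps. The sketch below is not biotx.ai’s method; it is a toy example with made-up sizes (20 subjects against 2,000 features, and the reverse) and a hypothetical binary outcome. It shows that when features vastly outnumber subjects, features made of pure noise will look strongly associated with the outcome just by chance.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_noise_correlation(n_subjects, n_features, seed=0):
    """Largest |correlation| between a binary outcome and pure-noise features."""
    rng = random.Random(seed)
    # Hypothetical outcome: half the cohort are responders (1), half are not (0).
    outcome = [1.0] * (n_subjects // 2) + [0.0] * (n_subjects - n_subjects // 2)
    best = 0.0
    for _ in range(n_features):
        # Each feature is random noise, unrelated to the outcome by construction.
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


# "Wide" data: few subjects, many features (sizes are illustrative assumptions).
wide = strongest_noise_correlation(n_subjects=20, n_features=2000)
# "Tall" data: many subjects, few features.
tall = strongest_noise_correlation(n_subjects=2000, n_features=20)
print(f"wide data, strongest noise correlation: {wide:.2f}")
print(f"tall data, strongest noise correlation: {tall:.2f}")
```

In the wide case a column of random numbers typically correlates far more strongly with the outcome than anything in the tall case, which is why wide data calls for methods designed to separate real signal from chance patterns.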
Causaly Inc (London UK, 2017) offers a semantic AI platform that reads collections of scientific articles and extracts causal associations through linguistic and statistical models. The challenge it tackles is making literature reviews more productive, which its technology does by filtering out false positives.
Every month something like 100,000 biomedical articles are added to the more than 30 million already published, which makes it time-consuming, inefficient and almost impossible to decipher key relationships and find emerging discoveries in the vast data ocean of biomedical research. For this reason Causaly’s AI reads and interprets biomedical literature similarly to how humans do; the company says it has read everything ever written in biomedicine and can visualise relevant relationships within seconds. Causaly is working with several pharma and biotech companies, including Novartis, as well as with hospitals and academia.
Datavant, Inc (San Francisco US, 2017) employs AI for the clinical trial process and organises and structures healthcare data to produce actionable insights for the design and interpretation of clinical trials (it aggregates and analyses biomedical data through ML to lower the time, cost and risk of drug development). Datavant specialises in breaking down silos and analysing health data securely and privately.
In July 2020, Medable Inc, the leading software provider for decentralised clinical trials, and Datavant announced a partnership that will help clinical trial teams easily integrate multiple data sources to accelerate decentralised trial design, recruitment and data management.
Deep Intelligent Pharma (DIP) (Beijing China, 2017) is a global start-up dedicated to empowering and accelerating drug discovery, development and registration through the most advanced AI technologies. With its end-to-end AI-driven platforms, the company enables clients to efficiently move compounds from the lab to the post-marketing stage with great quality.
They offer the following AI solutions:
knowledge graph, a biomedical research tool that brings knowledge to scientists in real time by connecting millions of knowledge nodes,
drug discovery platform,
organic synthesis system,
clinical data platform,
regulatory management platform,
top-quality medical translation services and many more.
They operate across China, US and Japan. DIP has raised $26.1 million from Sequoia Capital China and ZhenFund. Among its partners you can find Roche, GSK and Bayer.
Data4Cure’s (California US, 2013) Biomedical Intelligence Cloud platform and services help pharma companies make more informed decisions using bioinformatics, ML and AI applications built on top of the largest repository of semantically linked biomedical data and literature. By inferring and organising knowledge from thousands of genomic, phenotypic and clinical datasets, they allow researchers to identify new targets and biomarkers, repurpose drugs and identify disease pathways.
Central to the platform is a dynamic biomedical knowledge graph called CURIE™ spanning over 1 billion biomedical facts and relations continuously inferred from thousands of datasets (both public and customer-specific) and millions of publications.
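For readers new to the idea, a knowledge graph stores facts as (subject, relation, object) triples and answers questions by following chains of relations. The sketch below is only an illustration of that idea, not CURIE’s actual schema or API: the relation names, the helper functions and the three-fact graph are assumptions made for the example.

```python
from collections import defaultdict

# A tiny set of (subject, relation, object) triples. The facts are well-known
# textbook examples; the schema and relation names are assumptions.
TRIPLES = [
    ("imatinib", "inhibits", "BCR-ABL"),
    ("BCR-ABL", "associated_with", "chronic myeloid leukemia"),
    ("aspirin", "inhibits", "COX-1"),
]


def build_index(triples):
    """Index triples as index[subject][relation] -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def candidate_diseases(index, drug):
    """Follow drug -inhibits-> target -associated_with-> disease."""
    diseases = set()
    for target in index[drug]["inhibits"]:
        diseases |= index[target]["associated_with"]
    return diseases


graph = build_index(TRIPLES)
print(candidate_diseases(graph, "imatinib"))
```

The two-hop query links a drug to a disease through a shared target even though no single triple states that link directly, which is the kind of inference that makes graphs useful for finding targets and repurposing drugs.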
Data2Discovery (Utah US, 2012) uses AI to find hidden connections and new insights in diverse, linked datasets, helping researchers to understand and treat diseases by connecting data in new ways.
Genialis (Texas US, 2011) uses AI to analyse multi-omics next-generation sequencing data, allowing researchers to reveal previously unseen patterns across large, heterogeneous datasets to predict targets and biomarkers. Over the course of the year, Genialis initiated collaborations with numerous cutting-edge biopharma companies, including Checkmate Pharmaceuticals and Oncologie, as well as renowned biotechnology leader Thermo Fisher Scientific.
With partners like Checkmate and Oncologie, as well as others, the goal has been to leverage RNA sequencing and clinical trial outcomes data to model gene signatures that stratify patients based on predicted drug response. Moreover, Genialis and Thermo Fisher Scientific struck up a collaboration to provide the scientific community with a comprehensive set of tools for RNA sequencing.
In 2019 the Alliance for AI in Healthcare (AAIH) was launched — the first industry advocacy organization dedicated to promoting responsible adoption and use of AI to improve healthcare outcomes — and Genialis was a founding member of the AAIH.
Helix (Georgia US, 2017) uses AI to respond to verbal questions and requests in a lab setting. In this way it allows researchers to increase efficiency, improve lab safety, keep current on relevant new research and manage inventory. HelixAI is participating in the Amazon Alexa accelerator.
Evid Science (California US, 2017), trusted by the largest institutions in the world, puts 70M+ evidence-based data points at your fingertips, all backed by the literature. Evid Science’s patented AI, which they claim can read up to 25 million articles in an hour, has already processed the publicly available medical literature across all endpoints, interventions and therapy areas and updates nightly, reducing weeks (or even months) of work to a few clicks and enabling customers to make faster, smarter, evidence-based decisions. Evid Science’s researchers have published a number of papers.
Iris.ai (Oslo Norway, 2015) uses AI to map out existing knowledge from published research, patents and internal R&D content. Moving beyond limiting keywords, endless result lists and citation bias, Iris.ai is an AI assistant for cross-disciplinary early-stage research projects that builds document “fingerprints” and measures their similarity based on a combination of keyword extraction, word embeddings, neural topic modeling and other natural language understanding techniques.
Intelligencia’s (New York US, 2017) iNsight, a proprietary data cube, integrates structured and unstructured data from a host of data sources. With clinical development having a success rate of ~10%, the company uses ML models to assess the probability of technical and regulatory success of an asset (drug) at any stage of clinical development, across Phases 1–3. Further, they interpret the reasons behind their estimates and provide insights into the positive or negative drivers of that probability.
Intellegens (Cambridge UK, 2017) is a spin-out from the University of Cambridge. Its first commercial product, Alchemite, uses AI to learn underlying correlations in fragmented datasets with incomplete information, allowing researchers to estimate missing knowledge of how candidate drugs act on proteins.
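The idea of comparing document “fingerprints” can be sketched with the simplest of the ingredients listed above, keyword extraction. This is a deliberately reduced stand-in, not Iris.ai’s implementation: it uses plain keyword counts and cosine similarity instead of word embeddings or topic models, and the stop-word list and sample texts are made up for illustration.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "with", "on"}


def fingerprint(text):
    """Keyword-count fingerprint: lower-cased words minus stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine_similarity(fp_a, fp_b):
    """Cosine similarity between two keyword-count fingerprints."""
    dot = sum(count * fp_b[word] for word, count in fp_a.items())
    norm_a = math.sqrt(sum(c * c for c in fp_a.values()))
    norm_b = math.sqrt(sum(c * c for c in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up document snippets used only for illustration.
query = fingerprint("Deep learning for protein binding affinity prediction")
related = fingerprint("Predicting protein ligand binding affinity with neural networks")
unrelated = fingerprint("Inventory management for hospital supply chains")

print(cosine_similarity(query, related))
print(cosine_similarity(query, unrelated))
```

Note that this keyword-only version treats “prediction” and “predicting” as different words. Closing that kind of gap is what embeddings and topic models are for, since they can score texts as similar even when they share few exact keywords.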
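A quick bit of arithmetic shows where a figure like ~10% can come from. The sketch below is not Intelligencia’s model: it assumes each stage succeeds independently and multiplies per-stage probabilities, and the stage values are hypothetical numbers picked only so that the overall result lands near the ~10% mentioned above.

```python
from math import prod

# Hypothetical per-stage success probabilities, chosen only so that the
# overall figure lands near ~10%. They are not real data.
STAGE_SUCCESS = {
    "Phase 1": 0.60,
    "Phase 2": 0.30,
    "Phase 3": 0.60,
    "Regulatory approval": 0.90,
}


def probability_of_success(stage_success, from_stage):
    """Chance of clearing every stage from `from_stage` onward,
    assuming the stages succeed or fail independently."""
    stages = list(stage_success)
    remaining = stages[stages.index(from_stage):]
    return prod(stage_success[s] for s in remaining)


for stage in STAGE_SUCCESS:
    print(f"{stage}: {probability_of_success(STAGE_SUCCESS, stage):.3f}")
```

The value of an ML approach lies in replacing such fixed averages with estimates specific to each drug, and in explaining which factors push the estimate up or down.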
The Alchemite™ platform is based on cutting-edge deep learning algorithms that can see correlations between all available parameters, both inputs and outputs, in fragmented, unstructured, corrupt or noisy datasets that are as little as 0.05% complete. The generated models can predict missing values, find errors and optimise target properties with greater levels of accuracy than traditional approaches where complete data is needed. Intellegens’ researchers have a number of papers and Intellegens is one of 10 startups selected for the ATI Boeing Accelerator to boost UK innovation.
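To make “predicting missing values from correlations” concrete, here is a minimal sketch. It is not the Alchemite algorithm, which is described above as deep learning; it swaps in a plain least-squares line between two columns, and the function names and measurements are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(rows):
    """Fill missing second values (None) using a line fitted on complete rows."""
    known = [(x, y) for x, y in rows if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [(x, y if y is not None else slope * x + intercept) for x, y in rows]


# Made-up measurements: (property_a, property_b), with two gaps in property_b.
rows = [(1.0, 3.0), (2.0, 5.0), (3.0, None), (4.0, 9.0), (5.0, None)]
completed = impute(rows)
print(completed)
```

The complete rows reveal the relationship between the two properties, and that relationship fills the gaps. A real sparse dataset has many columns, noisy values and non-linear relationships, which is where more capable models come in.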
Innoplexus (Frankfurt Germany, 2011) is a consulting-led technology and product development company focusing on big data and analytics, using AI to generate insights from billions of disparate data points from thousands of data sources. In this way they allow researchers to improve decision-making by seeing information in context from biomedical data sources including publications, clinical trials, congresses and theses.
Innoplexus — with over 250 employees and 120+ patent filings including 14 grants in AI, ML and blockchain technologies — can generate real-time insights from hundreds of terabytes of structured and unstructured private and public data, thereby facilitating continuous, informed decision-making for its customer base at an unprecedented speed. On April 14, 2020, Northern Data AG — providers of high-performance computing solutions — announced a strategic partnership with Innoplexus AG to accelerate drug discovery and development against COVID-19 and other diseases.
InveniAI (Connecticut US) provides AI-driven technology solutions for accelerating innovation crucial to the decision making, growth and success of an organization, serving a variety of industries, including healthcare (biopharmaceuticals, consumer healthcare, animal therapeutics, vaccines), food/nutrition, agri-tech, aviation, chemicals and government.
They have created a compelling technology platform, AlphaMeld®, that has been central to creating significant clinical, transactional and strategic impact on over 150 global collaborations. They also have a therapeutic drug pipeline that is now being developed by their sister company, BioXcel Therapeutics (a company developing high-value therapeutics in neuroscience and immuno-oncology using AI). In July 2020 InveniAI announced a strategic collaboration with GlaxoSmithKline Consumer Healthcare to leverage AlphaMeld®.
Insitro (California US, 2018), a top AI startup, integrates ML techniques into drug development. It combines life sciences, engineering and data science to define problems, design experiments, analyse the data and derive insights that lead to new therapeutics.
They use an integrated model of disease spanning in vitro cellular systems and in silico machine learning models, creating the insitro model, to discover previously unseen disease subtypes and search for interventions that move them from an “unhealthy” to a “healthy” state. The company partnered with Gilead Sciences to find medicines to treat a liver disease called nonalcoholic steatohepatitis (NASH) because of all the related human data that Gilead has amassed over time. In May 2020 it was announced that the company had raised $143 million in an oversubscribed Series B financing.
In the second part of this blog I will talk about more data-driven startups that employ ML and AI for Data Aggregation and Analysis during Drug Development (petabytes of data).
Thank you for reading 💙
And if you liked this post why not share it?
#science #health #pharma #AI_drugdiscovery #drugdiscovery #AI #biotechAI #pharma_AI