AI Startups for Data Aggregation and Analysis (part 2)
AI meets Drug Development
Data Aggregation and Analysis during Drug Development
Even though the public reputation of pharmaceutical companies appears to be eroding, and a significant decline in the public’s perception of the transparency, openness and authenticity of drug makers “is in the air, everywhere I look around”, pharmaceutical companies do play a positive role in society. Corruption and lack of transparency aside, pharma’s biggest problem is its conservative approach to a process, drug development, that simply doesn’t work anymore: innovation is lacking amid digital disruption and rapid technological advances, and other issues such as poor data reproducibility persist. Accordingly, “an army” of AI pharma startups is being set up to deal with pharma’s problems.
In this second part of the blog (the first part can be found here), let’s look at some of the data-driven startups that employ ML and AI for Data Aggregation and Analysis during Drug Development, a process that involves petabytes of data.
Quertle’s (Nevada US, 2008) flagship product Qinsight™ enables unparalleled discovery of literature through AI-powered searching, integration, organization, and presentation including predictive visual analytics. Qinsight, which covers journal articles, patents, clinical trials, treatment protocols and much more, is in use by pharmaceutical and biotechnology companies, universities, research centers, and healthcare providers around the world.
Linguamatics (Cambridge UK, 2001), an IQVIA company, utilises AI to extract and analyse text. It is a software company providing high-performance text mining software based on natural language processing. The software enables the rapid extraction of business-critical facts and relationships from large document collections. It can be used for drug discovery and basic research, patent analytics, drug safety and pharmacovigilance, precision medicine, voice of the customer, real-world data, clinical trial analytics, regulatory compliance, clinical research and much more.
LabTwin (Berlin Germany, 2018) is the world’s first voice- and AI-powered digital lab assistant, working alongside scientists at the point of experimentation. It utilises AI to understand voice-based commands and transcribe voice-based notes, allowing researchers to take notes and organise lab documentation faster and with less effort. In October 2019, a new partnership was announced between LabTwin and ABI-LAB, a life science incubator and accelerator that supports biotech, medtech and medical data startups. In June 2020, LabTwin was named a Gartner Cool Vendor in Life Sciences. Gartner is the world’s leading research and advisory company, and it recognises interesting, new and innovative vendors, products and services.
Owkin’s (New York, US, 2016) mission is to use ML to develop better drugs for patients. They offer:
Owkin Loop, which connects medical researchers with high-quality datasets from leading academic research centers around the world and is powered by the two main components of Owkin’s Software Stack: Owkin Studio and Owkin Connect.
Owkin Studio, designed for medical researchers to apply AI to their research cohorts and scientific questions.
Owkin Connect, which enables the secure and federated training and validation of AI techniques over the Loop.
Owkin Lab, an in-house, award-winning interdisciplinary team of data and biomedical scientists who collaborate with partners to train AI models and design bespoke, end-to-end research solutions to meet their needs.
Owkin’s researchers have published a number of papers. In one of them, published in Nature Communications (August 2020), they presented a deep learning model, called HE2RNA, that predicts the RNA-Seq expression of tumours from whole-slide images.
Plex Research’s (Boston US, 2017) unique AI search engine technology combines breadth, depth and transparency to give you precise answers to the most difficult scientific questions. It ties together large and disparate data sources and enables scientists to draw powerful and actionable insights from the world’s scientific information, all from a single search bar. Plex impacts all stages of the drug development pipeline. They offer:
Plex Professional, a search engine that connects all types of scientific data (a broad array of public sources and databases).
Plex Enterprise, which allows organisations to make the most of their internal scientific data by connecting their own proprietary data and algorithms with the public data available in Plex Professional.
PatSnap (Singapore, 2007) has brought together the world’s most comprehensive R&D dataset in one easy-to-use platform to help innovation leaders analyse tech trends, assess new opportunities, conduct competitor intelligence and maximise return on IP assets. By combining millions of data points from patents, licensing, litigation and company information with non-patent literature (they analyse over 114 million chemical structures, clinical trial information, regulatory details, toxicity data and over 121 million patents and other sources), PatSnap provides the world’s most innovative organisations with a new, intuitive source of information to accelerate their R&D.
Percayai (Missouri, US, 2018) uses AI to organise and prioritise data in a contextual manner, enabling interactive 3D diagrams that illustrate biological information and allow researchers to rapidly generate testable hypotheses from complex omic and multi-omic data sets.
SciNote (Wisconsin US, 2015) is a platform top-rated by 70k+ scientists in 100+ countries, in academia and industry. They offer efficient digital lab management with all experimental data in one place: from note-keeping to inventory management, reporting and 21 CFR Part 11 compliance. In SciNote all your data is searchable, accessible and traceable.
Sparrho was founded in 2013 by two Oxbridge scientists out of frustration with existing literature search tools, and now has an amazing team based in London. They use AI, in combination with human expertise, to curate millions of scientific papers from thousands of publications, allowing researchers to stay up-to-date with new scientific publications and patents.
They have aggregated 60M+ research articles and patents and indexed 50K+ unique data sources, and they have 18K+ content curators (scientists, researchers, PhDs, teachers) across the globe. Sparrho’s platform hosts 155k+ curated collections and three-minute digests, and their content is enhanced by world-class researchers (130k+ scientists) from 1,500+ universities. They are planning to carry forward their success across Asian and European markets and cater to the Top-50 Pharma companies of the world.
ThoughtSpot (California US, 2012) is a business intelligence platform that helps you explore, analyse and share real-time business analytics data easily. ThoughtSpot’s AI-Driven analytics platform puts the power of a thousand analysts in every business person’s hands. They enable natural language search on billions of rows of data from any source, allowing researchers to speed analysis of clinical trial results and historical genomics data.
ThoughtSpot announced ThoughtSpot 6.2, an update to its search and AI-driven analytics platform that includes new exploration, collaboration and visualisation capabilities to help organizations unlock value from their data in record time. With new features like DataFlow, Embrace for SAP and Teradata, and the ThoughtSpot bulkloader API, enterprises have more flexibility and choice in how they leverage their data for search and AI-driven analytics, wherever that data originates.
Nference, Inc (Massachusetts US, 2013) offers an AI software platform that helps researchers extract knowledge in real time from commercial, scientific and regulatory literature, allowing them to identify competitive white space, eliminate blind spots in research and discover disease similarities by phenotype for clinical trial design. The platform enables a diverse set of applications ranging from R&D to commercial strategy and operations in the life sciences ecosystem.
In January 2020, it was announced that Nference had closed a $60 million Series B financing round to advance its augmented intelligence platform for clinical research and therapeutic development.
Kyndi’s (California US, 2014) AI platform uses ML to streamline regulated business processes, enabling enterprises and government to transform those processes with auditable AI systems for government, financial services and healthcare. The platform comprises the following AI engines and tools:
Kyndi’s Discovery Engine,
Kyndi’s Relevance Engine,
Kyndi’s Explanation Engine,
Kyndi’s Lexicon Engines.
In July 2019, Kyndi announced that it had raised $20 million in Series B funding led by Intel Capital, with participation from UL Ventures, PivotNorth Capital and existing investors.
Molecular Health (Heidelberg Germany, 2004) utilises AI to analyse the molecular and clinical data of individual patients, allowing researchers to improve the prediction of drug response and resistance, design more successful trials and use molecular evidence for market acceptance. Their cloud-based MOLECULAR HEALTH DATAOME platform analyses the molecular and clinical data of individual patients against the world’s medical, biological and pharmacological knowledge to drive more precise diagnostic, therapeutic and drug safety decisions.
In July 2020, Centogene (a commercial-stage company focused on rare diseases that transforms real-world clinical and genetic data into actionable information) and Molecular Health announced that they will collaborate exclusively to initiate the Real-life data and Innovative Bioinformatic Algorithms (RIBA) project. RIBA aims to foster a novel precision medicine environment to accelerate, de-risk and improve the development of new orphan drugs by combining large real-life data sets in rare disease with innovative big data and AI approaches, computational algorithms and expertise.
OneThree Biotech (New York US, 2018) utilises AI to integrate and analyse over 30 types of chemical, biological and clinical data, allowing researchers to generate new insights across the drug development pipeline, including Target Discovery, Lead Identification, ADME/Toxicity, and Therapeutic Positioning and Biomarkers. This 2-year-old company has just started offering free toxicity screening on potential treatments for COVID-19 and has also announced that it closed a $2.5 million seed round, with Primary Venture Partners and Meridian Street Capital as lead investors. OneThree Biotech’s researchers have published a number of papers on the arXiv pre-print server and in Cell, Nature and PLOS.
StoneWise (Beijing China, 2018) utilises AI to enable knowledge mining, molecule generation and property prediction, allowing researchers to build knowledge graphs of scientific literature, predict molecular properties, design novel molecules and perform retro-synthetic analysis. They offer:
the “me-too drugs” platform with well-curated knowledge graphs and state-of-the-art algorithms for de novo molecule design,
the “best-in-class” platform that can perform intelligent search within the space of all synthetically accessible compounds, whose size is estimated to be 10⁶⁰, and
the “first-in-class drug” platform providing technologies to accelerate the discovery of first-in-class drugs, including methods to find novel targets, as well as methods to find effective combinations of well-established targets.
This year they raised nearly $10 million in Series A funding, to quickly discover novel treatments for new and endemic infections.
Wisecube AI (Washington US, 2016) utilises AI to analyse internal and external datasets, allowing researchers to rank small molecules for drug repurposing and to optimise clinical studies. Wisecube utilises a multi-stage process called the Safe AI Development Lifecycle (Managed Datasets, Collaborative Workspaces, Visual Workflows, Easy Deployments, Centralised Catalog, Integrated Community) that ensures continuous model development and monitoring.
To conclude, it is fair to say that the pharma industry and academia have historically done a poor job of managing their data. Yet data, both preclinical and clinical, are at once a huge asset and a big “messy” problem (Replication or Reproducibility Crisis, Hidden Results, Negative Results, Raw Results, Ghostwriting, Data in Silos) for pharma and academia. So, improving the data lifecycle is undoubtedly a priority.
Thank you for reading 💙
And if you liked this post why not share it?