TechBio: Latest Trends and News
How to draw insights from a wealth of data, informing and transforming drug discovery
Hi 👋 everyone and welcome back to another edition of MetaphysicalCells on AI/ML Drug Discovery News.
First Some Weird Science Facts Worth Knowing
📌 Your Eyes Are Better Than Any Camera: The latest cellphones can capture pictures of 20 or 30 megapixels, but that’s nothing compared to what you can see with the human eye 👁️, which is capable of capturing 576 megapixels.
📌 Phobias Can Be Genetic: A 2013 study found that your phobias might be genetic 🧬, appearing due to your ancestors’ experiences.
📌 Clouds Are Heavy: The big cumulus clouds ☁️ you might see on a sunny afternoon can weigh upwards of a million pounds.
BIG BIOMEDICAL DATA
📌 During a data science challenge at Novartis in collaboration with MIT, a group of researchers used state-of-the-art ML algorithms and extensive feature engineering to predict clinical development outcomes. They developed novel models that outperformed the baseline MIT model proposed in a prior study, which used one of the largest pharmaceutical pipeline databases in the world, provided by Informa (Pharmaprojects and Trialtrove) (“Predicting drug approvals: The Novartis data science and artificial intelligence challenge”). Pharmaprojects tracks all drugs in development from bench to market with the industry’s most established and reputable drug database. Citeline’s Trialtrove supports trial decision-makers throughout the clinical trial life cycle, from strategy to execution. Drawing from over 60,000 sources, Trialtrove provides trial intelligence, curating data on trial benchmarks and metrics, enrolment and study timelines, patient populations, endpoints, outcomes, geographic distribution and more.
In particular, the two winning teams developed new predictive models that can augment human judgment to support more informed, data-driven decisions in portfolio risk management and capital allocation. In their winning solutions, the top-performing teams also delivered additional heuristics for predicting the probability of success:
Identification of novel features predictive of probability of success
Novel approaches and methodologies for feature extraction, combining domain expertise and ML
Creative ways of introducing additional data types to the problem, such as unstructured text and biochemical data—for example, several teams presented ways of connecting new data types, although this in itself did not translate into top leaderboard performance
The authors concluded that there remains a clear opportunity to further improve on the models from this competition, since more accurate models can be developed with access to better-quality and more comprehensive data and a broader pool of challenge participants.
📌 The highlights from the review “Big data: Historic advances and emerging trends in biomedical research” by Conor John Cremin, Sabyasachi Dash and Xiaofeng Huang, on the key role big data plays in understanding human diseases, are the following:
The global 🌏🌎🌍 big data market for the healthcare industry is estimated to exceed $70 billion by 2025.
At present, the amount of research data generated per day is estimated to be comparable to that previously generated in a decade. Recent reports indicate that we currently possess 44 zettabytes of data, and that data generation will grow to 463 exabytes each day globally. Given the rapid evolution of digital networks and access to the Internet of Things (IoT), 90% of the global population older than 6 years is anticipated to have an online footprint.
Several large-scale cloud-based platforms have been developed to store, process, curate and manage massive amounts of biomedical data, such as ELIXIR, which is supported and funded by the European Molecular Biology Laboratory; the Global Alzheimer’s Association Interactive Network (GAAIN); and the Genomic Data Commons, funded by the US National Cancer Institute, among others. As such, a growing need exists to develop new technologies that can handle such large-scale datasets while minimising the costs and demands on computation.
Integration of data in multi-omics approaches (such as whole genome sequencing; transcriptome sequencing: RNA-seq and ribosome profiling; proteome profiling: mass spectrometry and interactome profiling [chromosome conformation capture, ChIP-seq and hybrid assays]; single-cell genome and transcriptome sequencing of circulating tumour cells in liquid biopsies; and metagenomics) can be performed in several ways. One example is the R package HiCeekR, which enables the integration of Hi-C, RNA-seq and ChIP-seq datasets to explore chromatin configuration changes and their effects on the transcriptome of a host. Seurat, a powerful R package (versions V1 to V3) developed by an interdisciplinary team of investigators and data scientists, can integrate single-cell RNA-seq and single-cell ATAC-seq data. More recently, a multimodal reference mapping approach, which can integrate the datasets of single-cell RNA and protein profiles, has been included in Seurat V4.
After the integration of multi-omics data, co-expression analysis reveals the network characteristics of a host. In this regard, weighted gene co-expression network analysis (WGCNA) is an R package commonly used to build co-expression networks from individual RNA-seq datasets. Similarly, CEMiTool unifies the discovery and analysis of co-expression modules and provides unique results, such as genes altered upon a specific treatment or during a disease condition, as well as gene set enrichment analysis, in a user-friendly and high-quality format.
This tool also integrates transcriptomic data with interactome information that allows for the identification of potential hubs on each gene network.
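To make the idea of a co-expression network and its hubs more concrete, here is a minimal Python sketch. It is not the WGCNA or CEMiTool API (both are R packages); it only illustrates the general recipe of correlating genes across samples, soft-thresholding the absolute correlation with a power `beta`, and ranking genes by connectivity. The expression matrix is synthetic, and the function names and `beta=6` are assumptions chosen for illustration.

```python
import numpy as np


def coexpression_adjacency(expr, beta=6):
    """Unsigned soft-thresholded adjacency from a genes x samples matrix."""
    corr = np.corrcoef(expr)        # gene-by-gene Pearson correlation (rows are genes)
    adj = np.abs(corr) ** beta      # soft threshold: weak correlations shrink towards 0
    np.fill_diagonal(adj, 0.0)      # ignore self-connections
    return adj


def connectivity(adj):
    """Sum of edge weights per gene; the largest values are hub candidates."""
    return adj.sum(axis=1)


# Synthetic data (assumption): genes 0-3 share a common driver, genes 4-7 are noise.
rng = np.random.default_rng(0)
n_samples = 30
driver = rng.normal(size=n_samples)
module = driver + 0.3 * rng.normal(size=(4, n_samples))
background = rng.normal(size=(4, n_samples))
expr = np.vstack([module, background])

adj = coexpression_adjacency(expr)
k = connectivity(adj)
hub = int(np.argmax(k))
print("connectivity per gene:", np.round(k, 3))
print("hub candidate (gene index):", hub)
```

In this toy setting the hub candidate should come from the correlated module (genes 0 to 3), since the independent genes have near-zero edge weights after soft-thresholding.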
However, the largest contribution of big data has been in the development of AI and ML. Accordingly, platforms such as TensorFlow (a free and open-source software library for ML and AI) and PyTorch (an ML framework based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella) are already popular open-source platforms for performing these high-demand computations.
High-throughput next-generation genome sequencing is also contributing to data overload in biomedicine. Hence, data with high variability and velocity are handled by cloud-based ML technologies and related tools, such as Bowtie2 (an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences), HISAT (a fast and sensitive alignment program for mapping next-generation sequencing reads, both DNA and RNA), MinHashing for large-scale data sampling (a technique for quickly estimating how similar two sets are; for more, see “A Review for Weighted MinHash Algorithms”), Hadoop for large-scale data distribution, and BigBWA and SparkBWA for cloud-based read mapping.
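Since MinHashing is described above only as “a technique for quickly estimating how similar two sets are”, a small sketch may help. The Python below is a plain, unweighted MinHash written for illustration, not the implementation used by any of the tools named above. The salted BLAKE2b hashing, the 256 hash functions and the synthetic item names are all assumptions.

```python
import hashlib


def _hash(item, seed):
    """Deterministic 64-bit hash of a string, salted by the hash-function index."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=256):
    """For each of num_hashes hash functions, keep the minimum hash over the set."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree approximates the Jaccard index."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Synthetic sets (assumption): 100 shared items out of 200 distinct items overall.
set_a = {f"item{i}" for i in range(0, 150)}
set_b = {f"item{i}" for i in range(50, 200)}

true_jaccard = len(set_a & set_b) / len(set_a | set_b)
sig_a = minhash_signature(set_a)
sig_b = minhash_signature(set_b)
estimate = estimate_jaccard(sig_a, sig_b)
print(f"true Jaccard: {true_jaccard:.3f}, MinHash estimate: {estimate:.3f}")
```

The point of the design is that the two short signatures can be compared without revisiting the original sets, which is what makes the approach attractive for very large collections; more hash functions give a tighter estimate at the cost of a longer signature.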
Similarly, platforms including EMBL-ABR, tranSMART and CyVerse have been developed to manage (acquire, store, analyse and report) basic and clinical data from studies across various disease models and geographic locations.
📌 In a recent interview with MIT Technology Review, DeepMind’s cofounder Mustafa Suleyman said: “Generative AI is just a phase. What’s next is interactive AI: bots that can carry out tasks you set for them by calling on other software and other people to get stuff done”. He also calls for robust regulation, and doesn’t think that’ll be hard to achieve.
Suleyman also has a new billion-dollar company called Inflection AI, with top talent coming from DeepMind, Meta and OpenAI and, thanks to a deal with Nvidia, one of the biggest stockpiles of specialised AI hardware in the world (“DeepMind’s cofounder: Generative AI is just a phase. What’s next is interactive AI”).
Inflection AI should not be confused with the similarly named Inflection, a Silicon Valley identity management company that offers the following three solutions: GoodHire, the easiest, most flexible and most delightful employment background check experience you can find; SafeDecision API, delivering real-time background checks for companies wanting to screen to build trust without compromising growth; and Insight API, providing actionable insights with real-time access to millions of people records from thousands of trusted government sources.
📢 Lost in Transl(A)t(I)on: Differing Definitions of AI [Updated] by Holistic AI, an AI governance, risk and compliance (GRC) software platform (HAI Platform) that aims to empower enterprises to adopt and scale AI confidently.
In this blog, they analyse how AI is defined by different regulatory initiatives and bodies, including the Information Commissioner’s Office, EU AI Act, OECD, Canada’s Artificial Intelligence and Data Act, and California’s proposed amendments. Then they analyse the commonalities and differences that set them apart, centring their analysis on the system outputs, the role of humans, autonomy, and types of technology.
NEWS
👉 Paige, a global leader in end-to-end digital pathology solutions and clinical AI, built the first Large Foundation Model using over one billion images from half a million pathology slides across multiple cancer types. It is now developing with Microsoft a new AI model, configured with billions of parameters, that is orders of magnitude larger than any other image-based AI model existing today. This model assists in capturing the subtle complexities of cancer and serves as the cornerstone for the next generation of clinical applications and computational biomarkers that push the boundaries of oncology and pathology (“Paige Announces Collaboration with Microsoft to Build the World’s Largest Image-Based AI Model to Fight Cancer”).
Paige has raised a total of $220M in funding over 5 rounds.
📢 For a comprehensive guide to Life Sciences innovation terminology, visit TechBio Guide🔬: Lexicon - Top 100 Key Terms
👉 AstraZeneca's rare disease unit Alexion and AI drug discovery biotech Verge Genomics (one of the industry's most advanced all-in-human, AI-powered drug discovery companies) have partnered to identify new drug targets for rare neurodegenerative and neuromuscular diseases, with the deal potentially worth over $840m (“Verge Genomics Announces Artificial Intelligence-Enabled Drug Discovery Collaboration with Alexion for Rare Neurodegenerative and Neuromuscular Diseases”).
Verge has raised a total of $200K in funding over 1 round.
👉 AstraZeneca and Futurize have officially announced the launch of Futurize’s 🆕 HealthTech incubator program, the FuturizeU, for early-stage university startups in Africa’s 🌍🦏healthcare sector leveraging cutting-edge technology like AI and ML. In collaboration with AstraZeneca, through the A. Catalyst Network, and co-funded by Bristol Myers Squibb, the program will run from September 12 to November 17, 2023, and welcomes teams from partner universities across Sub-Saharan Africa, focusing on university students, alumni and groups from the previous two Fuel Africa cohorts, all with transformative healthcare concepts (“Futurize and Astrazeneca Unveil Futurizeu: The First-Of-Its-Kind Healthcare Incubator Across Africa”).
👉 Noetik is an AI drug discovery company leveraging spatial data to develop cancer immunotherapies (it is the biggest user of two new platforms from two different spatial companies, one for proteomics and the other for transcriptomics). It was launched with $14M in seed financing by two former leaders of Recursion Pharmaceuticals, who hope to bring a data-obsessed mentality to cancer research. Noetik combines self-supervised learning with the industrial-scale generation of human multimodal data.
Spatial biology is the new frontier of biology, defined as the study of tissues within their own 2D or 3D context. For example, spatially resolved transcriptomics technology is being applied by large consortia, such as the Human Cell Atlas and the Brain Initiative Cell Census Network (BICCN), with the goal of generating complete maps of large and complex tissues like the human brain. For more, check the “Top 10 Spatial Biology Companies”.
Furthermore, here is a webinar featuring Jiangwei Zhang, PhD (Bristol Myers Squibb) and Spencer Schwarz (Canopy Biosciences) on how innovative spatial biology platforms such as Canopy’s ChipCytometry™ technology can help Inflammatory Bowel Disease research.
👉 Valence Labs (an ML research laboratory powered by Recursion) has just joined the OpenFold consortium (a non-profit AI research consortium whose goal is to develop free and open-source software tools for biology and drug discovery), alongside UCB (an innovation-driven biopharmaceutical company in Belgium that has rich experience in Computer-Aided Drug Design) and Nvidia, to continue contributing to the growing movement of democratising ML research for the drug discovery community. OpenFold, which was founded in February 2022, is a project of the Open Molecular Software Foundation, a non-profit organisation advancing molecular sciences by building communities for open-source research software development.
👉 Glass Health, a digital notebook for clinicians, just raised a $5M round led by Initialized Capital. They use a technique called retrieval augmented generation to connect a large language model to their database of clinical guidelines created and maintained by the Glass Health Clinical Team of academic physicians. They have specifically designed Glass to be doctor-facing rather than patient-facing so that the clinician can closely supervise and review the LLM’s outputs before applying them.
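For readers new to retrieval augmented generation, the toy Python sketch below shows the two basic steps: retrieve the most relevant passage for a question, then place it in the prompt that would be sent to a language model. This is not Glass Health’s system, whose internals are not described here. The bag-of-words cosine similarity stands in for the embedding-based search that production systems typically use, the guideline excerpts are made-up placeholders, and no model is actually called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:top_k]


def build_prompt(query, passages):
    """Assemble the text a language model would receive (no model is called here)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the guideline excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {query}\n"
    )


# Made-up placeholder excerpts (assumption), not real clinical guidance.
GUIDELINES = [
    "Placeholder excerpt A: initial workup steps for suspected community acquired pneumonia.",
    "Placeholder excerpt B: follow up schedule after an ankle sprain.",
    "Placeholder excerpt C: screening intervals for type two diabetes.",
]

query = "What is the initial workup for suspected pneumonia?"
passages = retrieve(query, GUIDELINES, top_k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

Grounding the prompt in curated passages is what lets a reviewer trace an answer back to its source, which fits the doctor-facing, human-in-the-loop design described above.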
📢 What Clinicians Really Think About AI: Recent Past: Technology in Healthcare Doesn’t Live Up to the Hype
👉 Lantern Pharma, an AI company developing targeted and transformative cancer therapies using its proprietary AI/ML platform RADR, with multiple clinical-stage drug programs, just announced that it “Received IND Clearance from FDA Enabling Phase 1 Initiation for Drug Candidate LP-284 in Non-Hodgkin’s Lymphomas”. LP-284 is being developed for the treatment of relapsed or refractory non-Hodgkin’s lymphoma, including mantle cell lymphoma, double hit lymphoma and other high-grade B-cell lymphomas. Lantern expects to commence enrolment of patients for the first-in-human Phase 1 trial for LP-284 during the fourth quarter of 2023.
Lantern Pharma has raised a total of $95M in funding over 7 rounds.
👉 Nimbus Therapeutics raised $210M in a round co-led by GV, SR One and Atlas Venture (“Nimbus raises $210m for tech-enabled small molecule medicines development”). Nimbus Therapeutics said that it will continue the clinical development of NDI-101150, a haematopoietic progenitor kinase 1 (HPK1) inhibitor, in patients with solid tumours.
Nimbus Therapeutics has raised a total of $637M in funding over 9 rounds.
👉 Relay Therapeutics, a clinical-stage precision medicine company transforming the drug discovery process by combining leading-edge computational and experimental technologies, just announced that it will “Present Clinical Data on RLY-4008 in Advanced FGFR2-Altered Solid Tumors at 2023 AACR-NCI-EORTC International Conference on Molecular Targets and Cancer Therapeutics”. RLY-4008 (lirafugratinib) is a potent, selective and oral small molecule inhibitor of FGFR2, a receptor tyrosine kinase that is frequently altered in certain cancers.
Relay Therapeutics has raised a total of $520M in funding over 3 rounds.
👉 London's 🎡 health tech firm Jude has secured €3.9 million in funding to enhance the quality of life for individuals with bladder conditions. The company will further expand its range of products and services to launch symptom tracking, prescriptions and advance research into the underserved area of nutrition and urinary health (“London-based healthtech Jude bags €3.9 million to improve people with bladder conditions’ lives”).
👉 Berlin's 🇩🇪 Likeminded acquired WhyLab, a mental health tech company, aiming to bolster group-based mental health support in the workplace (“Berlin-based Likeminded acquires mental healthtech whylab to strengthen group formats in the workplace”).
👉 Finally, IOMED, a Barcelona, Spain-based 🇪🇸 company that uses AI to extract healthcare data from hospital sources, just raised €10 million in Series A funding. The round was led by Philips Ventures and XTX Ventures, with participation from Fondo Bolsa Social, Redseed, Adara Ventures, and EASO Ventures (“Barcelona-based IOMED secures €10 million to champion healthcare ecosystems with real-world data”).
Until next time 👋,
P.S. 📽️📹
On this episode of AI IRL, Bloomberg's Nate Lanxon and Jackie Davalos discuss AI companionship with Eugenia Kuyda, founder of Replika, and Ellen Huet, Bloomberg reporter, covering love (loneliness and loss) in the age of generative AI.
From virtual companions to robot therapists, some are finding that AI is getting close to the kind of emotional support, friendship and even sexual gratification they'd find in a person. But while the yearning for love and companionship is very human, the AI chatbots filling that void are very much not!!
In the video “Machine Learning Uncovers Neural Pathways of Narcissism”, discover the identified brain circuits, the associated personality traits and how data can offer rich insights into our intricate psyche.
Watch “Marc Andreessen: How Risk Taking, Innovation & Artificial Intelligence Transform Human Experience”, a discussion of what it takes to be a true innovator, including the personality traits required, the role of environment and the support systems needed to bring revolutionary ideas to fruition, as well as the role of intrinsic motivation and one’s ability to navigate uncertainty. Marc also points out that soon everyone will use AI as their personalised coach and guide for making decisions about their health, relationships, finances and more.
Exciting 24 hours or so for AI-based drug discovery, with ABCL and EXAI updates on some big partnerships!