BioStrand: The Ultimate Convenience in Multi-Omics Data Search and Analysis 🔝
Large Language Models Revolutionizing Drug Discovery
“From where we stand the rain seems random. If we could stand somewhere else, we would see the order in it.”
By Tony Hillerman, Coyote Waits
⚙️ BioStrand: Large language models (LLMs) and Antibodies ⚙️
BioStrand, founded in Belgium in 2019, is an independently operating subsidiary of ImmunoPrecise Antibodies (IPA) at the intersection of biotech discovery, biotherapeutics and AI. BioStrand has a patented LENSai™ Integrated Intelligence Technology powered by HYFTs® (Universal Fingerprints™) that seamlessly integrates massive data from diverse sources to enhance LLMs and accelerate antibody discovery.
In particular, by analyzing diverse data sources LENSai can identify new antibody targets and predict binding affinities, as well as design and optimize antibody sequences and predict potential side effects, immunogenicity and therapeutic applications. To do so, HYFT (that is their proprietary transversal language) uncovers meaningful patterns, sequences, connections and insights from data, empowering scientists to make faster and more informed decisions. Moreover, HYFT can unify, organize and standardize all omics data, which enables scientists to collaborate and perform more efficient and accurate analyses.
But what is HYFT?
HYFTs are Universal Fingerprint patterns mined throughout the whole biosphere. Linked together, they form a Knowledge Graph that comprises over 660M HYFTs and more than 25 billion relations (connections). These HYFTs can connect sequence to structure and function, but also link sequence to all types of textual information such as scientific papers and medical records. Recently, the company also added more than 20 million structural HYFTs (S_HYFTs) to this Graph, and it continues to add metadata and relations.
The platform continuously enriches its knowledge base through data looping, while NLP provides instant access to 33 million abstracts from the PubMed database, expanding the network. Overall, LENSᵃⁱ seamlessly integrates:
Sequence (DNA-RNA-Protein),
Structure (AlphaFold, ESM-2, RoseTTAFold, Cryo-EM, Crystallography) and
Text (Peer-Reviewed Literature, Patents, Clinical Trials).
The commercial release of the LENSai API (BioStrand, ImmunoPrecise Antibodies’ Subsidiary, Announces Immediate Commercial Offering of Groundbreaking Software with Customizable Interface for AI-Driven Drug Discovery) was announced on June 10, 2024, and shortly after (June 12, 2024) BioStrand was honored with the prestigious 2024 Impact Award, sponsored by InterSystems.
The 2024 Impact Award 🏆 acknowledges BioStrand’s innovative LENSai technology, selected from over 1,000 client projects for its remarkable contributions based on three key criteria:
🛸 Makes a significant difference: LENSai’s innovative approach has substantially advanced the field of biotherapeutics.
🛸 Breaks new ground: The technology introduces novel methods and solutions, pushing the boundaries of current research and development.
🛸 Sets an example: BioStrand’s achievements serve as a benchmark for other organizations, demonstrating excellence in innovation.
"New Paradigm for Biological Sequence Retrieval Inspired by Natural Language Processing and Database Research" on bioRχiv (November 13, 2023) 📚
In this manuscript, BioStrand presents its novel search algorithm involving an indexing scheme based on patterns discovered by natural language processing, i.e., short strings of nucleotides or amino acids, akin to standard k-mers (substrings of length k contained within a biological sequence), but mined from cumulative cross-species omic data repositories.
Their results suggest that the HYFT-indexing and searching is a good alternative and a static, alignment-free method to retrieve homologous sequences down to 50% sequence identity.
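To make the indexing idea more concrete, here is a minimal sketch of alignment-free retrieval with an inverted index. It uses plain k-mers, not HYFTs (the HYFT patterns themselves are proprietary and are not described in enough detail here to reproduce), and the function names, the toy sequences and the choice of k = 5 are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every substring of length k contained in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(sequences, k):
    """Map each k-mer to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for kmer in kmers(seq, k):
            index[kmer].add(seq_id)
    return index


def search(index, query, k):
    """Rank indexed sequences by how many distinct query k-mers they share."""
    hits = defaultdict(int)
    for kmer in set(kmers(query, k)):
        for seq_id in index.get(kmer, ()):
            hits[seq_id] += 1
    # Best match first; ties broken alphabetically by id.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


# Toy protein fragments, made up for illustration only.
toy_db = {
    "seqA": "MKTAYIAKQRQISFVKSHFSRQ",
    "seqB": "MKTAYIAKQRQLSFVKSHFSRQ",  # one substitution relative to seqA
    "seqC": "GSSGSSGSSGSSGSSG",
}
toy_index = build_index(toy_db, k=5)
ranking = search(toy_index, "TAYIAKQRQISFV", k=5)
print(ranking)
```

The index is built once, so each query only costs a handful of dictionary lookups instead of a pairwise alignment against every stored sequence. That is the sense in which such a search is "static" and "alignment-free".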
LENSai Applications
▶️ In Silico Discovery: LENSai Epitope Binning
In a case study validating LENSai in silico antibody epitope binning against classical wet lab binning, LENSai was used for epitope binning (a crucial step used to characterize the binding of monoclonal antibodies to a target protein) on 29 antibody sequences directed against a transmembrane protein. The results were then compared with classical wet lab binning, indicating that BioStrand's AI accurately predicted antibody-antigen binding interactions. Moreover, the AI identified similar antibodies that shared epitopes, aiding in grouping them together.
This case study demonstrates how BioStrand's AI technology:
can significantly improve the epitope binning process for a faster, more accurate and cost-effective development of therapeutic monoclonal antibodies, and
also offers a green, environmentally friendly solution that is safer for the planet by reducing lab waste 🗑️ from unnecessary wet lab experiments.
Overall, the in silico epitope binning powered by LENSai offers a pivotal advancement with its ability to analyze over 5,000 sequences, delivering rapid insights for early triaging (obtaining results within hours for small subsets to 2 weeks for >5k antibodies). This advanced method allows you, within large pools of antibodies, to detect clusters of clones that share similar target binding regions ensuring a thorough and inclusive selection process. LENSai epitope binning smart algorithms enhance biological research, offering accurate, high-throughput candidate selection while reducing time and costs.
The LENSai proprietary Epitope Binning algorithm includes:
antibody sequential and structural profiling,
docking information, accounting for steric hindrance and glycosylation sites and
atomic interactions of Ab-Ag complexes.
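For readers who like to see the grouping step in code, here is a small sketch of how antibodies can be sorted into bins once some pairwise similarity score is available. This is not the LENSai algorithm, which is proprietary. It is a generic single-linkage clustering, and the similarity values, the 0.8 threshold and the function name are hypothetical.

```python
def bin_antibodies(names, similarity, threshold):
    """Group antibodies into bins by single linkage on pairwise similarity."""
    parent = {name: name for name in names}

    def find(name):
        # Walk up to the representative of the bin, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the bins of every pair whose score reaches the threshold.
    for (a, b), score in similarity.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    bins = {}
    for name in names:
        bins.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in bins.values())


# Hypothetical scores; in practice they would come from profiling and docking.
toy_names = ["Ab1", "Ab2", "Ab3", "Ab4", "Ab5"]
toy_similarity = {
    ("Ab1", "Ab2"): 0.91,
    ("Ab2", "Ab3"): 0.85,
    ("Ab1", "Ab3"): 0.40,
    ("Ab4", "Ab5"): 0.88,
    ("Ab3", "Ab4"): 0.10,
}
epitope_bins = bin_antibodies(toy_names, toy_similarity, threshold=0.8)
print(epitope_bins)
```

Note that Ab1 and Ab3 end up in the same bin even though their direct score is low, because both are close to Ab2. Whether that chaining behavior is desirable depends on the assay, which is one reason real binning pipelines use richer evidence than a single score.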
Moreover, apart from therapeutic monoclonal antibodies, BioStrand’s platform could also help researchers fight the data reproducibility crisis (for more about the reproducibility crisis: source, source) in all biomedical realms, where unreliable antibodies used in different assays are in part responsible: Reproducibility crisis: Blame it on the antibodies!
But what is wrong with the current commercial antibodies? Well... they are littering the field with false findings!
When the Human Protein Atlas, a Swedish consortium that aims to generate antibodies for every protein in the human genome, looked at some 20,000 commercial antibodies, it found that fewer than 50% could be used effectively to look at protein distribution in preserved slices of tissue.
Moreover, 80% of all life scientists use antibodies, spending $3 billion 💰 per year on millions of different products. But the largest vendors compete on catalog size, so they often buy antibodies from smaller suppliers, relabel them and offer them for sale. According to Bernard, head of the biotechnology consultancy Pivotal Scientific in Upper Heyford, UK, this means that the 2 million antibodies on the market probably represent 250,000–500,000 unique ‘core’ antibodies. So, in the end you have 80% of all life scientists spending time and money testing the same antibodies (but with a different brand name) over and over again, wasting roughly $1.5 billion per year. And that’s a lot of money and time wasted!
For this reason, BioStrand’s AI-assisted antibody discovery platform, which facilitates the categorization of antibodies with similar binding regions on the target, can play a significant role in antibody optimization.
After all, good antibodies (used throughout the entire drug development process, from preclinical to clinical phase, in different assays) are like good life partners... so design them carefully and pick them wisely.
▶️ In Silico Discovery: LENSai Immunogenicity Screening
But apart from antibody discovery, BioStrand can also help researchers accelerate the preclinical in vivo phase with in silico screening that speeds up humanization (IPA Released New HYFT-Powered In Silico Humanization Platform, Aims to Disrupt the Transgenic Animal Model Market). Existing methods can only assess candidates against a subset of the human proteome, which can result in an incomplete evaluation of a protein's humanness and immunogenicity. The immunogenicity of protein therapeutics has so far proven difficult to predict in patients, with many biologics inducing undesirable immune responses towards the therapeutic, resulting in reduced efficacy, anaphylaxis and occasionally life-threatening autoimmunity. However, BioStrand uniquely addresses these limitations with several distinct advantages.
Accordingly, BioStrand’s innovative solution aims to significantly expedite the early stages of drug discovery by enabling the early elimination of less promising therapeutic candidates, thereby reducing time, cost and the risk of failure during later stage discovery. In particular, BioStrand's patented HYFT technology screens the entire human proteome (that is, the entire complement of proteins that is or can be expressed by a cell, tissue or organism), as well as various animal proteomes, against candidate therapeutics in just under one minute per candidate molecule. This process advances the most 'human-like' molecules, speeding up humanization, reducing the preclinical in vivo phase required before use in clinical trials, and offering a rapid, scalable and cost-effective alternative to transgenic animal models for biotherapeutics. The advantages offered in this way include:
use of the most comprehensive human proteome reference set, enabling more thorough comparisons,
the ability to extract unique Universal Fingerprint™ patterns from biological sequences,
the capacity to compare humanized therapeutics to the human proteome, evaluating their similarity to endogenous proteins,
the ability to capture both sequence and structural information, enabling a better understanding of the protein's function and
scalability and high-throughput capabilities, allowing for rapid assessments of numerous candidate proteins against extensive reference datasets.
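As a rough illustration of what "comparing a candidate to a reference proteome" can mean computationally, the sketch below scores a candidate by the fraction of its short peptide windows that also occur somewhere in a reference set. This is a generic, simplified stand-in and not BioStrand's HYFT-based method. The 9-residue window, the function names and the toy sequences are assumptions.

```python
def peptides(seq, length=9):
    """Return the set of all windows of the given length in seq."""
    return {seq[i:i + length] for i in range(len(seq) - length + 1)}


def humanness_score(candidate, reference_proteome, length=9):
    """Fraction of the candidate's distinct peptides found in any reference protein."""
    reference = set()
    for protein in reference_proteome:
        reference |= peptides(protein, length)
    candidate_peptides = peptides(candidate, length)
    if not candidate_peptides:
        return 0.0  # candidate shorter than one window
    return len(candidate_peptides & reference) / len(candidate_peptides)


# Toy reference "proteome" with a single made-up protein fragment.
toy_reference = ["EVQLVESGGGLVQPGGSLRLSCAAS"]

score_native = humanness_score("EVQLVESGGGLVQ", toy_reference)
score_foreign = humanness_score("EVQLVESGGWWWW", toy_reference)
print(score_native, score_foreign)
```

A real screen would run against the full human proteome, and would weigh each foreign window by how likely it is to be presented to the immune system, rather than simply counting it.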
Boost 🔌 your clinical strategy with LENSai Immunogenicity Screening
In silico screening that helps mitigate risks, enhance efficacy and reduce time/costs.
Overall, LENSai for protein analysis and fast immunogenicity screening (which measures the antibodies generated against your pipeline, allowing you to determine efficacy and safety with confidence) combines HLA II binding (an HLA-peptide binding assay determines the ability of each candidate peptide to bind to one or more class II HLA molecules) and human proteome presence for comprehensive risk assessment and in-depth profiling, offering:
detailed linkage between clone and target,
geno- and phenotype binding distribution mapped to target indication profile,
linkage of target, lead and clinical events and
linkage of MHC II allele (known in humans as the human leukocyte antigen, HLA) phenotypes and genotypes associated with clinical events.
But LENSai can do more.
▶️ Analysis and Integration of EHR Data
On June 4, 2024, IPA announced that BioStrand is leveraging its patented Foundation AI Model, LENSai, by applying advanced LLMs to capture real-world data from Electronic Health Records (EHR). To be specific, LENSai can significantly enhance the analysis and integration of EHR data, enabling the integrated use of real-world data and evidence in drug discovery and the development of precision medicines (IPA’s Subsidiary, BioStrand, Announces Advanced Large Language Model (LLM) for Electronic Health Records (EHR)).
More good news about BioStrand comes from the collaboration front.
Collaborations 🤝
On June 25, 2024, IPA announced that its subsidiary BioStrand is now collaborating with PGxAI to leverage BioStrand's patented Foundation AI Model, LENSai, to advance the precision medicine field of pharmacogenomics and to support the LENSai commercial rollout by expanding LENSai features and market reach.
PGxAI is a leading AI-powered pharmacogenetics platform that is transforming precision medicine through proprietary algorithms and real-world data analysis. In partnership with InterSystems (the leading provider of data management solutions for industries with complex challenges, working in collaboration with AI experts from Microsoft, Google, Meta and Amazon), PGxAI utilizes advanced tools like vector search and generative AI to analyze extensive real-world data, addressing challenges in drug selection, dosage personalization and identifying significant drug-drug interactions. Their database is anchored in the vast reservoir of information from PharmGKB, the premier repository aggregating recommendations from pharmacogenetic consortiums (CPIC, DPWG, CPNDS, RNPGx) and state regulators (FDA, EMA, Swissmedic), supplemented with other pharmacogenetic research findings.
Moreover, on March 28, 2024 InterSystems together with IPA announced a landmark collaboration that will integrate the new vector search capability of the InterSystems IRIS® data platform with IPA’s subsidiary BioStrand's LENSai platform, setting a new standard for AI-driven applications in healthcare and life sciences.
Let’s go back 🔙 for a moment to the BioStrand acquisition by IPA.
Back on March 29, 2022, IPA (NASDAQ: IPA) announced that it had entered into a definitive share purchase agreement to acquire BioStrand BV, BioKey BV and BioClue BV (collectively referred to as “BioStrand”), a group of Belgian biotech entities and pioneers in the field of bioinformatics and biotechnology. IPA, a biotherapeutic innovation-powered company that supports its business partners in their quest to discover and develop novel antibodies against a broad range of target classes and diseases, paid approximately €20M for BioStrand plus a potential earnout consideration.
Soon after the acquisition by IPA (November 30, 2022), BioStrand entered into its first research collaboration and license agreement with BriaCell Therapeutics Corp (NASDAQ: BCTX), a clinical-stage biotechnology company specializing in targeted immunotherapies for cancer, in order to leverage BioStrand’s LENSai software. Upon successful antibody discovery, BioStrand will receive an upfront payment of US$500,000, and will be eligible to receive future success-based development milestones, including those for the submission of Investigational New Drugs (INDs), clinical milestone payments and commercial royalties on net sales of products.
The mind behind everything just described is Ingrid Brands, co-founder and former CEO of BioStrand, who now holds the position of General Manager, guiding the company's path towards becoming a global leader in the genetic research domain. But the biotech startup is more than just a company, since it is a family affair 💍💑: Ingrid’s husband Dirk Van Hyfte is more than a co-founder and Head of Innovation.
Let’s talk now about something different: “Patterns and BioStrand”.
✴️ Patterns ✴️
Regarding patterns, LLMs and search algorithms, BioStrand is a company on its way to potentially solving the Information Integration Dilemma in systems biology (Cracking the Information Integration Dilemma (IID) in systems biology) by learning how to integrate, standardize and curate “ALL data omics complexity” into one comprehensive, contextual, scalable data matrix that will change forever the way we do research.
Information integration is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. It is used in data mining and consolidation of data from unstructured or semi-structured resources.
When it comes to omics data and biology, cracking the Information Integration Dilemma means finding this “something” that connects the ▶️ genetic code with the ▶️ transcriptome and the ▶️ proteome in the realm of mathematics, physics and information science, beyond chemistry and beyond biology.
Think of this “something” as a “shadow code”, like the theater of shadows 👥, where a beam of light behind the actors (for example the mRNAs) places them in silhouette so the audience (for example researchers doing sequence analysis) only sees their outline (the 1D structure of the mRNAs). In actual reality, though, this “something” works like software in a higher dimension (4D or even 5D) that runs a manufacturing site, allowing our cells to produce in 3D, in a highly sophisticated and hierarchical way, all the mRNAs at any given point in time and space!
This “shadow code” (or data matrix) orchestrates the entire cell manufacturing line from transcription to translation of each individual cell (and in synchronicity of all our trillions of cells), and goes back to our origins, where probably nothing happened randomly. Furthermore, add to this “cell manufacturing” complexity all the different dimensions involved. And here comes the best part: the foundational discovery of BioStrand was that the HYFTs are finite in nature (contrary to k-mers), a fact that implies that there is an underlying system (a “shadow code”) orchestrating everything. Once BioStrand realized the importance of this discovery, they indexed all the HYFTs as data objects in order to analyze them, connect them and understand nature.
Keep in mind that the complexity of biological information is that it relates to multiple dimensions such as functions of proteins (let’s call that 4D, because time is also involved when you consider function), 3D structures of molecules, genetic information (2D and 1D) involving the whole process of translating DNA code into RNA and proteins, and much more. In the end, all these dimensions need to be captured to understand biological systems and create meaningful predictions and insights. That is why HYFTs turn out to be extremely useful.
To conclude, the unique feature of HYFTs is their ability to connect in one single network crucial dimensions in biology such as ➡️ sequence information (that is the 1D and 2D of DNA, RNA or proteins), ➡️ information on 3D structures of proteins, ➡️ function (4D) and ➡️ information found in scientific literature and other sources of knowledge (that is us, in our dimension, producing data). To connect all these dimensions, the HYFT network has over 660 million HYFT patterns and 25 billion relations (connections).
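To picture what "one single network" means in practice, here is a toy sketch of a graph in which a pattern node links sequences, structures and texts, so that a sequence can be connected to literature through the patterns it contains. The class, the node labels and the example links are all hypothetical. The real HYFT Knowledge Graph and its schema are not public in this level of detail.

```python
from collections import defaultdict


class ToyKnowledgeGraph:
    """Undirected graph whose nodes are (kind, identifier) tuples."""

    def __init__(self):
        self.edges = defaultdict(set)

    def relate(self, node_a, node_b):
        """Record a relation between two nodes in both directions."""
        self.edges[node_a].add(node_b)
        self.edges[node_b].add(node_a)

    def neighbours(self, node, kind):
        """Return the neighbours of node that have the requested kind, sorted."""
        return sorted(n for n in self.edges.get(node, ()) if n[0] == kind)


def literature_for_sequence(graph, sequence_node):
    """Follow sequence -> pattern -> text links and collect the texts reached."""
    texts = set()
    for pattern in graph.neighbours(sequence_node, "pattern"):
        texts.update(graph.neighbours(pattern, "text"))
    return sorted(texts)


# Made-up nodes and relations for illustration only.
graph = ToyKnowledgeGraph()
p1, p2 = ("pattern", "P1"), ("pattern", "P2")
graph.relate(p1, ("sequence", "seqA"))
graph.relate(p1, ("sequence", "seqB"))
graph.relate(p1, ("structure", "helix-1"))
graph.relate(p1, ("text", "abstract-42"))
graph.relate(p2, ("sequence", "seqB"))
graph.relate(p2, ("text", "patent-7"))

shared = graph.neighbours(p1, "sequence")
texts_for_b = literature_for_sequence(graph, ("sequence", "seqB"))
print(shared, texts_for_b)
```

The point of the design is that the pattern is the hub: two sequences that never appear in the same paper can still be related because they share a pattern, and a sequence with no annotation can inherit context from the texts linked to its patterns.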
One more thing: although LLMs are trained on big datasets, they tend to lack understanding and comprehension of complex biological systems. The HYFT technology helps overcome this big problem for LLMs in biotechnology, where computer programs need to grasp the interrelated nature of all biological information to give useful results. In fact, by using the HYFT patterns as “words”, which carry information and knowledge and thus are connected meaningfully, BioStrand’s platform introduces ‘meaning’ to LLMs. On top of that, BioStrand allows you to analyze at the sub-sequence level, which is key to getting a finer grain than other methods out there, since at the moment LLMs operate only on sequence-level information, which is very noisy.
“The integration of HYFT technology with the most recent AI developments such as LLMs but also with the 3D protein structure prediction capabilities of AlphaFold-2, a protein structure analysis AI created by DeepMind, an Alphabet business focused on AI, as well as ESM-2, a ground-breaking approach developed by Meta AI researchers to predict protein structure, creates a very powerful system for our AI-based learning. This next-level improvement will accelerate our antibody discovery process and may open new possibilities in precision medicine.”
Dr. Dirk Van Hyfte, Head of Innovation and Co-Founder at BioStrand
But who is IPA?
⚛️💟 ImmunoPrecise Antibodies Ltd (IPA) 💟⚛️
IPA (1983, Canada) is a progressive, scientific Contract Research Organization (CRO) recently ranked by one of the pharmaceutical industry’s most trusted independent market research sources with the highest competitive score for its antibody service portfolio.
The company represents a HUB of biotherapeutic intelligence that includes a hybrid of experts and technologies, in the science and business of bioplatform-based discovery. They provide highly specialized full-continuum therapeutic antibody discovery, development, and out-licensing services (Talem)—with advanced omics and complex intelligence technology (LENSᵃⁱ) that provide greater efficiency and accuracy than ever before. Talem Therapeutics LLC is also a subsidiary of IPA and is focused on the discovery and development of next-generation, human, monoclonal, therapeutic antibodies.
IPA also has a research collaboration and exclusive option license agreement (as of March 30, 2023) with Xyphos Biosciences, Inc (a wholly owned subsidiary of Astellas Pharma Inc, TYO: 4503) to jointly conduct research activities to identify and optimize proprietary LENSᵃⁱ in silico generated antibodies, targeting an undisclosed target in the tumor microenvironment, as potential therapeutic development candidates. Targeting this molecule has the potential to markedly enhance anti-tumor immunity with other Astellas therapies including chimeric antigen receptor-based (CAR) technologies. Astellas will have the exclusive option to license any development candidates generated as part of the collaboration.
On August 19, 2024, IPA announced a groundbreaking achievement 🤽: the ability to engineer antibodies entirely through computer simulations using LENSai.
This marks a significant milestone for the biotechnology industry. Additionally, the antibodies produced by IPA are highly specific to a challenging oncology target located within the Tumor Microenvironment (ImmunoPrecise Antibodies Successfully Engineers in silico Antibodies to Elusive Tumor Protein Using Its Patented LENSai Technology).
Finally, on March 20, 2024, IPA announced that it has acquired the LSA® instrument platform from Carterra, a leading provider of high-throughput large and small molecule screening and characterization solutions. This instrument allows for high-throughput surface plasmon resonance-based antibody characterizations, thereby significantly increasing IPA’s capacity to perform various label-free protein interaction analyses including best-in-class kinetics, epitope binning, quantitation, epitope mapping and blocking/neutralization assays.
Dr. Jennifer Bath is the Chief Executive Officer and President and Frederic Chabot is Head of Corporate Development at IPA.
Until next time 💮
Very good writeup Marina
To add to this beautiful piece, BioStrand just released another case study on the tumor microenvironment, where LENSai has been successful in yielding 3 antibodies, 100% computer based, with desired characteristics in this difficult area.
https://26206544.fs1.hubspotusercontent-eu1.net/hubfs/26206544/IPA/Content/Case%20Studies/AntibodyOptimization_A8.pdf