How AI Is Powering Earli's Mission

November 25, 2024

By Henry Lee

At Earli Inc., we believe earlier detection and treatment of cancer can dramatically improve patient outcomes and save lives. Like the rest of the world, we have been thrilled by recent AI breakthroughs from OpenAI, DeepMind, and the Baker Lab. But our journey with AI began years before these developments: Earli wouldn’t be here without a heavy dose of machine learning and AI.

AI Step One for Earli:
Automating Analysis of Pre-Clinical Positron Emission Tomography (PET)

Earli is about early cancer diagnosis and treatment. A cancer diagnosis requires the exact location of the malignant tumor. After all, no clinician will treat a cancer they cannot clearly see in an image.

The problem is that medical image analysis traditionally requires painstaking manual work, particularly in segmentation, the process of identifying and labeling tissue boundaries. This involves hand-drawing regions of interest (ROIs) on anatomical images from CT or MRI scans, which are then combined with PET scans to isolate metabolic activity within specific anatomical compartments. Even for experts adept at identifying these tissues, the process is highly time-intensive, taking up to six hours per scan. Moreover, ROI boundaries are subjective and often inconsistent between analysts.
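To make the combination of ROIs and PET data concrete, here is a minimal sketch of how a label mask isolates uptake per compartment. It assumes the anatomical labels and PET values are already co-registered and flattened into voxel lists of equal length. The function name `roi_mean_uptake`, the label numbers, and the values are invented for illustration and are not Earli's pipeline.

```python
from collections import defaultdict

def roi_mean_uptake(labels, pet, background=0):
    """Return the mean PET value for each ROI label.

    `labels` and `pet` are flattened, co-registered voxel lists (an assumption
    of this sketch); voxels labeled `background` are ignored.
    """
    if len(labels) != len(pet):
        raise ValueError("label mask and PET image must have the same number of voxels")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, value in zip(labels, pet):
        if label == background:
            continue
        totals[label] += value
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}

# Toy data: label 1 and label 2 stand for two hypothetical ROIs.
labels = [0, 1, 1, 2, 2, 2, 0]
pet = [0.1, 2.0, 4.0, 1.0, 2.0, 3.0, 0.2]
uptake = roi_mean_uptake(labels, pet)
print(uptake)
```

The quality of the per-ROI numbers depends entirely on where the label boundaries fall, which is why slow or inconsistent hand-drawn ROIs are the bottleneck.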

When Earli launched, we decided to approach this very differently: deploy AI to help read images and automate as much of the analysis as possible. Unfortunately, when the Earli journey began in 2018, few forms of AI were available (no LLM tools, for example). So, to perform pre-clinical medical image analysis automatically, the Earli team had to get creative.

We looked at existing successful implementations of convolutional neural network models (like ResNet and U-Net) for three-dimensional medical image segmentation, with nnUNet as the standout performer in the Medical Segmentation Decathlon.

Applying nnUNet directly did not work for pre-clinical image analysis: the anatomical differences between humans and pre-clinical animal models (mice, dogs, and pigs) prevented us from reusing the existing clinical, human-based models. Meanwhile, the only available alternative for non-human models, AIMOS, could segment organs in healthy mouse anatomy but lacked crucial tumor-detection capabilities.

To tackle this challenge, Earli iterated on and extended nnUNet's three-dimensional preprocessing pipeline and model architecture. We trained the resulting model on proprietary pre-clinical CT data we obtained from three institutions using different microCT machines. Within two months, we had an AI model capable of automatically segmenting eight ROIs, reducing analysis time from six hours to fifteen minutes per animal (a 24-fold reduction), with more accurate and consistent results.
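The post does not say how accuracy and consistency were measured. One common way to quantify agreement between two segmentations of the same ROI is the Dice coefficient, 2|A∩B| / (|A| + |B|). The sketch below assumes binary masks flattened into lists, and the example masks are invented.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as equal-length sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size == 0 else 2.0 * intersection / size

# Two hypothetical analysts outlining the same ROI slightly differently.
analyst_1 = [1, 1, 1, 0, 0]
analyst_2 = [0, 1, 1, 1, 0]
score = dice(analyst_1, analyst_2)
print(round(score, 3))
```

A score of 1.0 means identical masks and 0.0 means no overlap, so the same function can compare analyst against analyst or model against a reference annotation.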

AI Step Two: 
Help design the protein output of Earli’s “cancer cell factories”

After the encouraging first use of that AI model, the Earli team became deeply interested in leveraging AI for its core science. Earli’s technology enables cancer-activated gene expression through non-viral delivery methods, effectively turning cancer cells into factories that produce specific proteins for diagnostic or therapeutic purposes. This capability opened new horizons for engineering the payloads produced from these vectors, and AI, we realized, could play a key role in several parts of that story.

For instance, using computational methods, we can now engineer proteins with enhanced stability for longer-lasting effects. We can modify proteins to eliminate toxic properties while designing variants with higher target affinity. Recently, we developed completely synthetic cytokines that have the potential to trigger more effective immune responses against tumors than natural cytokines. The development process began with the backbone design, in which we started with an endogenous cytokine and strategically preserved key amino acid residues. Next, during the protein inpainting phase, a deep neural network generated unique sequences for the remainder of the protein. We then used AlphaFold 2 to predict the resulting conformations and confirm structure and binding. Our scientists reviewed each cytokine conformation and its affinity to target molecules, removing regions with undesired behavior before proceeding through additional rounds of protein inpainting and confirmation. This iterative process continued until we achieved the desired properties, followed by rigorous biological validation.
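The preserve, inpaint, and confirm cycle described above can be sketched as a simple loop. Everything here is a stand-in: `inpaint` resamples residues at random where the real process uses a deep neural network, `toy_score` is an arbitrary placeholder for the AlphaFold 2 structure and binding check, and the template sequence and kept positions are invented. It illustrates only the shape of the iteration, not Earli's method.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def inpaint(template, keep, rng):
    """Stand-in for the inpainting network: keep chosen residues, resample the rest."""
    return "".join(
        residue if i in keep else rng.choice(AMINO_ACIDS)
        for i, residue in enumerate(template)
    )

def toy_score(sequence):
    """Arbitrary placeholder score (fraction of hydrophobic residues).

    In a real pipeline this is where structure prediction and binding
    assessment would go.
    """
    return sum(sequence.count(aa) for aa in "AILMFVW") / len(sequence)

def design(template, keep, rounds=5, candidates=20, seed=0):
    """Repeat inpainting and scoring, carrying the best candidate into the next round."""
    rng = random.Random(seed)
    best, best_score = template, toy_score(template)
    for _ in range(rounds):
        for _ in range(candidates):
            candidate = inpaint(best, keep, rng)
            score = toy_score(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score

template = "MKTAYIAKQR"   # hypothetical starting sequence
keep = {0, 3, 7}          # hypothetical positions of preserved key residues
best, best_score = design(template, keep)
print(best, round(best_score, 2))
```

The property worth noticing is that the preserved positions never change, while every other position is free to vary from round to round.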

Earli’s Coming AI Step Three: 
An “AI flywheel” that gets stronger and stronger

What’s next for AI at Earli? The heart of the Earli platform is the vision of co-founder Dr. Sam Gambhir: use cancer-activated gene expression to drive the production of payloads useful for diagnosing early-stage malignancies. Although the original concept used a truncated sequence from an endogenous promoter, the scientific team at Earli has since combined bioinformatic analyses of thousands of tumor samples from cancer patients, ML-based filtering of that data, and empirical testing of thousands of unique sequences in massively parallel reporter assays. The result of these efforts allows us to piece together regulatory elements called transcription factor binding sites into de novo synthetic promoters that have a remarkable ability to activate selectively within cancer samples across a wide range of genetic backgrounds. This process pairs cutting-edge compute power with innovative empirical science, and it took the Earli team three years of development work.

Now that the process has been established, there is an exceptional opportunity to feed the empirical output from these initial experiments into a newly developed “AI flywheel” that dramatically accelerates and improves the iterative process. In this, our most ambitious AI project to date, the results from one data generation round will inform the next round of development, making the data more and more predictive, accurate, and applicable. That should accelerate the development timeline and drug candidate nominations, and perhaps even enable patient-specific efficacy predictions down the road.
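The round-by-round idea, in which each batch of measurements informs which designs are tested next, can be sketched as a small loop. This is a toy under loud assumptions: the motif names, the hidden effect sizes, the `simulated_assay` function, and the averaging model in `fit_weights` are all invented for illustration and say nothing about Earli's actual models or assays.

```python
import itertools
import random

# Hypothetical motif names and effect sizes, invented for illustration only.
HIDDEN_EFFECT = {"TFBS_A": 1.0, "TFBS_B": 0.2, "TFBS_C": -0.5, "TFBS_D": 0.8, "TFBS_E": 0.1}
MOTIFS = sorted(HIDDEN_EFFECT)

def simulated_assay(design, rng):
    """Stand-in for an empirical readout: hidden additive effects plus noise."""
    return sum(HIDDEN_EFFECT[m] for m in design) + rng.gauss(0.0, 0.05)

def fit_weights(measured):
    """Crude model: a motif's weight is the mean activity of measured designs containing it."""
    weights = {}
    for motif in MOTIFS:
        values = [activity for design, activity in measured.items() if motif in design]
        weights[motif] = sum(values) / len(values) if values else 0.0
    return weights

def run_flywheel(rounds=3, batch=2, seed=0):
    """Measure, refit, pick the most promising untested designs, and repeat."""
    rng = random.Random(seed)
    candidates = list(itertools.combinations(MOTIFS, 2))  # ten two-motif designs
    measured = {d: simulated_assay(d, rng) for d in rng.sample(candidates, 3)}
    for _ in range(rounds):
        weights = fit_weights(measured)
        untested = [d for d in candidates if d not in measured]
        untested.sort(key=lambda d: sum(weights[m] for m in d), reverse=True)
        for design in untested[:batch]:
            measured[design] = simulated_assay(design, rng)
    return measured

measured = run_flywheel()
print(len(measured), "designs measured")
```

Each pass through the loop enlarges the dataset that the next fit sees, which is the sense in which the flywheel is meant to get stronger with use.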

Existing models like DNABERT and MegaMolBART, while powerful in their domains, operate in isolation. Biology demands a more integrated approach. Earli's technology has always spanned the worlds of biology, engineering, and software. That intersectional thinking positions Earli uniquely to develop a proprietary multi-modal “AI flywheel for programmable disease control,” building upon the first six years of groundbreaking research.

That’s what AI means to us at Earli.