
DeepMind's AlphaGenome Uses AI to Decipher Noncoding DNA for Research, Personalized Medicine | Scientific American

Published 9 hours ago · 4 minute read

This AI system can analyze up to one million DNA letters at once, predicting how tiny changes in noncoding regions trigger everything from cancer to rare genetic disorders—and potentially revolutionizing personalized medicine

Image: DNA (deoxyribonucleic acid) sequencing and magnifying glass, illustration. Credit: KTSDesign/Science Source

The puzzle seems impossible: take a three-billion-letter code and predict what happens if you swap a single letter. The code we’re talking about—the human genome—stores most of its instructions in genetic “dark matter,” the 98 percent of DNA that doesn’t make proteins. AlphaGenome, an artificial intelligence system just released by Google DeepMind in London, aims to show how even tiny changes in those noncoding sections affect gene expression.

DeepMind’s newly released technology could transform how we treat genetic diseases. Though scientists long dismissed noncoding DNA as “junk,” we now know this so-called dark matter controls when and how genes turn on or off. AlphaGenome shows promise in predicting how mutations in these regions cause diseases—from certain cancers to rare disorders where crucial proteins never get made. By revealing these hidden control switches, AlphaGenome could help researchers design therapies that target genetic conditions, potentially aiding millions of people.

But to understand the complexity of the task for which AlphaGenome was created, one must consider how the definition of a “gene” has evolved. The term, coined in 1909 to describe invisible units of heredity (as proposed by Gregor Mendel in 1865), initially carried no molecular baggage. But by the 1940s, the “one gene, one enzyme” idea had taken hold. And by the 1960s, textbooks taught that for a stretch of DNA to be properly called a gene, it had to code for a specific protein.


Over the past two decades, the definition has broadened with the discoveries of genes that code for the numerous types of RNAs that don’t get translated into proteins. Today a gene is considered to be any DNA segment whose RNA or protein product performs a biological function. This conceptual shift underscores the genome’s real estate map: Only about 1 to 2 percent of human DNA directly codes for proteins. But with the broader definition, roughly 40 percent is gene territory.

What remains unaccounted for is significant: more than a billion units of code that can determine how and how often genes get activated. Because relevant clues lie far apart and play out through complex cycles of gene regulation, decoding them has been among biology’s hardest challenges. AlphaGenome’s goal is to understand how these regions affect gene expression—and how even tiny changes can tilt the entire body’s balance between health and disease. To do so, the AI system uses a DNA sequence with a length of up to one million letters as input—and “predicts thousands of molecular properties characterising its regulatory activity,” according to a statement issued by DeepMind.
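The single-letter-swap question can be illustrated with a toy sketch: score a reference sequence, swap one letter, score it again and compare. Everything in the code below is an assumption made for illustration. The function names are hypothetical, and `toy_regulatory_score` is a deliberately crude stand-in that counts one short motif; it is not AlphaGenome's interface or method. Only the one-million-letter input limit comes from the article.

```python
from typing import Callable

# Input limit quoted in the article; everything else here is hypothetical.
MAX_INPUT_LENGTH = 1_000_000
VALID_BASES = {"A", "C", "G", "T"}


def apply_single_letter_swap(sequence: str, position: int, alt_base: str) -> str:
    """Return a copy of `sequence` with the letter at `position` (0-based) replaced."""
    if alt_base not in VALID_BASES:
        raise ValueError(f"alt_base must be one of {sorted(VALID_BASES)}")
    if not 0 <= position < len(sequence):
        raise IndexError("position lies outside the sequence")
    return sequence[:position] + alt_base + sequence[position + 1:]


def toy_regulatory_score(sequence: str) -> float:
    """Toy stand-in for a real model: count occurrences of the motif TATAAA."""
    return float(sequence.count("TATAAA"))


def variant_effect(
    sequence: str,
    position: int,
    alt_base: str,
    predict: Callable[[str], float] = toy_regulatory_score,
) -> float:
    """Score the mutated sequence minus the reference sequence."""
    if len(sequence) > MAX_INPUT_LENGTH:
        raise ValueError("sequence exceeds the one-million-letter input limit")
    mutated = apply_single_letter_swap(sequence, position, alt_base)
    return predict(mutated) - predict(sequence)


# A 14-letter example: swapping the T at position 6 for a C breaks the motif.
reference = "GGCC" + "TATAAA" + "GGCC"
effect = variant_effect(reference, 6, "C")
neutral = variant_effect(reference, 0, "A")
print(effect, neutral)
```

In this toy case the swap inside the motif changes the score while the swap outside it does not. A real model would replace the motif counter with learned predictions across many molecular properties, but the compare-before-and-after pattern is the same idea.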

Already, AlphaGenome has replicated results from genetics labs. In a June 2025 preprint study (which has yet to be peer-reviewed), AlphaGenome’s team described using the model to run a simulation that mirrored known DNA interactions: mutations that act like rogue light switches by cranking a gene into overdrive in a certain type of leukemia. When AlphaGenome simulated interactions on a stretch of DNA containing both the gene and the mutation, it predicted the same complex chain of events that had already been observed in lab experiments.

Though AlphaGenome is currently available only for noncommercial testing, responses in the scientific community have been enthusiastic so far, with both biotech start-ups and university researchers publicly expressing excitement about the system’s potential to accelerate research.

Limits remain. AlphaGenome struggles to capture interactions between sites that are more than 100,000 DNA letters apart, can miss some tissue-specific nuances and is not designed to predict traits from a complete personal genome. Complex diseases that depend on development or environment also lie outside its direct scope. The system does suggest wide-ranging uses, however: By tracing how minute changes ripple through gene regulation, it could pinpoint the roots of genetic disorders. It could help in the design of synthetic DNA. And above all, it could offer a faster way to chart the genome’s complex regulatory circuitry.

Origin: Scientific American