Can AI Build a Virtual Cell? Scientists Race to Simulate Life’s Smallest Unit

Researchers worldwide are accelerating efforts to create AI-driven virtual cell models, which promise to transform research into disease and gene regulation and to deepen our understanding of complex biological systems. Stephen Quake, head of science at the Chan Zuckerberg Initiative (CZI), envisions a future in which computational modeling takes precedence over traditional lab experiments: “Our goal is to flip cell biology from 90% experimental to 90% computational.”

These virtual cells aim to predict how diseases progress or how cancer cells react to treatments, transforming the speed and precision of computational biology. CZI plans to invest hundreds of millions of dollars in virtual cell projects, and Google DeepMind is pursuing similar ambitions, according to its CEO, Demis Hassabis.

In Europe, Jan Ellenberg is co-leading the development of Alpha Cell, a powerful virtual cell model expected in 2026, combining single-cell datasets with advanced machine learning. Yet despite growing enthusiasm, experts caution that concrete results are still limited, and clear frameworks are lacking.

Computational modeling of whole cells dates back to 2012, when scientists created a complete digital model of Mycoplasma genitalium, a bacterium with only 525 genes. Unlike such early, mechanism-heavy efforts, today’s virtual cell projects harness AI’s ability to find patterns in massive datasets, much as large language models learn from vast amounts of text. Quake notes, “Training models directly on data is a game changer.”

Modern models rely heavily on single-cell RNA sequencing data, which have also enabled comprehensive cell atlases that reveal cellular diversity across species. Institutes such as Arc and CZI are rapidly generating single-cell datasets; CZI alone plans to release sequencing data from one billion cells, adding to its current library of more than 100 million.

Hani Goodarzi of the Arc Institute notes that the sheer scale of single-cell data, which runs to billions of data points, makes it well suited to AI, much like the datasets that power language models.

However, models trained only on single-cell data still face limitations. The Arc Institute’s State model, for example, struggles to generalize to new datasets. To truly realize virtual cells, researchers agree, it is crucial to integrate additional data types such as microscopy images, which reveal how cellular components interact dynamically. “We need more than single-cell sequencing,” Ellenberg emphasizes.

Defining a virtual cell remains another challenge. “There’s no consensus on what a virtual cell truly means,” says Jonah Cool, who leads CZI’s billion-cell project. Tim Mitchison of Harvard Medical School echoes this uncertainty, suggesting practical near-term models might focus on specific cell types or biological processes like gene regulation.

Quake acknowledges the road ahead: “Our vision of replacing most lab work with AI modeling won’t happen overnight. Both biologists and AI models need time to adapt.”

By combining AI, large-scale single-cell datasets, and molecular modeling, virtual cells could one day become essential tools for rare-disease research and precision medicine, reshaping how we study, diagnose, and treat disease.
