AI-designed DNA controls genes in healthy mammalian cells for first time

A study published today in the journal Cell marks the first reported instance of generative AI designing synthetic molecules that can successfully control gene expression in healthy mammalian cells. Researchers at the Centre for Genomic Regulation (CRG) created an AI tool which dreams up DNA regulatory sequences not seen before in nature. The model can be told to create synthetic fragments of DNA with custom criteria, for example: ‘switch this gene on in stem cells which will turn into red-blood-cells but not platelets.’

The model then predicts which combination of DNA letters (A, T, C, G) is needed for the gene expression patterns required in specific types of cells. Researchers can then chemically synthesise the roughly 250-letter DNA fragments and add them to a virus for delivery into cells.

As a proof-of-concept, the authors of the study asked the AI to design synthetic fragments which activate a gene coding for a fluorescent protein in some cells while leaving gene expression patterns unaltered in other cell types. They created the fragments from scratch and dropped them into mouse blood cells, where the sequences fused with the genome at random locations. The experiments worked exactly as predicted.

“The potential applications are vast. It’s like writing software but for biology, giving us new ways of giving instructions to a cell and guiding how they develop and behave with unprecedented accuracy,” says Dr. Robert Frömel, first author of the study who carried out the work at the Centre for Genomic Regulation (CRG) in Barcelona.

The study could lead to new ways for gene-therapy developers to boost or dampen the activity of genes only in the cells or tissues that need adjusting. It also paves the way for new strategies to fine-tune a patient’s genes, making treatments more effective and reducing side effects.

The work marks an important milestone in the field of generative biology. To date, advances in the field have largely benefited protein design, helping scientists create entirely new enzymes and antibodies faster than ever before. However, many human diseases stem from faulty gene expression that is cell-type specific, for which there might never be a perfect protein drug candidate.

Gene expression is controlled by regulatory elements like enhancers, tiny fragments of DNA which switch genes on or off. To fix faulty gene expression, researchers can comb through genomes looking for naturally existing enhancers that happen to suit their needs, limiting themselves to the sequences evolution has produced.

AI-generated enhancers can help engineer ultra-selective switches that nature has not yet invented. They can be designed to have exactly the on/off patterns required in specific types of cells, a level of fine-tuning which is crucial for creating therapies that avoid unintended effects in healthy cells.

However, the development of AI models requires lots of high-quality data, which has been historically lacking for enhancers. “To create a language model for biology, you have to understand the language cells speak. We set out to decipher these grammar rules for enhancers so that we can create entirely new words and sentences,” explains Dr. Lars Velten, corresponding author of the study and researcher at the Centre for Genomic Regulation (CRG).

The authors of the study created huge volumes of biological data to build their AI model by carrying out thousands of experiments with lab models of blood formation. They studied both enhancers and transcription factors, proteins also involved in controlling gene expression.

Until now, scientists studying enhancers and transcription factors typically used cancer cell lines because they are easier to work with. The researchers worked with healthy cells instead because they are more representative of human biology. Their work helped uncover subtle mechanisms that shape our immune system and blood cell production.

Over five years, the team synthesised more than 64,000 synthetic enhancers, each carefully designed to test different arrangements and strengths of binding sites for 38 different transcription factors. It’s the largest library of synthetic enhancers built in blood cells to date.

Once the enhancers were inserted into the cells, the team tracked exactly how active each one became across seven stages of blood-cell development. They discovered that while many enhancers activate genes in one type of cell, they repress genes in another.

Most enhancers worked like a volume dial, turning gene activity up or down. Surprisingly, certain combinations acted like on/off switches. The scientists call this “negative synergy,” meaning two factors that usually turn a gene on individually could effectively shut that gene down when they occur together.

The data from the experiments was crucial in setting out the design principles of the machine learning model. Once the model had enough measurements of how each synthetic enhancer changed gene activity in real cells, it could predict new designs that yield on/off outcomes, even if these enhancers had never existed in nature.

The study was designed to determine if a technology can work in practice before committing to larger-scale research. The researchers have only scratched the surface. Both humans and mice have an estimated 1,600 transcription factors regulating their genomes.

The work was carried out by Lars Velten, Robert Frömel, Julia Rühle, Aina Bernal Martínez, Chelsea Szu-Tu and Felix Pacheco Pastor, all members of Lars Velten’s research group at the Centre for Genomic Regulation. Rosa Martinez Corral from the Barcelona Collaboratorium, a joint initiative between the CRG and EMBL-Barcelona, also took part. The research was funded by an ERC Starting Grant from the European Union and a grant from the Spanish National Agency for Research.
