
For the first time, large-scale DNA sequence data on three U.K. long-term birth cohorts have been released, creating a unique resource to explore the relationship between genetic and environmental factors in child health and development.
The first resource containing high-resolution DNA sequencing data for more than 37,000 children and parents collected over multiple decades from across the U.K. is now available to researchers worldwide.
The data release is led by the Wellcome Sanger Institute, the Children of the 90s study (also known as ALSPAC), the Millennium Cohort Study (MCS), and Born in Bradford (BiB), and supported by the Medical Research Council (MRC) and the Economic and Social Research Council (ESRC).
This work is supported by the ongoing efforts of Population Research UK, a U.K.-wide initiative led by teams at the University of Bristol and University College London, which aids longitudinal population studies by working to coordinate and connect the current research landscape.
Now available on the European Genome-phenome Archive (EGA), these high-quality genomic data can be used in combination with the existing longitudinal health and survey information provided by participating families. These combined data resources offer the scientific community the opportunity to gain valuable insights in areas ranging from population genetics to the social sciences.
For example, they could be used to investigate the impact of genetic variation on neurodevelopmental conditions or childhood obesity, and how these are influenced by environmental factors.
Longitudinal research follows large numbers of participants over multiple years, repeatedly examining them at regular time points through, for example, blood tests, body measurements, and health questionnaires, to detect changes over time.
Previously, large DNA sequence datasets have typically focused on children with rare conditions or adult population cohorts. This new data release focuses on sequencing “birth cohorts,” which are population-based cohorts of people followed from birth through to adolescence or early adulthood.
To produce this latest data release, researchers at the Sanger Institute sequenced all 20,000 genes in the human genome, an approach known as exome sequencing, in samples from 8,436 children and 3,215 parents from the Children of the 90s study, 7,667 children and 6,925 parents from the MCS, and 8,784 children and 2,875 parents from BiB.
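As a quick check on how these per-cohort figures relate to the headline "more than 37,000" participants, the counts can be tallied in a few lines. This is only an illustrative sketch: the dictionary layout and variable names are assumptions made for the example, and the numbers are copied from the paragraph above rather than drawn from the dataset itself.

```python
# Sample counts as reported in the article; the structure and names here
# are illustrative only and do not reflect the dataset's actual layout.
cohort_counts = {
    "Children of the 90s (ALSPAC)": {"children": 8436, "parents": 3215},
    "Millennium Cohort Study (MCS)": {"children": 7667, "parents": 6925},
    "Born in Bradford (BiB)": {"children": 8784, "parents": 2875},
}

# Sum children and parents separately, then combine for the overall total.
total_children = sum(c["children"] for c in cohort_counts.values())
total_parents = sum(c["parents"] for c in cohort_counts.values())
total_samples = total_children + total_parents

for name, c in cohort_counts.items():
    print(f"{name}: {c['children'] + c['parents']:,} samples")
print(f"Children: {total_children:,}; parents: {total_parents:,}; total: {total_samples:,}")
```

Summed this way, the reported figures give 24,887 children and 13,015 parents, or 37,902 samples in all, which is consistent with the "more than 37,000" stated earlier.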
These three U.K. longitudinal birth cohort studies are internationally recognized, and data from these cohorts have already been used to study the contribution of common genetic variants to phenotypes ranging from childhood obesity to parental nurturing behaviors and anxiety and depression.
For example, using Children of the 90s data, researchers found that a genetic variant in a gene called MC4R is associated with increased weight across childhood. Studies like this could help design effective weight management interventions and change the way society views obesity.
That specific study used targeted DNA sequencing of the MC4R gene, whereas the new exome sequencing data reported here will allow similar investigations of other genes in the human genome. This will help drive more discoveries and research that could benefit human health.
The team has made the anonymized data as accessible as possible to approved researchers, including drafting a data note and other materials to help support its use by those who are less familiar with large-scale sequencing data.
In the coming months, this DNA sequence data resource will be expanded to encompass all participants in these cohorts as well as additional cohorts. The value of these data will be enhanced by harmonizing them across the different cohorts, providing a more powerful resource than could be achieved by one study in isolation.
“Longitudinal population studies from the U.K. have already had a huge impact on biomedical research worldwide. This significant addition of whole exome sequencing data will further transform our understanding of the development of complex traits and diseases across the life course,” says Dr. Carl Anderson.
“The U.K.’s cohorts and longitudinal population studies are an extraordinary national asset, made possible by the participation of a diverse range of people. The rich data and samples from these studies, when combined with whole exome sequencing, can unlock new research questions and insights into human society, development, health and aging.
“MRC’s funding is part of our overall investment in understanding the drivers of disease to enable precision prevention and personalized treatments, and maximizing existing infrastructure to ensure real value for money. This work aligns perfectly with a new exciting national resource that is supported by MRC and ESRC, Population Research UK, which is all about coordinating and leveraging U.K. cohorts,” says Dr. Richard Evans.
“The success of this initiative shows that coordination across cohort studies can be incredibly powerful and I’m excited to see the research that will come out of this fantastic new genetic data resource. We hope that this encourages other researchers to conduct long-term research studies, and solidifies their importance in U.K. and global research,” says Professor Nicholas Timpson.
The data are available to approved researchers worldwide, via the European Genome-phenome Archive (EGA).
The corresponding data note is available on Wellcome Open Research: wellcomeopenresearch.org/articles/9-390/v2
Citation: Largest ever DNA sequencing dataset on UK child development studies made available (2025, March 4), retrieved 4 March 2025 from https://medicalxpress.com/news/2025-03-largest-dna-sequencing-dataset-uk.html