A new cryptography framework for secure genomic studies
Overview of secure federated genome-wide association studies (SF-GWAS). Credit: Nature Genetics (2025). DOI: 10.1038/s41588-025-02109-1

Advances in generative artificial intelligence and machine learning, trained on large-scale datasets across multiple institutions, have the potential to revolutionize medicine. However, data is hard to gather. It is siloed in individual hospitals, medical practices, and clinics around the world. Privacy risks stemming from disclosing medical data are also a serious concern, so existing data-sharing regulations have largely limited the scope of data collaborations for medical research.

Cryptographic tools for secure computation do exist, but they are either impractical or don’t implement current state-of-the-art methods. Now, an approach developed by EPFL has been demonstrated successfully at scale and is being rolled out across Europe.

Secure federated genome-wide association studies, or SF-GWAS, combines secure computation frameworks with distributed algorithms to enable efficient and accurate studies on private data held by multiple entities while keeping that data confidential. A study on five datasets, including a UK Biobank cohort of 410,000 individuals, showed an order-of-magnitude improvement in runtime compared to previous methods.

The research is published in the journal Nature Genetics.

“In many cases, it’s not possible to centralize data for practical or legal reasons or just because people aren’t willing to share it. So, the goal is to extract information without sharing the data,” said Jean-Pierre Hubaux, the Academic Director at EPFL’s Center for Digital Trust (C4DT), affiliated with the School of Computer and Communication Sciences.

“We developed a prototype several years ago but what was missing was the demonstration that it works at scale with real-world-size datasets. This has now been done in collaboration with MIT and Yale with our latest research showing that it is possible to extract information from datasets that remain geographically distributed, with no significant precision loss in terms of results. This opens a new era in terms of data collaborations,” he continued.

SF-GWAS combines two key concepts. First, it takes a federated approach to secure computation, meaning that each dataset is kept at the respective source site. This minimizes computational costs by avoiding large data transfers between sites and allows the use of efficient cryptographic operations that protect the partial computational output generated at each site.
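To make the federated idea concrete, here is a minimal Python sketch in which each site computes a partial result locally and only a protected version of it leaves the site. It uses additive masking as a simple stand-in; it is not the cryptography used by SF-GWAS, which the article does not describe in detail. The names `MODULUS`, `pairwise_masks`, `local_allele_count` and `federated_total` are hypothetical, and a single process simulates all sites.

```python
import secrets

# Hypothetical modulus for the masking arithmetic (an assumption of this sketch).
MODULUS = 2**61 - 1


def pairwise_masks(num_sites):
    """Return one mask per site; together the masks sum to zero modulo MODULUS.

    In a real deployment each pair of sites would agree on its random value
    privately. Here one process generates them all for illustration only.
    """
    masks = [0] * num_sites
    for i in range(num_sites):
        for j in range(i + 1, num_sites):
            r = secrets.randbelow(MODULUS)  # value known only to sites i and j
            masks[i] = (masks[i] + r) % MODULUS
            masks[j] = (masks[j] - r) % MODULUS
    return masks


def local_allele_count(genotypes):
    """Partial result computed at one site: minor-allele count for one variant."""
    return sum(genotypes)


def federated_total(site_genotypes):
    """Add up masked partial counts; the masks cancel, revealing only the total."""
    masks = pairwise_masks(len(site_genotypes))
    masked = [
        (local_allele_count(genotypes) + mask) % MODULUS
        for genotypes, mask in zip(site_genotypes, masks)
    ]
    return sum(masked) % MODULUS


# Toy genotype vectors (0, 1 or 2 copies of the minor allele) at three sites.
sites = [[0, 1, 2], [1, 1], [2, 0, 0, 1]]
total = federated_total(sites)
```

The raw genotype lists never leave `local_allele_count`; whoever adds up the masked values learns the combined count but not any single site's contribution.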

Second, it introduces an efficient algorithmic design to support the federated execution of various end-to-end GWAS pipelines.
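As a rough illustration of what federated execution of an analysis step can look like, the Python sketch below computes a single-variant regression slope from per-site summary statistics instead of pooled records. This is a simplified stand-in, not the SF-GWAS algorithm: a real GWAS pipeline has many more steps, and in a secure setting the per-site summaries would themselves be protected before being combined. The function names `local_stats`, `combine` and `slope` are hypothetical.

```python
def local_stats(genotypes, phenotypes):
    """Per-site sufficient statistics for a simple linear regression."""
    n = len(genotypes)
    sx = sum(genotypes)
    sy = sum(phenotypes)
    sxx = sum(g * g for g in genotypes)
    sxy = sum(g * p for g, p in zip(genotypes, phenotypes))
    return (n, sx, sy, sxx, sxy)


def combine(stats_list):
    """Add the statistics from all sites component by component."""
    return tuple(sum(column) for column in zip(*stats_list))


def slope(stats):
    """Least-squares slope of phenotype on genotype from the summed statistics."""
    n, sx, sy, sxx, sxy = stats
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)


# Toy data at two sites (genotype dosage, phenotype value).
site_a = ([0, 1, 2], [1.0, 2.0, 3.0])
site_b = ([0, 2, 1, 1], [1.0, 3.0, 2.0, 2.0])

federated = slope(combine([local_stats(*site_a), local_stats(*site_b)]))
pooled = slope(local_stats(site_a[0] + site_b[0], site_a[1] + site_b[1]))
```

Because the statistics are plain sums, `federated` equals `pooled` here: the estimate from combined summaries matches the one from centralized data, which is the property the article describes as "no significant precision loss".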

“It sounds counterintuitive, but our approach shares data without sharing,” explained Hubaux. “It leverages the existence of the datasets without having to transfer it and is essentially an additional value to the data, an additional motivation to work together without losing control.”

SF-GWAS has already been installed in Switzerland’s five university hospitals. It is currently being rolled out in several Italian hospitals and for European cancer networks by Tune Insight, the EPFL spin-off leading this work. The company is also in talks with medical institutions in other countries.

In addition to unlocking medical research at scale to define and optimize public health care policy, which is just not possible in a world of silos, Hubaux believes that SF-GWAS will have a valuable side benefit. Currently, datasets are de facto distributed worldwide, sitting on hard disks and tapes here and there, because it has traditionally been so complicated to transfer data. The recording of medical data is also applied differently in different places. Hubaux calls this “prehistoric” and says that as a result, datasets are very underutilized.

“We are setting up a value system to make sure that future data is going to be interoperable, that it is recorded in the same way place to place, otherwise it will be junk in, junk out. It’s costly and the transition will take time but we have developed the tools to facilitate it and there is an evolution underway,” Hubaux said.

“The willingness to work at scale is a change of culture and, hopefully, this is a virtuous circle: people feel encouraged to be more rigorous in terms of the way they store and structure their data in order to guarantee interoperability because if they don’t, their institution may be excluded from the rest of the community. This is really a side benefit—better overall quality of health and medical data.”

More information:
Hyunghoon Cho et al, Secure and federated genome-wide association studies for biobank-scale datasets, Nature Genetics (2025). DOI: 10.1038/s41588-025-02109-1

Provided by
Ecole Polytechnique Federale de Lausanne


Citation:
A new cryptography framework for secure genomic studies (2025, March 3)
retrieved 3 March 2025
from https://medicalxpress.com/news/2025-03-cryptography-framework-genomic.html
