
Systemic sclerosis (SSc) is a severe autoimmune disease with complex genetic causes. Some genetic contributors have been identified, but others remain unknown, which has impeded development of targeted treatments. In a new study published in Annals of the Rheumatic Diseases, researchers at Baylor College of Medicine and collaborating institutions used complementary approaches that integrate exome sequencing and evolutionary action machine learning to identify protein changes and their associated mechanisms in SSc.
Previous genome-wide association studies (GWAS) that analyzed the frequency of common genetic variants have shown that the strongest genetic contributors are located in the human leukocyte antigen (HLA) region on chromosome 6. In this study, researchers led by first author Dr. Shamika Ketkar performed a GWAS using exome sequencing data from 2,559 SSc patients and 893 healthy controls in the Scleroderma Family Registry and DNA Repository at the University of Texas Health Science Center at Houston. They aimed to find novel genes and rare variants contributing to SSc risk.
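For readers unfamiliar with how a case-control association is scored, the comparison for a single variant can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the cohort sizes (2,559 cases, 893 controls) come from the article, but the carrier counts are hypothetical, and the function names `odds_ratio` and `fisher_exact_two_sided` are assumptions introduced here. A real analysis would also adjust for ancestry and correct for testing many variants.

```python
from math import comb


def odds_ratio(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """Odds of carrying the variant in cases relative to controls."""
    a, b = case_carriers, case_total - case_carriers
    c, d = ctrl_carriers, ctrl_total - ctrl_carriers
    # Haldane-Anscombe correction keeps the ratio finite if any cell is zero.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


def fisher_exact_two_sided(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """Two-sided Fisher's exact test p-value for a 2x2 carrier table."""
    n = case_total + ctrl_total
    k = case_carriers + ctrl_carriers  # carriers across both groups
    denom = comb(n, case_total)

    def prob(x):
        # Hypergeometric probability of seeing x carriers among the cases
        # if carrier status were unrelated to disease status.
        return comb(k, x) * comb(n - k, case_total - x) / denom

    lowest = max(0, case_total - (n - k))
    highest = min(k, case_total)
    observed = prob(case_carriers)
    # Sum every table that is at least as unlikely as the observed one.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Cohort sizes from the article; carrier counts are HYPOTHETICAL.
CASES, CONTROLS = 2559, 893
hypothetical_case_carriers, hypothetical_ctrl_carriers = 120, 20

print("odds ratio:", round(odds_ratio(
    hypothetical_case_carriers, CASES, hypothetical_ctrl_carriers, CONTROLS), 3))
print("p-value:", fisher_exact_two_sided(
    hypothetical_case_carriers, CASES, hypothetical_ctrl_carriers, CONTROLS))
```

An odds ratio above 1 means the variant is more common among patients than controls, and the p-value indicates how surprising that imbalance would be by chance alone.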
“What truly surprised and excited us was the discovery and replication of MICB, a gene located within the HLA region but acting independently of the classical HLA genes. MICB had not previously been implicated in systemic sclerosis, and its identification represents a novel genetic contributor and a potential therapeutic target,” said Ketkar, assistant professor of molecular and human genetics at Baylor.
Collaborators in Spain replicated the results using previously published European GWAS data comprising nearly 10,000 cases, further strengthening their significance. At Baylor, Dr. Olivier Lichtarge’s lab used its evolutionary action-machine learning (EAML) framework to analyze the exome sequencing data and prioritize genes with high-impact variants predictive of SSc.
The results once again pointed to MICB, as well as to other genes on chromosome 6, such as NOTCH4, and to rare missense variants in genes enriched in interferon signaling (a key pathway in the immune system), including IFI44L and IFIT5.
“With our machine learning framework, we are not only identifying whether a variant occurs frequently, but also, using evolutionary data across all species, we are weighing the likelihood the variant is functionally disruptive to the protein and eventually to the patient,” said Lichtarge, Cullen Chair and professor of molecular and human genetics, biochemistry and molecular biology and pharmacology.
“We previously used this method in diseases with much larger genome data sets, like Alzheimer’s disease and heart disease, and in this study, we show that it can be effective in complex diseases with a smaller patient data set.”
To understand the functional impact of the genetic variants identified in the study, researchers integrated publicly available single-cell RNA sequencing data from SSc skin biopsies to resolve cell type-specific expression patterns of risk genes. They also performed expression quantitative trait locus (eQTL) analysis using whole blood datasets to establish regulatory links between disease-associated variants and transcriptomic changes.
MICB and NOTCH4 were found to be expressed in fibroblasts and endothelial cells, two cell types that play central roles in fibrosis and vasculopathy, key clinical features of SSc. These complementary analyses confirmed functional regulatory effects of identified risk genes.
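The idea behind an eQTL test can be sketched as a simple regression: for each person, the number of copies of a variant allele (0, 1 or 2) is compared with how strongly a nearby gene is expressed. The example below is a minimal sketch under stated assumptions, not the analysis used in the study. The function name `eqtl_fit` and all data values are hypothetical, and real eQTL pipelines also account for covariates such as ancestry, sex and technical batch effects.

```python
def eqtl_fit(dosages, expression):
    """Least-squares fit of expression on allele dosage (0, 1 or 2).

    Returns (slope, intercept, r_squared). The slope is the estimated
    change in expression per additional copy of the variant allele.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched dosage/expression values for 3+ samples")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("variant does not vary in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    syy = sum((e - mean_e) ** 2 for e in expression)

    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# HYPOTHETICAL data: six people, their allele dosage and a normalized
# expression value for one gene.
hypothetical_dosages = [0, 0, 1, 1, 2, 2]
hypothetical_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept, r_squared = eqtl_fit(hypothetical_dosages, hypothetical_expression)
print("expression change per allele copy:", round(slope, 3))
print("variance explained:", round(r_squared, 3))
```

A slope that is reliably different from zero suggests the variant is linked to how much of the gene is produced, which is the kind of regulatory connection an eQTL analysis looks for.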
“To solve complex diseases like SSc, we need to combine different approaches and machine learning to the analysis of large DNA, RNA and protein data sets to discover otherwise hidden targets for treatment,” said corresponding author Dr. Brendan Lee, professor, chair and Robert and Janice McNair Endowed Chair of molecular and human genetics at Baylor.
More information:
Shamika Ketkar et al, Integrative exome sequencing and machine learning identify MICB and interferon pathway genes as contributors to SSc risk, Annals of the Rheumatic Diseases (2025). DOI: 10.1016/j.ard.2025.05.009
Citation:
Integrative exome sequencing and machine learning identify new genes contributing to systemic sclerosis risk (2025, June 16), retrieved 16 June 2025 from https://medicalxpress.com/news/2025-06-exome-sequencing-machine-genes-contributing.html