AI-driven model predicts type 1 diabetes risk with greater accuracy
by Dr. Sanchari Sinha Dutta, Ph.D. · News-Medical
By combining large-scale genetics with machine learning, researchers uncover hidden risk patterns and distinct patient subtypes that could transform how type 1 diabetes is identified and understood.
Researchers combined genetic association analysis with machine learning to classify and estimate genetic risk for type 1 diabetes. The study is published in Nature Genetics.
Genetic and immune factors drive complex type 1 diabetes risk
Type 1 diabetes is a chronic metabolic disease characterized by destruction of pancreatic beta cells, leading to a lack of insulin production and resulting in hyperglycemia (high blood sugar). Evidence suggests that the disease develops in genetically susceptible individuals upon exposure to environmental triggers.
The disease typically appears in childhood and adolescence; however, adults are also susceptible. Autoantibodies that specifically target insulin-secreting pancreatic cells are often used as biomarkers to predict the clinical onset of type 1 diabetes. However, these autoantibodies are transient and are found less frequently in adult-onset cases, which limits timely prediction of the disease.
To improve risk prediction, research has focused on genetic factors that can identify susceptible individuals. Genetic variants in class I and II Major Histocompatibility Complex (MHC) genes are the largest risk factors for type 1 diabetes. Inheriting these variants together can increase disease risk 16-fold.
Genetic risk scores have been developed and widely used for early prediction of type 1 diabetes risk, which is vital for preventing complications such as diabetic ketoacidosis at diagnosis. In this study, researchers at the University of California and Broad Institute conducted genetic association analysis and used a machine learning model, T1GRS, to improve on the gold-standard genetic risk score for type 1 diabetes.
Machine learning model improves genetic classification of type 1 diabetes
The researchers found that the machine learning model T1GRS improves classification accuracy, with higher area-under-the-curve (AUC) values across multiple cohorts. The improvement was most pronounced among individuals without high-risk HLA (human MHC) haplotypes and those with more complex genome-wide risk profiles, in both Europeans and African Americans.
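To make the AUC metric concrete, here is a minimal Python sketch of one standard way to compute it: the fraction of case-control pairs in which the case receives the higher score. The function name `auc` and the scores are made up for illustration; they are not data or code from the study.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up risk scores, for illustration only (not data from the study).
cases = [0.5, 0.7, 0.75, 0.8, 0.9]
controls = [0.1, 0.2, 0.35, 0.6]

print(f"AUC = {auc(cases, controls):.2f}")
```

An AUC of 0.5 would mean the score ranks cases no better than chance, while 1.0 would mean every case outscores every control.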
The model showed 89% sensitivity and 84% specificity for type 1 diabetes at an optimal threshold in the discovery dataset, indicating that it distinguishes individuals with diabetes from those without well.
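As a rough illustration of how sensitivity and specificity depend on the chosen threshold, the Python sketch below computes both for a toy set of scores and picks the cutoff that maximizes their sum (Youden's J statistic). The article does not say how the study defined its "optimal threshold", so the Youden criterion, the function names, and the data here are all assumptions.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity), calling score >= threshold positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(scores, labels):
    """Pick the observed score that maximizes sensitivity + specificity.

    Assumption: Youden's J is used here only as one common definition
    of an 'optimal' threshold; the study may have used another.
    """
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: sum(sensitivity_specificity(scores, labels, t)))


# Made-up scores and labels (1 = type 1 diabetes, 0 = control).
scores = [0.1, 0.2, 0.35, 0.6, 0.5, 0.7, 0.75, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

best = youden_threshold(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, best)
print(f"threshold={best}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is why a single pair of figures such as 89% and 84% always refers to one specific cutoff.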
The researchers identified risk variants at 79 known loci and 8 loci not previously associated with type 1 diabetes. Their MHC-specific and genome-wide association analyses also identified several novel type 1 diabetes-related variants that influence immune function and gene activation.
A total of 199 identified risk variants were used to train the machine learning model, including lead variants at 102 non-MHC regions. Using these variants from across the genome and within the MHC region, the model generated a T1GRS score to identify individuals at risk of type 1 diabetes. A key advantage of the model is its ability to capture nonlinear interactions between genetic variants; it identified numerous interactions between MHC and non-MHC loci that contribute to disease risk.
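The difference between a purely additive genetic risk score and a score that also captures interactions can be sketched in a few lines of Python. This is a toy illustration, not the T1GRS model, whose architecture the article does not describe: the variant names, weights, and the simple product term used for the interaction are all hypothetical.

```python
def additive_score(dosages, weights):
    """Classic additive score: weighted sum of risk-allele counts (0, 1 or 2)."""
    return sum(weights[variant] * dosages[variant] for variant in weights)


def interaction_score(dosages, weights, pair_weights):
    """Additive score plus pairwise terms that only matter when both variants are present."""
    score = additive_score(dosages, weights)
    for (a, b), weight in pair_weights.items():
        score += weight * dosages[a] * dosages[b]
    return score


# Hypothetical variants and weights, for illustration only.
weights = {"mhc_variant": 1.5, "non_mhc_variant": 0.25}
pair_weights = {("mhc_variant", "non_mhc_variant"): 0.5}

person = {"mhc_variant": 1, "non_mhc_variant": 2}

print("additive:", additive_score(person, weights))
print("with interaction:", interaction_score(person, weights, pair_weights))
```

In the additive score each variant contributes the same amount regardless of what else a person carries. The extra pairwise term lets the effect of one variant depend on another, which is the kind of pattern an additive score cannot represent.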
Analysis of the genetic factors that robustly influenced each person's T1GRS score led to the categorization of individuals with type 1 diabetes into four subtypes: T cell-enriched, MHC-enriched, pancreas-enriched, and MHC-driven. The analysis also revealed that individuals with well-known high-risk genetic variants for type 1 diabetes are more likely to develop the disease in childhood (early onset).
T1GRS advances genetic screening across diverse populations
These features make T1GRS a potentially improved clinical screening tool compared to previous genetic risk scores, which most accurately predict type 1 diabetes risk in higher-risk individuals with enriched family history and early age of onset.
Since both genetic and environmental factors can influence the complex pathophysiology of type 1 diabetes, there remain inherent limitations to the predictive ability of genetic data. Machine learning models that combine genetic data with molecular signals influenced by environmental triggers can further improve disease risk prediction when genetic data alone cannot fully capture disease risk.
Journal reference:
- McGrail C. (2026). Genetic association and machine learning improve the prediction of type 1 diabetes risk. Nature Genetics. DOI: https://doi.org/10.1038/s41588-026-02578-y. https://www.nature.com/articles/s41588-026-02578-y