Personalising Type 2 Diabetes Screening with Genetics and Machine Learning

Abstract: Early detection of type 2 diabetes remains a major public health challenge, because many individuals develop the disease years before symptoms appear. In this study, researchers developed a two-stage machine learning framework that combines genetic risk (polygenic risk scores) with routine clinical data to identify people at higher risk of developing diabetes within five years. The model was trained on data from the diverse All of Us Research Program and externally validated in the UK Biobank, to test whether its performance holds across different populations. By integrating genetics with traditional risk factors, the approach aims to move beyond one-size-fits-all screening tools.

What makes this framework practical is its alignment with real clinical workflows. The first stage identifies at-risk individuals without requiring immediate blood tests, while the second stage refines the risk estimate using standard glycaemic measures such as HbA1c or fasting glucose. The model consistently outperformed existing screening tools and showed stable performance across sex and ancestry groups. These findings highlight how combining genomics and machine learning can support earlier, more personalised diabetes prevention strategies at population scale.
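
To make the two-stage workflow concrete, here is a minimal Python sketch of how such a screening flow could be wired together. It is not the study's model: the features, the logistic form, every weight, and the 10% referral threshold are hypothetical values chosen purely for illustration, and the names `Person`, `stage1_risk`, `stage2_risk`, and `screen` are invented for this example. The only thing it is meant to show is the sequencing, where a blood test is requested only for people flagged by the first, test-free stage.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


def logistic(z: float) -> float:
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Person:
    age: float                      # years
    bmi: float                      # kg/m^2
    prs_z: float                    # polygenic risk score, standardised (assumed input)
    hba1c: Optional[float] = None   # percent; None until a blood test has been done


def stage1_risk(p: Person) -> float:
    """Stage 1: risk from genetics and routine data only, no blood test.

    All weights are hypothetical and NOT taken from the study.
    """
    z = -6.0 + 0.04 * p.age + 0.08 * p.bmi + 0.5 * p.prs_z
    return logistic(z)


def stage2_risk(p: Person) -> float:
    """Stage 2: refine the estimate once a glycaemic measure is available.

    All weights are hypothetical and NOT taken from the study.
    """
    if p.hba1c is None:
        raise ValueError("stage 2 needs an HbA1c value")
    z = -14.0 + 0.03 * p.age + 0.06 * p.bmi + 0.4 * p.prs_z + 1.5 * p.hba1c
    return logistic(z)


def screen(p: Person, stage1_threshold: float = 0.10) -> Tuple[str, float]:
    """Return a (recommendation, estimated risk) pair.

    The threshold is an illustrative assumption, not a clinical cut-off.
    """
    r1 = stage1_risk(p)
    if r1 < stage1_threshold:
        return ("routine follow-up", r1)
    if p.hba1c is None:
        return ("order HbA1c or fasting glucose", r1)
    return ("refined risk", stage2_risk(p))


if __name__ == "__main__":
    low = Person(age=30, bmi=22, prs_z=-1.0)
    flagged = Person(age=60, bmi=32, prs_z=1.5)
    tested = Person(age=60, bmi=32, prs_z=1.5, hba1c=6.0)
    for person in (low, flagged, tested):
        print(screen(person))
```

In a real deployment both stages would be fitted models with validated thresholds; the sketch only mirrors the order of operations described above.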

Read the full study here: https://pmc.ncbi.nlm.nih.gov/articles/PMC12853511/

Figure 1