Alain Areeba Siddiqui / Data Science / Faculty Mentor: Masoud Rostami

Diabetes is a global health concern, affecting about 38.4 million people in the United States and more than 830 million people worldwide. Early detection and risk assessment are crucial for improving patient outcomes and reducing the burden on healthcare systems. Machine learning (ML) models can enhance diabetes prediction by identifying patterns in patient data that are not immediately evident through traditional diagnostic methods. This study uses ensemble learning algorithms to predict diabetes risk levels from a range of health indicators, including BMI, patient history, and socioeconomic factors. Using the CDC’s Diabetes Health Indicators dataset, multiple ML models were trained and combined to improve predictive accuracy. The final ensemble model achieved strong performance, with an accuracy of 0.97, a precision of 0.96, a recall of 0.97, and an average F1-score of 0.96, demonstrating its effectiveness in diabetes risk classification. The findings highlight the potential for integrating machine learning models into electronic health record platforms such as those from Epic Systems or Oracle Health, where they can support healthcare providers in early diagnosis and treatment planning. By surfacing key risk factors, such models can also contribute to more personalized and efficient diabetes management strategies, ultimately improving patient care.
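To illustrate how predictions from several models can be combined and then scored with the metrics reported above, here is a minimal sketch in plain Python. It rests on several assumptions: the abstract does not say which ensemble technique was used, so simple majority voting stands in for it; the labels are binary (1 = at risk, 0 = not at risk) rather than multi-level; and the toy labels and the names `model_a`, `model_b`, `model_c`, `majority_vote`, and `binary_metrics` are invented for illustration and are not taken from the CDC dataset or the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote."""
    combined = []
    # zip(*...) groups the predictions that all models made for one sample.
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: true labels and three models' predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 0, 1, 0, 1, 1, 0]
model_c = [0, 0, 1, 1, 1, 0, 1, 0]

ensemble = majority_vote([model_a, model_b, model_c])
metrics = binary_metrics(y_true, ensemble)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data each individual model makes two or three mistakes, while the vote makes only one, which is the basic motivation for combining models. For a multi-level risk target, the same per-class precision, recall, and F1 would typically be averaged across classes, which is presumably what the "average F1-score" above refers to.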
Diego Maldonado
Interesting and well-made project! Many people in my family suffer from diabetes, and I hope that one day models like these can be used to catch pre-diabetes before it even has a chance to develop into diabetes.