Machine Learning Tools for Long-Term Type 2 Diabetes Risk Prediction

Machine Learning Tools for Long-Term Type 2 Diabetes Risk Prediction

Abstractnbsp;

A steady rise has been observed in the percentage of elderly people who want and are still able to contribute to society. Therefore, early retirement or exit from the labour market, due to healthrelated issues, poses a significant problem. Nowadays, thanks to technological advances and various data from different populations, the risk factors investigation and health issues screening are moving towards automation. In the context of this work, a worker-centric, IoT enabled unobtrusive users health, wellbeing and functional ability monitoring framework, empowered with AI tools, is proposed. Diabetes is a high-prevalence chronic condition with harmful consequences for the quality of life and high mortality rate for people worldwide, in both developed and developing countries. Hence, its severe impact on humansrsquo; life, e.g., personal, social, working, can be considerably reduced if early detection is possible, but most research works in this field fail to provide a more personalized approach both in the modeling and prediction process. In this direction, our designed system concerns diabetes risk prediction in which specific components of the Knowledge Discovery in Database (KDD) process are applied, evaluated and incorporated. Specifically, dataset creation, features selection and classification, using different Supervised Machine Learning (ML) models are considered. The ensemble WeightedVotingLRRFs ML model is proposed to improve the prediction of diabetes, scoring an Area Under the ROC Curve (AUC) of 0:884. Concerning the weighted voting, the optimal weights are estimated by their corresponding Sensitivity and AUC of the ML model based on a bi-objective genetic algorithm. Also, a comparative study is presented among the Finnish Diabetes Risk Score (FINDRISC) and Leicester risk score systems and several ML models, using inductive and transductive learning. The experiments were conducted using data extracted from the English Longitudinal Study of Ageing (ELSA) database.