Machine-learning model uses patient records to predict sleep disorders
A machine learning model can effectively predict a patient’s risk for a sleep disorder using demographic and lifestyle data, physical exam results and laboratory values, according to a new study published this week in the open-access journal PLOS ONE by Samuel Y. Huang of Virginia Commonwealth University School of Medicine, and Alexander A. Huang of Northwestern University Feinberg School of Medicine.
The prevalence of diagnosed sleep disorders among American patients has significantly increased over the past decade. Understanding and reversing this trend matters because sleep disorders are a significant risk factor for diabetes, heart disease, obesity, and depression.
In the new work, the researchers used the machine-learning model XGBoost to analyze publicly available data on 7,929 patients in the US who completed the National Health and Nutrition Examination Survey. The data contained 684 variables for each patient, including demographic, dietary, exercise and mental health questionnaire responses, as well as laboratory and physical exam information.
Overall, 2,302 patients in the study had a physician diagnosis of a sleep disorder. XGBoost could predict the risk of a sleep disorder diagnosis with strong accuracy (AUROC=0.87, sensitivity=0.74, specificity=0.77), using 64 of the variables included in the full dataset. The strongest predictors of a sleep disorder, based on the machine-learning model, were depression, weight, age and waist circumference.
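For readers unfamiliar with the reported metrics, the short Python sketch below shows how sensitivity, specificity and AUROC are typically computed from a model's risk scores. It is an illustration only, not the authors' code: the labels, scores and the 0.5 cutoff are made-up toy values, and the function names are hypothetical. Only the patient counts used for the prevalence line come from the study as described above.

```python
# Illustrative sketch only: toy labels and scores, not data from the study.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of diagnosed patients the model flags.
    # Specificity: share of undiagnosed patients the model clears.
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count pairwise wins; a tie counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Share of the study cohort with a physician-diagnosed sleep disorder.
prevalence = 2302 / 7929

# Hypothetical toy example: 1 = diagnosed, 0 = not diagnosed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]
threshold = 0.5  # assumed cutoff for turning a risk score into a yes/no flag
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"prevalence={prevalence:.3f} sensitivity={sens:.2f} "
      f"specificity={spec:.2f} AUROC={auc:.2f}")
```

Sensitivity and specificity depend on the chosen cutoff, while AUROC summarizes ranking quality across all possible cutoffs, which is why studies like this one often report all three.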
The authors conclude that machine learning methods may be effective first steps in screening patients for sleep disorder risk without relying on physician judgment or bias.
Samuel Y. Huang said, “What sets this study on the risk factors for insomnia apart from others is seeing not only that depressive symptoms, age, caffeine use, history of congestive heart failure, chest pain, coronary artery disease, liver disease, and 57 other variables are associated with insomnia, but also visualizing the contribution of each in a very predictive model.”