Smartphone Data May Not Reliably Predict Depression Risk in Diverse Groups

Research Highlight

Smartwatches, smartphones, and other wearable devices are transforming how we track our physical health and behavior. Researchers are also exploring whether these devices might provide insights into our mental health, with the goal of developing AI tools that can help identify when people need mental health support or professional care. However, research supported by the National Institute of Mental Health suggests that AI tools built on smartphone data may struggle to accurately predict clinical outcomes like depression in large and diverse groups of people.

What did the researchers do?

Lead author Daniel Adler of Cornell University and colleagues from Northwestern University Feinberg School of Medicine, Weill Cornell Medicine, and Michigan Medicine analyzed behavioral data from 650 people, collected via their smartphones. While the study was larger and more diverse than previous studies, participants were primarily female, White, middle to high income, and between 25 and 54 years old.

The smartphone data included behavioral measures related to mobility, phone usage, and sleep. Participants also completed the PHQ-8, a standard self-report measure of depression symptoms.

Drawing from recent studies, the researchers developed AI models that analyzed the smartphone data to produce a depression risk score for each participant, indicating the likelihood of clinically significant depression. The researchers then assessed the reliability of the models by identifying age, race, sex, and socioeconomic subgroups for whom the model predictions were less accurate.
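To make the idea of a subgroup reliability check concrete, here is a minimal Python sketch. It is not the researchers' method: the function name `subgroup_report`, the 0.5 decision threshold, and the toy records are all assumptions made up for illustration. It compares, for each subgroup, how accurate the predictions are and how the average predicted risk lines up with the observed rate of depression.

```python
from collections import defaultdict


def subgroup_report(records, threshold=0.5):
    """Summarize predictions per subgroup.

    records: iterable of (subgroup, risk_score, has_depression) tuples.
    threshold: hypothetical cutoff for calling a risk score "high risk".
    """
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "score_sum": 0.0, "positives": 0})
    for subgroup, score, label in records:
        s = stats[subgroup]
        s["n"] += 1
        # A prediction is correct when "high risk" matches the actual label.
        s["correct"] += int((score >= threshold) == label)
        s["score_sum"] += score
        s["positives"] += int(label)
    return {
        group: {
            "accuracy": s["correct"] / s["n"],
            "mean_risk": s["score_sum"] / s["n"],
            "observed_rate": s["positives"] / s["n"],
        }
        for group, s in stats.items()
    }


# Invented toy data, not taken from the study.
records = [
    ("younger", 0.2, False), ("younger", 0.3, True),
    ("younger", 0.1, False), ("younger", 0.4, True),
    ("older", 0.7, False), ("older", 0.8, True),
    ("older", 0.6, False), ("older", 0.9, True),
]

report = subgroup_report(records)
for group, summary in report.items():
    print(group, summary)
```

In this toy data both subgroups have the same observed rate of depression, yet the average predicted risk is much higher for one of them. A gap like that is one simple signal that a model may be skewed toward a subgroup.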

What did the researchers find?

Overall, the best-performing AI model proved to be only moderately accurate in predicting who had clinically significant depression (as measured by the PHQ-8). While the model identified some patterns, it consistently underperformed for specific groups of people. For instance, the researchers found that the model was skewed toward identifying people as having a higher risk of depression if they were older, female, Black or African American, low income, unemployed, or on disability. On the other hand, the model was skewed toward identifying people as having a lower risk of depression if they were younger, male, White, high income, insured, or employed.

To better understand these results, the researchers examined how the AI model associated different behaviors with depression risk.

For example, the AI model predicted that higher phone usage in the morning was generally associated with lower depression risk. However, when the researchers looked at the data, they found this association did not hold across all age subgroups. While higher morning phone usage was linked with lower depression risk for young adults (ages 18 to 25 years), it was associated with higher risk for older adults (ages 65 to 74 years).

The AI tool also predicted that measures of increased mobility, as captured by GPS, were generally associated with lower depression risk. However, the underlying data showed these associations did not hold across all income-related subgroups. For people who came from low-income households, who were on disability, and who were uninsured, greater mobility was associated with higher depression risk.
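The subgroup reversals described above can be illustrated with a small Python sketch. The numbers are invented for illustration and do not come from the study, and the `pearson` helper is a plain textbook correlation, not the analysis the researchers ran. The sketch shows how an association between a behavior (here, hypothetical minutes of morning phone usage) and a symptom score can point in opposite directions within two subgroups and disappear when the groups are pooled.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: (morning phone usage in minutes, symptom score) per subgroup.
data = {
    "ages 18 to 25": ([10, 20, 30, 40], [12, 9, 7, 4]),
    "ages 65 to 74": ([10, 20, 30, 40], [3, 6, 8, 11]),
}

# Correlation within each subgroup.
by_group = {group: pearson(usage, score) for group, (usage, score) in data.items()}

# Correlation when everyone is pooled into one population.
pooled_usage = [u for usage, _ in data.values() for u in usage]
pooled_score = [s for _, score in data.values() for s in score]
pooled = pearson(pooled_usage, pooled_score)

print(by_group)
print(pooled)
```

In this toy example the correlation is strongly negative in one subgroup, strongly positive in the other, and essentially zero in the pooled data, so a single model fit to everyone would miss both patterns.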

What do the findings mean?

The findings highlight the challenges of using AI models built on smartphone data to predict mental health outcomes across a large, diverse group of people. When associations between people’s behavioral patterns and their mental health outcomes vary across demographic groups, AI models may be more likely to make incorrect predictions for some of those groups, leading to skewed results.

According to the researchers, the results underscore the importance of developing AI tools using data from people whose behavioral patterns are similar to those of the intended population. One way to increase the effectiveness of AI models may be to develop predictive models that are focused on smaller, more targeted populations.

The researchers note that their study focused on associations between behaviors and depression risk across individuals. It is possible that personalized models—models built on behavioral data from one person over time—may be able to predict individual depression risk more accurately. 

Reference

Adler, D. A., Stamatis, C. A., Meyerhoff, J., Mohr, D. C., Wang, F., Aranovich, G. J., Sen, S., & Choudhury, T. (2024). Measuring algorithmic bias to analyze the reliability of AI tools that predict depression risk using smartphone sensed-behavioral data. npj Mental Health Research, 3(17). https://doi.org/10.1038/s44184-024-00057-y 

Grants

MH111610, MH128640, MH115882