Many psychologists and psychiatrists have shared the vision, noting that fewer than half of people with a mental disorder receive therapy, and those who do might get only 45 minutes per week. Researchers have tried to build tech so that more people can access therapy, but they have been held back by two things.
The first is safety: a therapy bot that says the wrong thing could result in real harm. That’s why many researchers have built bots using explicit programming: The software pulls from a finite bank of approved responses (as was the case with ELIZA, a mock-psychotherapist computer program built in the 1960s). But this makes them less engaging to chat with, and people lose interest. The second is that the hallmarks of good therapeutic relationships—shared goals and collaboration—are hard to replicate in software.
In 2019, as early large language models like OpenAI’s GPT were taking shape, the researchers at Dartmouth thought generative AI might help overcome these hurdles. They set about building an AI model trained to give evidence-based responses. They first tried training it on general mental-health conversations pulled from internet forums. Then they turned to thousands of hours of transcripts of real sessions with psychotherapists.
“We got a lot of ‘hmm-hmms,’ ‘go ons,’ and then ‘Your problems stem from your relationship with your mother,’” said Michael Heinz, a research psychiatrist at Dartmouth College and Dartmouth Health and first author of the study, in an interview. “Really tropes of what psychotherapy would be, rather than actually what we’d want.”
Dissatisfied, they set to work assembling their own custom data sets grounded in evidence-based practices, and that material is what ultimately went into the model. Many AI therapy bots on the market, in contrast, might be just slight variations of foundation models like Meta’s Llama, trained mostly on internet conversations. That poses a problem, especially for topics like disordered eating.
“If you were to say that you want to lose weight,” Heinz said, “they will readily support you in doing that, even if you will often have a low weight to start with.” A human therapist wouldn’t do that.
To test the bot, the researchers ran an eight-week clinical trial with 210 participants who had symptoms of depression or generalized anxiety disorder or were at high risk for eating disorders. About half had access to Therabot, and a control group did not. Participants responded to prompts from the AI and initiated conversations, averaging about 10 messages per day.
Participants with depression experienced a 51% reduction in symptoms, the best result in the study. Those with anxiety experienced a 31% reduction, and those at risk for eating disorders saw a 19% reduction in concerns about body image and weight. These measurements are based on self-reporting through surveys, a method that’s not perfect but remains one of the best tools researchers have.