Research & Papers

LLM-Augmented Therapy Normalization and Aspect-Based Sentiment Analysis for Treatment-Resistant Depression on Reddit

A new study uses LLM-augmented analysis on 23,399 medication mentions to reveal real-world patient experiences with treatment-resistant depression.

Deep Dive

A research team from Emory University and the University of Florida has published a study that applies natural language processing to patient experiences with treatment-resistant depression (TRD). The researchers curated 5,059 Reddit posts from 3,480 users across 28 mental health subreddits, spanning 2010 to 2025. After normalizing brand names and colloquial terms to generic names with a lexicon-based approach, they identified 23,399 mentions of 81 generic medications. The result is a large, patient-generated corpus for analyzing real-world perceptions of TRD treatments.
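A minimal sketch of what lexicon-based normalization can look like is shown here, assuming a simple dictionary that maps surface forms (brand names, generic names) to generic names. The `LEXICON` entries and the `normalize_mentions` helper are illustrative assumptions, not the study's actual lexicon or code.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: surface form -> generic name.
# The study's real lexicon (81 generics plus variants) is not reproduced here.
LEXICON = {
    "zoloft": "sertraline",
    "sertraline": "sertraline",
    "prozac": "fluoxetine",
    "fluoxetine": "fluoxetine",
    "effexor": "venlafaxine",
    "venlafaxine": "venlafaxine",
    "spravato": "esketamine",
    "esketamine": "esketamine",
    "ketamine": "ketamine",
}

# Whole-word, case-insensitive match; longer terms are listed first so they
# are preferred when one term is a prefix of another.
_TERMS = sorted(LEXICON, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in _TERMS) + r")\b",
    re.IGNORECASE,
)


def normalize_mentions(text):
    """Return the generic name of each medication mention, in order of appearance."""
    return [LEXICON[m.group(1).lower()] for m in _PATTERN.finditer(text)]


post = "Zoloft did nothing for me, but Spravato helped after Effexor failed."
mentions = normalize_mentions(post)  # ['sertraline', 'esketamine', 'venlafaxine']
counts = Counter(mentions)           # per-medication mention counts for this post
```

Summing such per-post counts over a corpus gives mention totals per generic medication. A production lexicon would also need misspellings and slang terms, which this sketch omits.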

To analyze sentiment, the team fine-tuned Microsoft's DeBERTa-v3 language model on the SMM4H 2023 therapy-sentiment Twitter corpus, using large language model (LLM)-based data augmentation to improve performance. The classifier achieved a micro-F1 score of 0.800 on a standard test set. Applied to the Reddit data, the model labeled most mentions as neutral (72.1%), with negative mentions (14.8%) slightly outnumbering positive ones (13.1%). Sentiment differed significantly across drug classes: conventional antidepressants such as SSRIs and SNRIs showed consistently higher negative proportions, while newer treatments such as ketamine and esketamine had more favorable sentiment profiles.
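For readers unfamiliar with the metric, the sketch below shows how micro-averaged F1 and class shares can be computed from per-mention labels. The toy labels are invented for illustration and are not data from the study. Averaging over all three classes is an assumption here, and `micro_f1` and `sentiment_shares` are hypothetical helper names.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def micro_f1(y_true, y_pred, labels=LABELS):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this class, but truth was another
            elif truth == label:
                fn += 1  # truth was this class, but predicted another
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sentiment_shares(predictions, labels=LABELS):
    """Fraction of mentions assigned to each sentiment class."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: counts[label] / total for label in labels}


# Toy example (not study data): 3 of 4 predictions are correct.
y_true = ["neutral", "neutral", "negative", "positive"]
y_pred = ["neutral", "negative", "negative", "positive"]

score = micro_f1(y_true, y_pred)    # 0.75
shares = sentiment_shares(y_pred)   # negative 0.5, neutral 0.25, positive 0.25
```

When every mention receives exactly one label and all classes are included, micro-F1 equals plain accuracy, which is why the toy score matches 3 correct out of 4.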

This research demonstrates a scalable method for extracting patient-centric insights from online forums. By combining automated medication normalization with aspect-based sentiment analysis, the study complements clinical trials, capturing patient-reported tolerability and experiences that traditional research settings often miss. Year-by-year and subreddit-specific tracking also shows how patient discourse and perception change over time and across online communities.

Key Points
  • Analyzed 5,059 Reddit posts (2010-2025) containing 23,399 mentions of 81 medications for treatment-resistant depression.
  • A DeBERTa-v3 model fine-tuned with LLM-augmented data achieved a micro-F1 score of 0.800 for aspect-based sentiment classification.
  • Found ketamine/esketamine had more favorable sentiment than conventional SSRIs/SNRIs, with overall sentiment being 72.1% neutral, 14.8% negative, and 13.1% positive.

Why It Matters

Provides large-scale, real-world patient sentiment data to complement clinical trials, potentially informing treatment decisions and drug development for severe depression.