[D] Modeling online discourse escalation as a state machine (dataset + labeling approach)
New framework treats Reddit arguments as state transitions with identity activation layers.
An independent researcher is proposing a novel computational framework that models online discourse escalation as a state machine, moving beyond simple toxicity classification to capture the dynamic progression of arguments. The model defines six distinct states: Neutral (information exchange), Disagreement, Identity Activation, Personalization, Ad Hominem, and Dogpile (multi-user targeting, deemed non-recoverable). Each comment in a thread receives a local state label, while the entire thread has a global state that evolves, creating a sequence classification problem. The researcher plans to collect and label public data from platforms like Reddit to build a training dataset.
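The six states and the local-vs-global distinction can be sketched in a few lines. The state names come from the post; the aggregation rule (global state as a running maximum, with Dogpile absorbing) is an assumption for illustration, since the post does not specify how local labels roll up to a thread-level state.

```python
from enum import IntEnum

class DiscourseState(IntEnum):
    """The six escalation states from the proposed framework,
    ordered from least to most escalated."""
    NEUTRAL = 0              # information exchange
    DISAGREEMENT = 1
    IDENTITY_ACTIVATION = 2
    PERSONALIZATION = 3
    AD_HOMINEM = 4
    DOGPILE = 5              # multi-user targeting; treated as non-recoverable

def thread_states(comment_states):
    """Global thread state as a running maximum over per-comment
    (local) states. Dogpile is modeled as an absorbing state, per
    the post's 'non-recoverable' framing."""
    trace, current = [], DiscourseState.NEUTRAL
    for s in comment_states:
        if current != DiscourseState.DOGPILE:
            current = max(current, s)
        trace.append(current)
    return trace
```

A thread that de-escalates locally still keeps its elevated global state under this rule, which is one concrete way to frame the "global state that evolves" described above.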
Key features for classification include linguistic signals (increases in second-person pronouns, sentiment shifts), structural patterns (reply velocity, thread depth), and contextual factors such as topic sensitivity. A second layer of analysis focuses on identity activation across three dimensions—personal, ideological, and group—with the hypothesis that simultaneous activation of multiple dimensions correlates with rapid escalation. The core research questions are whether to frame the task as per-comment classification or sequence modeling (using HMMs, RNNs, or transformers), how best to define ambiguous states, and how to handle emergent, multi-user properties like dogpiling.
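The linguistic and structural features named above are straightforward to compute per comment. A minimal sketch, using only the standard library; the pronoun list, field names, and the replies-per-minute definition of reply velocity are illustrative assumptions, not the researcher's actual feature set:

```python
import re
from dataclasses import dataclass

# Hypothetical second-person pronoun lexicon (assumption).
SECOND_PERSON = {"you", "your", "yours", "you're", "youre", "u", "ur"}

@dataclass
class CommentFeatures:
    second_person_rate: float  # share of tokens that are 2nd-person pronouns
    reply_velocity: float      # replies per minute since the parent comment
    depth: int                 # nesting depth in the thread

def extract_features(text, seconds_since_parent, n_replies, depth):
    """Compute a minimal per-comment feature vector covering the
    linguistic (pronoun rate) and structural (reply velocity, depth)
    signals mentioned in the post."""
    tokens = re.findall(r"[a-z']+", text.lower())
    rate = sum(t in SECOND_PERSON for t in tokens) / max(len(tokens), 1)
    velocity = n_replies / max(seconds_since_parent / 60.0, 1e-6)
    return CommentFeatures(rate, velocity, depth)
```

Sentiment shift and topic sensitivity would need external models or annotation, so they are omitted here.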
- Proposes a 6-state model (Neutral to Dogpile) to classify discourse escalation as a sequence problem.
- Uses multi-layered features: linguistic (pronoun shifts), structural (reply velocity), and identity activation signals.
- Aims to build a labeled dataset from Reddit to train ML models, moving beyond basic toxicity detection.
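Once threads are labeled as state sequences, the simplest sequence-model baseline is a first-order Markov transition matrix estimated from the data, which can then be compared against HMM, RNN, or transformer formulations. A sketch under stated assumptions (add-one smoothing, six integer-coded states matching the model above):

```python
def transition_matrix(labeled_threads, n_states=6):
    """Estimate P(next_state | current_state) from labeled threads,
    each a list of integer state codes 0..n_states-1. Laplace
    (add-one) smoothing is an assumption, chosen so that unseen
    transitions keep nonzero probability."""
    counts = [[1.0] * n_states for _ in range(n_states)]
    for seq in labeled_threads:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    # Normalize each row into a probability distribution.
    return [[c / sum(row) for c in row] for row in counts]
```

Inspecting which off-diagonal entries dominate (e.g. Disagreement → Identity Activation) would directly test the escalation-pathway hypothesis against the labeled data.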
Why It Matters
Could enable platforms to detect and de-escalate toxic arguments early, improving online community health.