New framework shifts cyberbullying governance from static detection to proactive moderation
A unified four-stage model tackles online toxicity continuously, not just after posts go viral.
A new arXiv paper by Yiting Huang and eight co-authors introduces a unified, full-lifecycle governance framework for cyberbullying on social media. The authors argue that existing research treats moderation as passive, static detection at the level of individual posts, overlooking user behavior dynamics, the structural diffusion of toxicity, and the need for proactive intervention. Their proposed paradigm shifts governance toward integrated, continuous, and proactive moderation across four interconnected stages: (1) Content Identification, (2) User and Behavior Modeling, (3) Diffusion Dynamics and Early Warning, and (4) Intervention and Governance.
Beyond outlining the framework, the paper systematically reviews state-of-the-art literature, datasets, and evaluation practices for each stage. It highlights emerging challenges such as handling multimodal content (text, images, video), ensuring explainability in moderation decisions, maintaining algorithmic fairness across demographics, and managing the dual-use risks of generative AI that could amplify toxicity. The authors present a roadmap for future research aimed at building safer, more resilient digital ecosystems. This work provides a comprehensive foundation for both researchers and platform designers seeking to move from reactive censorship to intelligent, preventive governance of online harm.
- Proposes a four-stage framework: content identification, user/behavior modeling, diffusion dynamics/early warning, and intervention/governance.
- Shifts cyberbullying governance from isolated post-level detection to continuous, proactive moderation.
- Addresses emerging challenges including multimodality, explainability, algorithmic fairness, and generative AI dual-use risks.
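
To make the four stages concrete, here is a minimal Python sketch of how they might chain together. It is an illustration only, not the authors' method: the keyword lexicon, the scoring weights, the thresholds, and every function and field name (`identify_content`, `model_user`, `early_warning`, `intervene`, `Post`) are hypothetical stand-ins for the trained models and policies a real platform would use.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon standing in for a trained toxicity classifier.
TOXIC_TERMS = {"loser", "idiot", "worthless"}


@dataclass
class Post:
    post_id: str
    author: str
    text: str
    reshares: int = 0  # crude proxy for how far the post has spread


def identify_content(post: Post) -> float:
    """Stage 1 (assumed): fraction of words found in the toy lexicon."""
    words = [w.strip(".,!?").lower() for w in post.text.split()]
    if not words:
        return 0.0
    return sum(w in TOXIC_TERMS for w in words) / len(words)


def model_user(history: list[float]) -> float:
    """Stage 2 (assumed): mean of the author's past toxicity scores."""
    return sum(history) / len(history) if history else 0.0


def early_warning(post: Post, content_score: float, user_risk: float) -> float:
    """Stage 3 (assumed): blend content, user, and spread signals.

    The 0.5 / 0.3 / 0.2 weights are arbitrary illustration values.
    """
    spread = min(post.reshares / 100, 1.0)
    return 0.5 * content_score + 0.3 * user_risk + 0.2 * spread


def intervene(risk: float) -> str:
    """Stage 4 (assumed): map a risk score to a graduated response."""
    if risk >= 0.5:
        return "escalate_to_human_review"
    if risk >= 0.2:
        return "show_nudge"
    return "no_action"


def govern(post: Post, history: list[float]) -> str:
    """Run one post through all four stages in order."""
    content_score = identify_content(post)
    user_risk = model_user(history)
    risk = early_warning(post, content_score, user_risk)
    return intervene(risk)


if __name__ == "__main__":
    post = Post("p1", "user_a", "You are a worthless loser", reshares=80)
    print(govern(post, history=[0.5, 0.3]))
```

The point of the sketch is the data flow: the intervention decision depends on the post, its author's history, and its spread together, rather than on a post-level score alone.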
Why It Matters
This framework could enable platforms to preemptively disrupt cyberbullying diffusion rather than only reacting after harm occurs.