Learning to Defer in Content Moderation: The Human-AI Interplay


Successful content moderation on online platforms relies on human-AI collaboration. A typical heuristic estimates the expected harmfulness of a post and uses fixed thresholds to decide whether to remove it and whether to send it for human review. This heuristic disregards prediction uncertainty, the time-varying nature of human review capacity and post arrivals, and selective sampling in the dataset (humans review only posts filtered by the admission algorithm).
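
To make the fixed-threshold heuristic described above concrete, here is a minimal Python sketch. It is an illustration only, not the paper's method: the function name `fixed_threshold_policy`, the `Decision` type, the threshold values, and the choice to review posts whose score falls in a band around the removal threshold are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    remove: bool          # take the post down automatically
    send_to_review: bool  # queue the post for a human reviewer


def fixed_threshold_policy(
    expected_harm: float,
    remove_threshold: float = 0.5,  # hypothetical value
    review_low: float = 0.3,        # hypothetical value
    review_high: float = 0.7,       # hypothetical value
) -> Decision:
    """Apply static thresholds to a point estimate of harmfulness.

    The rule looks only at the point estimate. It ignores how uncertain
    the estimate is, how many reviewers are available right now, and
    how many posts are arriving.
    """
    remove = expected_harm >= remove_threshold
    send_to_review = review_low <= expected_harm <= review_high
    return Decision(remove=remove, send_to_review=send_to_review)
```

Because the thresholds are constants, two posts with the same estimated harmfulness always receive the same decision, regardless of how confident the classifier is or how long the review queue has grown. These are the limitations the abstract points to.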

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Human-Computer Interaction (cs.HC); Performance (cs.PF)
Cite as:	arXiv:2402.12237 [cs.LG]
	(or arXiv:2402.12237v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.12237

Submission history

From: Wentao Weng

[v1] Mon, 19 Feb 2024 15:47:47 UTC (205 KB)

[v2] Fri, 26 Apr 2024 14:33:24 UTC (207 KB)

[v3] Sun, 2 Jun 2024 16:02:24 UTC (407 KB)