Learning to Defer in Content Moderation: The Human-AI Interplay



Successful content moderation on online platforms relies on human-AI collaboration. A typical heuristic estimates the expected harmfulness of a post and uses fixed thresholds to decide whether to remove it and whether to send it for human review. This heuristic disregards three things: the uncertainty of the prediction, the time-varying nature of human review capacity and post arrivals, and the selective sampling in the dataset (humans only review posts that the admission algorithm has filtered).
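The fixed-threshold heuristic described above can be sketched as follows. This is a minimal illustration, not the paper's method: the function name `fixed_threshold_policy`, the `Decision` type, and all threshold values are hypothetical, and the choice to review posts whose score falls in a middle band is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    remove: bool          # take the post down
    send_to_review: bool  # queue the post for a human reviewer


def fixed_threshold_policy(
    harm_score: float,
    remove_threshold: float = 0.5,  # hypothetical value
    review_low: float = 0.3,        # hypothetical value
    review_high: float = 0.7,       # hypothetical value
) -> Decision:
    """Decide on a post from a point estimate of its harmfulness alone.

    The thresholds never change, so the rule ignores how uncertain the
    estimate is, how many reviewers are free right now, and how many
    posts are arriving.
    """
    if not 0.0 <= harm_score <= 1.0:
        raise ValueError("harm_score is assumed to lie in [0, 1]")
    remove = harm_score >= remove_threshold
    # Assumption: borderline scores are the ones sent to humans.
    send_to_review = review_low <= harm_score <= review_high
    return Decision(remove=remove, send_to_review=send_to_review)


if __name__ == "__main__":
    for score in (0.1, 0.4, 0.6, 0.9):
        print(score, fixed_threshold_policy(score))
```

Because the review band is fixed, the same borderline post is queued whether the review team is idle or overloaded, which is the limitation the abstract points out.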
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Human-Computer Interaction (cs.HC); Performance (cs.PF)
Cite as:
(or arXiv:2402.12237v3 [cs.LG] for this version)

Submission history

From: Wentao Weng
[v1] Mon, 19 Feb 2024 15:47:47 UTC (205 KB)
[v2] Fri, 26 Apr 2024 14:33:24 UTC (207 KB)
[v3] Sun, 2 Jun 2024 16:02:24 UTC (407 KB)