Cove | AI-Powered Trust & Safety

Image by upklyak on Freepik
Welcome to the second part of our series on why platform-specific content policies matter. In the last post, we covered what content policies are, why they’re critical to a well-functioning platform, and why it’s important that they capture the unique nuances and characteristics of the platform for which they’re written. In this edition, we’ll explore the challenges of enforcing platform-specific policies, and ways to overcome these challenges.

Existing Solutions: One Size Fits None

Despite the critical importance of platform-specific policies, most existing solutions offer a generic approach to policy enforcement. These solutions fall into two main categories: rules-based and keyword matching tools, and generic AI models. Each approach presents significant drawbacks.

Rules-Based and Keyword Matching Tools

Rules-based and keyword matching tools rely on predefined sets of rules and keywords to identify policy violations. While these tools can be straightforward and easy to implement, they have significant limitations.
  • Lack of Nuance: These tools often miss the context and intent behind user actions. For example, keyword-based hate speech detection will flag innocuous uses of certain words while missing more sophisticated or coded language used to harass or intimidate (see the sketch after this list).
  • Evolving Harms: As harmful behavior evolves, these static tools struggle to keep up. New slang, coded language, and context-specific terms can easily bypass keyword filters, rendering them ineffective.
  • Crude Enforcement: The blunt nature of these tools often leads to high rates of false positives (flagging acceptable content) and false negatives (missing harmful content), undermining user trust and satisfaction.
  • High Effort: It’s nearly impossible to curate robust keyword lists that cover all types of harm (e.g. hate speech, violence, scams) in all languages and social contexts. Keywords will therefore never be sufficient on their own.
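To make these failure modes concrete, here is a minimal, purely illustrative sketch of a keyword filter. The keyword list and example strings are hypothetical, but the false positive and false negative mirror the problems described above.

```python
# Minimal keyword filter sketch (illustrative only; keyword list is hypothetical).

BLOCKED_KEYWORDS = {"kill", "scam"}

def violates_policy(text: str) -> bool:
    """Flag content if any blocked keyword appears as a standalone word."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & BLOCKED_KEYWORDS)

# False positive: an innocuous use of a blocked word gets flagged.
print(violates_policy("This workout will kill your legs"))  # True

# False negative: coded language sails straight through.
print(violates_policy("DM me for a guaranteed 10x return, limited spots"))  # False
```

No amount of keyword curation fixes this, because the filter never sees context or intent.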

Generic AI Models

For the last decade, generic AI models have been the gold standard in automated Trust & Safety. By leveraging machine learning to detect policy violations, they provide a more advanced approach than keyword-based detection. However, these models also come with significant drawbacks, particularly their lack of customization and their fixed policy sets.
  • Policy Definition Mismatch: As we explained in the previous post, platform-specific content policies are critical. Each platform has unique interpretations of what constitutes spam, hate speech, or other violations. A handmade goods marketplace might have a definition of spam that includes mass-produced items misrepresented as handmade, which a generic model might not recognize. An ed-tech platform might have a much more stringent interpretation of what constitutes overly sexually explicit content than a dating app. There is no universal consensus on how content policies should be defined, yet generic AI models are trained on one specific policy definition chosen by the vendor, which likely does not match its customers’ own policy definitions and leads to frequent enforcement errors.
  • Poor Generalization: Models learn from the data on which they are trained, and they often don’t generalize well to types of content they haven’t seen before. Most generic AI models are trained on “traditional” user-generated content, which is largely social media posts, so they can perform poorly when applied to marketplace listings or user profiles. For platforms that don’t look like Facebook or Reddit, these generic models may not be a good fit.
  • Incomplete Policy Coverage: Generic AI model vendors typically offer a fixed set of models (e.g. one for hate speech, one for violence), which cannot cover the full range of policies required by different platforms. If you have a policy against drug sales and your provider doesn’t offer a model to detect drug sales, you’re out of luck (see the sketch after this list). This leaves significant gaps in policy enforcement, necessitating manual intervention or leaving some areas entirely unchecked.
  • Inflexibility: As platforms evolve, their policies need to adapt. However, generic models are static and cannot be easily updated to reflect these changes. Most large vendor solutions struggle to incorporate new labeled data or adapt to changes in policy definitions, such as recognizing new types of hate speech in response to world events. When policy teams make these crucial updates, the generic models don’t enforce them effectively.
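The coverage gap is easy to see if you compare a vendor’s fixed label set with a platform’s own policy taxonomy. The labels below are hypothetical and the classifier is a stand-in, not any specific vendor’s API.

```python
# Illustrative sketch of the fixed-taxonomy problem; label names are hypothetical.

VENDOR_LABELS = {"hate_speech", "violence", "sexual_content", "spam"}

# A handmade-goods marketplace's own policy taxonomy.
PLATFORM_POLICIES = {
    "hate_speech",
    "violence",
    "drug_sales",                 # vendor offers no model for this
    "mass_produced_as_handmade",  # the platform's own definition of "spam"
}

def classify_with_vendor_model(text: str) -> dict[str, float]:
    """Stand-in for a generic vendor classifier: it can only ever score
    the labels it was trained on, under the vendor's definitions."""
    return {label: 0.0 for label in VENDOR_LABELS}  # scores elided

uncovered = sorted(PLATFORM_POLICIES - VENDOR_LABELS)
print(f"Policies with no model coverage: {uncovered}")
# -> ['drug_sales', 'mass_produced_as_handmade']
```

Even where labels overlap in name (“spam”, “hate_speech”), the vendor’s definition of each label is fixed, so definition mismatches persist.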

The Pitfall of Uniform Enforcement

Uniform enforcement across different platforms leads to a lack of differentiation and competitive edge. If every platform uses the same generic models, the unique aspects of each platform’s community standards and values are lost. This not only impacts user experience but also diminishes the platform’s ability to stand out in a crowded market.

The Need for Custom AI Models

To address these challenges, platforms need AI models that are customized to their specific policies. These models should be built from the ground up, starting from the policy itself, so they reflect the unique guidelines and data characteristics of each platform and deliver effective, adaptable, and comprehensive enforcement.
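One generic way to “start from the policy itself” is to condition an enforcement model on the platform’s own policy text, so the policy wording (rather than a vendor-chosen definition) drives each decision. The sketch below is a simplified illustration of that idea, not a description of Cove’s implementation; the policy text and helper names are hypothetical.

```python
# Hypothetical sketch: condition an enforcement decision on the platform's own policy text.

SPAM_POLICY = (
    "Spam includes mass-produced items misrepresented as handmade, "
    "repetitive listings, and solicitation of off-platform payment."
)

def build_enforcement_prompt(policy_text: str, content: str) -> str:
    """Assemble a prompt that asks a model to judge content against the
    platform's own policy wording."""
    return (
        "You are a content moderator. Apply ONLY the policy below.\n\n"
        f"POLICY:\n{policy_text}\n\n"
        f"CONTENT:\n{content}\n\n"
        "Answer VIOLATES or ALLOWED, and cite the policy clause that applies."
    )

prompt = build_enforcement_prompt(
    SPAM_POLICY,
    "Handmade ceramic mug, ships in bulk pallets of 500 units.",
)
# `prompt` would be sent to whichever model the platform uses; when the policy
# team updates the policy text, enforcement follows without retraining a fixed model.
```

Because the policy text is an input rather than something baked into the model’s training labels, updating enforcement can be as simple as updating the policy.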

Cove’s Custom AI Models

We solve this problem by starting from your policies and constructing models directly based on them. By leveraging the expertise of your policy team, we ensure our models accurately reflect the unique requirements and nuances of your platform. Instead of relying on generic models that may not align with your specific needs, our approach ensures precise, effective enforcement tailored to your evolving policies. With Cove, you can maintain the integrity of your platform, foster user trust, and stay ahead of the challenges unique to your digital community.
Want to learn more about building your Custom AI Models?
Schedule a demo today