sia.hackernoon.com

In this article, Aleksandr Karabatov, Project Manager at Social Discovery Group, explores the development and implementation of an AI-based moderation system designed to handle profiles and user-generated content on a large dating platform. The piece delves into the challenges faced, technological choices made, and the significant business impacts of transitioning from manual to AI-driven moderation.

Within nine months, our SDG team successfully developed a fully autonomous AI-based moderation system for profiles and user-generated content on an online dating platform with more than 1 million daily active users, sending 5 million messages and uploading hundreds of thousands of photos. 📸

This project became a significant research initiative, essentially a large-scale experiment that yielded valuable insights. 🚀 In this article, I want to share the critical decisions we made and their outcomes.

Moderation System Challenges

Speed ⚡

From a user’s perspective, the moderation process during dating site registration may appear straightforward — a brief self-description and a few images. However, at scale, for high-traffic platforms, this results in substantial processing queues. Accounting for fluctuations due to marketing campaigns, these processing queues can be reasonably forecasted.

The next issue concerns the balancing of SLA (Service Level Agreement) and CR2 (Conversion Rate to Registration). In a manual moderation system, every second saved equals one more staff member onboarded. A human moderator has a work schedule, a workstation, and might fall ill.

Additionally, employee attrition makes recruitment a continuous, costly cycle. Even with near-infinite resources, manual moderation takes time: open a case, make a decision, send a response. All of this negatively impacts the amount of user registrations — people don’t get the initial service for some time and drop off. Then, there’s another major issue: errors.

Quality🎯:
Mistakes are inevitable. Even advanced systems have yet to achieve 100% accuracy. The variability in results depends on several factors, primarily the clarity of the task. For example, if you ban users with names starting with the letter “A,” you’ll get a minimal error rate. But should we allow “The Alex”? Or “mr.Alex”? A heuristic system will say yes, a human would say no. An AI would handle this task with a well-crafted prompt.

A harder challenge is determining whether a user intended to scam other users. 🚩 One more challenge observed in manual moderation is moderators making speculative or subjective judgments in such scenarios, such as “Scammer-style email”; “I’ve definitely seen this photo before”; “Ten registrations in a row from Paris on an Asian dating site? Definitely a fake!”

Refining moderation checklists can help ✅, but errors will still happen. Plus, experience cannot simply be transferred digitally via the cloud; restraining consumes valuable time. Fatigue can lead to a decline in performance quality, necessitating additional staff and increased costs. Additionally, there is always the risk of intentional misconduct.

Conclusion 💡:

Manual moderation systems suffer from slow decision-making, complex staffing, and difficult knowledge transfer — making them slow, expensive, and not consistently reliable. Clearly, there is an opportunity for innovation.

The Choice of The Technology

We decided to add AI elements to our moderation system. The technology selected depends significantly on the task and available budget. Since we needed to work with both text and images, we required models with strong vision capabilities.

We reviewed services specializing in automated moderation, considered fine-tuning open-source models, but eventually landed on ChatGPT. 📚 By prompting the model clearly — such as “You’re a moderator on a dating site,” — it efficiently identifies financial scams, background individuals, and other complex issues with over 80% accuracy. It’s already cheaper than a full moderation team for the same workload, and there’s still plenty of room to optimize.

Prompting

Using LLM models for moderation requires a well-crafted prompt. Prompt engineering is not something typically taught at universities. Our ML engineers confirmed that no straightforward, ready-to-use solutions currently exist, so we began conducting our own experiments.

We quickly rejected the idea of outsourcing prompt creation. ⚠️ It’s preferable to have an in-house developer who, even without specialist experience, can continuously refine prompts. Our task demands constant adjustments due to frequently changing inputs impacting moderation outcomes. Fraudulent users discovering loopholes necessitate prompt modification. When trends change, the prompt must be updated. Model upgrades also require prompt revisions. For example, on Halloween, you risk falsely rejecting photos of happy customers posing with a plastic knife in their head.

I’ve highlighted a few basic key findings about prompt:

Introduction: Briefly explain the task and purpose to the model;
Actions: it is necessary to precisely and concisely instruct the models on what constitutes problematic content and specify the appropriate response to return upon detection or absence of such content;
Examples: It is extremely important to provide several examples for each type of issue, showing what we consider acceptable and what we do not. Any exceptions should be clearly described;
**Notes:**Context is essential for the model to arrive at accurate decisions; however, longer prompts increase the likelihood of hallucinations. It is essential to manage the quantity and context within prompts carefully to find the balance between the number of different prompts and the amount of context they contain.

Data Labeling

Effective prompt development inevitably requires precise data labeling. The quality of labeled data directly impacts model performance. While there are many crowd-sourcing services, we decided to build our own labeling team.

You can find lots of guides online on how to set up a data labeling pipeline, and these are the key points I think matter most:

Clear Taxonomy: Provide a comprehensive list of all possible labels, each with detailed definitions and multiple examples.
Team: Each data item must be labeled by at least three independent labelers to reduce bias and improve reliability through consensus.
Communication: Do onboarding sessions, go over sample cases together, and allow labelers to mark something as “unclear” if necessary.
Improvement: After each round of labeling, refine the taxonomy and adjust the team if quality concerns arise.

One of the most important criteria for a complete dataset is the inclusion of a sufficient number of randomly selected positive and negative events across all anticipated content types. 📊For instance, a prompt optimized for adult-related content may fail to deliver accurate results when used with child-related content.

System Architecture

At this stage, we have an initial prompt that produces satisfactory results on a trusted dataset. We’ve aligned on the acceptable thresholds for our AI moderator in terms of precision and recall, and are ready for system integration. 🔄 There are a few important points to keep in mind.

Such systems require ongoing validation and refinement. It’s essential to set up a process for regularly annotating fresh results and monitoring accuracy across different content types.
Human-in-the-loop methodology. A human should review corner cases where the AI isn’t 100% confident. This reduces immediate risks and, more importantly, enables continuous model improvement and prompt refinement.

Business Impact 🎯

The initial integration significantly improved user registration efficiency, reducing processing time by a factor of 60 while maintaining moderation quality. Additionally, automation helped us standardize the objectivity of decisions, enabling us to quickly identify new issues and needs and improve the process. The system is already reducing costs for the company, and further optimization lies ahead.

Up to this point, we have used the most popular and sophisticated solutions available on the market. However, due to the rapid growth of AI, we now have access to a broad range of providers. Moreover, OpenAI released several significant updates during our project development.

We designed a flexible architecture capable of supporting multiple models simultaneously, enabling precise tuning of specific components for different models and swift replacement as needed. This extends to applying distinct models to various categories across content groups — for instance, one model addresses straightforward requests from the Asian market, while another manages complex queries from the European market.

Stage 0–1 Summary

Our internal startup was developed by a dedicated team of just six permanent members. At various project stages, we leveraged additional expertise from colleagues but the core team remained small. In a short period, our work was successfully integrated into a full-scale product, significantly motivating our team. Throughout the project, we experienced a true sense of experimentation, rapidly testing a wide range of hypotheses, developing unique approaches, and implementing them live.

✅ The system has become faster, more cost-effective, and more consistent in quality, leading me to conclude that the objectives of the first phase have been met.

Case Study: How We Built an AI-Based Moderation System