🤖 Real or Fake

The Most Valuable Dataset
on the Internet

Every user who identifies AI-generated text creates a human-labeled detection example. Collect thousands per day. Train a detector that actually works.

Two Texts. One Question.
Which Is Human?

Human Text

"my dog kept waking me up at 3am so I finally just got up and we watched TV together, he seemed very pleased with himself"

AI Text

"As the dawn broke over the horizon, painting the sky in hues of amber and rose, I contemplated the profound mysteries of existence..."

Users pick the text they believe is human. Their choice plus the time they took to decide becomes a labeled training pair. At scale, those pairs add up to a dataset for training an AI content detector.
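As a rough illustration of what one labeled pair could look like in storage, here is a minimal sketch. The `LabeledPair` schema, the `record_round` helper, and all field names are assumptions made for this example, not a description of an existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledPair:
    """One round of the game stored as a training example (hypothetical schema)."""
    human_text: str
    ai_text: str
    user_was_correct: bool    # did the user pick the human-written text?
    decision_seconds: float   # how long the user took to choose


def record_round(text_a: str, text_b: str, human_is_a: bool,
                 user_picked_a: bool, decision_seconds: float) -> LabeledPair:
    """Turn one displayed pair plus the user's click into a LabeledPair.

    Assumes the site already knows which of the two texts is human-written.
    """
    human_text, ai_text = (text_a, text_b) if human_is_a else (text_b, text_a)
    return LabeledPair(
        human_text=human_text,
        ai_text=ai_text,
        user_was_correct=(user_picked_a == human_is_a),
        decision_seconds=decision_seconds,
    )


pair = record_round(
    "my dog kept waking me up at 3am",
    "As the dawn broke over the horizon",
    human_is_a=True,
    user_picked_a=True,
    decision_seconds=2.4,
)
```

Storing whether the user was correct alongside the decision time keeps both signals available later, for example to weight quick, confident answers differently from slow ones.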

Why This Data Is
Extraordinarily Valuable

🎯 High Quality Labels

Every text's origin is known, and every label records a real human judgment. No synthetic labels, just real people deciding in real time.

📈 Exponential Demand

As AI-generated content floods the internet, AI detection is one of the fastest-growing needs in enterprise and media.

🔄 RLHF Ready

Preference pairs (human text preferred over AI text) match the chosen/rejected format that reward models in RLHF fine-tuning are trained on.
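To make the format concrete, here is a small sketch of how rounds might be exported as preference pairs. The `prompt` / `chosen` / `rejected` keys follow a common convention in preference-tuning tooling; the helper names and the choice to keep only rounds where the user picked the human text are assumptions for this example.

```python
def to_preference_pair(human_text: str, ai_text: str, prompt: str = "") -> dict:
    """Map one round to a chosen/rejected record (human text is the chosen side)."""
    return {"prompt": prompt, "chosen": human_text, "rejected": ai_text}


def build_preference_dataset(rounds):
    """Build preference records from (human_text, ai_text, user_was_correct) tuples.

    Assumption: only rounds where the user actually picked the human text
    count as "human preferred over AI", so the rest are skipped.
    """
    return [
        to_preference_pair(human_text, ai_text)
        for human_text, ai_text, user_was_correct in rounds
        if user_was_correct
    ]


dataset = build_preference_dataset([
    ("he seemed very pleased with himself", "hues of amber and rose", True),
    ("short human note", "long ornate passage", False),
])
```

Rounds where the user picked the AI text are still useful for detection training, so a real pipeline would likely keep them in a separate file rather than discard them.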

🔬 Constantly Fresh

AI writing styles evolve. Your dataset updates daily as users interact with your site — always current, never stale.