How our detector works
No machine learning model. No server. Just statistics you can understand.
Commercial AI detectors use proprietary models trained on millions of text samples. We can't replicate that in a browser tab — and we don't try. Instead, our free checker uses five heuristic signals that research and practice suggest correlate with AI-generated text.
The result is a 0–100 score where higher = more likely AI. It's a rough estimate, not a verdict.
1. Perplexity proxy (30% weight)
Language models tend to pick probable next words. Human writers sometimes surprise you. Real perplexity needs a language model to compute, so we approximate it: we measure how predictable word pairs (bigrams) are using a frequency table of common English bigrams.
Very predictable text scores higher on the AI scale. Unusual word combinations score lower.
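Here's a rough sketch of the idea in Python. It is an illustration, not the checker's actual code: the tiny COMMON_BIGRAMS set and the function name are hypothetical, and a plain set stands in for a real frequency table, which would store how often each pair occurs.

```python
import re

# Hypothetical stand-in for a frequency table of common English bigrams.
COMMON_BIGRAMS = {
    ("of", "the"), ("in", "the"), ("it", "is"), ("to", "be"),
    ("on", "the"), ("is", "a"), ("one", "of"),
}

def bigram_predictability(text: str) -> float:
    """Return the share of adjacent word pairs found in COMMON_BIGRAMS (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0  # fewer than two words: nothing to measure
    hits = sum(1 for pair in pairs if pair in COMMON_BIGRAMS)
    return hits / len(pairs)

print(bigram_predictability("It is one of the best"))
```

A higher share means more predictable text, which would push this signal toward the AI end of the scale.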
2. Burstiness (25% weight)
Humans vary. Short punchy sentence. Then a longer one that meanders a bit because that's how people actually write when they're thinking out loud.
AI tends toward uniform sentence and paragraph lengths. We calculate the standard deviation of sentence lengths and paragraph lengths. Low variation = higher AI score.
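As a minimal sketch of the sentence-length half of this signal, assuming a naive split on ., ! and ? (the function name is hypothetical, and the checker's real sentence splitting may differ):

```python
import re
import statistics

def sentence_length_spread(text: str) -> float:
    """Population standard deviation of sentence lengths, measured in words."""
    # Naive sentence split: break on runs of ., ! or ? and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0  # one sentence has no variation to measure
    return statistics.pstdev(lengths)

print(sentence_length_spread("Short. This one is a much longer sentence here."))
```

The same calculation could be run on paragraph lengths by splitting on blank lines instead. A value near zero means very uniform text.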
3. Vocabulary richness (20% weight)
Two measures:
- Type-token ratio: unique words divided by total words. AI often reuses the same vocabulary, which lowers the ratio.
- Hapax ratio: the share of words that appear only once. Humans use more one-off word choices, which raises it.
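Both measures can be sketched in a few lines of Python. This is an illustration under assumptions: the function name is hypothetical, words are lowercased runs of letters and apostrophes, and the hapax ratio is divided by total words (dividing by unique words would be an equally plausible choice).

```python
import re
from collections import Counter
from typing import Tuple

def vocabulary_richness(text: str) -> Tuple[float, float]:
    """Return (type-token ratio, hapax ratio) for the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0, 0.0
    counts = Counter(words)
    type_token_ratio = len(counts) / len(words)
    # Assumption: hapax ratio = words used exactly once / total words.
    hapax_ratio = sum(1 for c in counts.values() if c == 1) / len(words)
    return type_token_ratio, hapax_ratio

print(vocabulary_richness("the cat saw the dog"))
```

In the example sentence, four of the five words are unique and three appear exactly once.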
4. Structural patterns (15% weight)
We scan for phrases commonly overused by AI: "delve," "it's worth noting," "in today's landscape," "on the one hand / on the other hand," and dozens more.
We also check for heavy use of formal transition words ("however," "nevertheless") and repetitive sentence starters, both common AI habits.
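A simplified sketch of the phrase scan and the sentence-starter check might look like this in Python. The short AI_PHRASES list uses only phrases quoted above, and both function names are hypothetical; the real phrase list is longer.

```python
import re
from collections import Counter

# A few phrases quoted in the text above; the real list is much longer.
AI_PHRASES = ["delve", "it's worth noting", "in today's landscape"]

def phrase_hits(text: str) -> int:
    """Count occurrences of listed phrases (simple substring match)."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return sum(lowered.count(phrase) for phrase in AI_PHRASES)

def top_starter_share(text: str) -> float:
    """Share of sentences that begin with the single most common first word."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    starters = Counter(s.split()[0].lower() for s in sentences)
    return starters.most_common(1)[0][1] / len(sentences)

print(phrase_hits("It's worth noting that we delve often."))
print(top_starter_share("We go. We stay. They leave. We win."))
```

Substring matching also catches variants such as "delves" or "delved", which may or may not be what a real implementation wants.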
5. Punctuation patterns (10% weight)
AI text tends toward em dashes, semicolons, and parenthetical asides. Human informal writing uses more exclamation marks and question marks. We weigh formal vs informal punctuation density.
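One possible way to compare the two groups is sketched below in Python. The mark lists, the function name, and the choice to return a simple share (rather than a per-word density) are all assumptions made for illustration.

```python
# Assumed groupings, based on the marks named above.
FORMAL_MARKS = ["\u2014", ";", "("]   # em dash, semicolon, opening parenthesis
INFORMAL_MARKS = ["!", "?"]

def formal_punctuation_share(text: str) -> float:
    """Share of counted marks that are 'formal' (0.0 to 1.0)."""
    formal = sum(text.count(mark) for mark in FORMAL_MARKS)
    informal = sum(text.count(mark) for mark in INFORMAL_MARKS)
    total = formal + informal
    if total == 0:
        return 0.5  # no counted marks: treat as neutral
    return formal / total

print(formal_punctuation_share("Wait; really (yes)!"))
```

Only the opening parenthesis is counted so that one aside is not counted twice.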
Scoring thresholds
- 0–25: Likely human-written
- 26–55: Uncertain — mixed signals
- 56–100: Likely AI-generated
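To make the arithmetic concrete, here is a sketch of how five signals could be combined using the weights and thresholds listed on this page. It assumes each signal has already been normalized to 0–100 and that the final score is a plain weighted sum; the function names and the example values are hypothetical.

```python
# Weights as stated in the section headings above.
WEIGHTS = {
    "perplexity": 0.30,
    "burstiness": 0.25,
    "vocabulary": 0.20,
    "structure": 0.15,
    "punctuation": 0.10,
}

def combined_score(signals: dict) -> int:
    """Weighted sum of per-signal scores (each assumed 0-100), rounded."""
    total = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
    return round(total)

def label(score: int) -> str:
    """Map a 0-100 score to the bands listed above."""
    if score <= 25:
        return "Likely human-written"
    if score <= 55:
        return "Uncertain"
    return "Likely AI-generated"

example = {"perplexity": 80, "burstiness": 60, "vocabulary": 50,
           "structure": 40, "punctuation": 20}
score = combined_score(example)
print(score, label(score))
```

With these made-up inputs the sum is 24 + 15 + 10 + 6 + 2 = 57, which lands just inside the "Likely AI-generated" band.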
Known limitations
- Short text (<50 words) doesn't give enough signal. Results are unreliable.
- Edited AI text — especially humanized text — can score as human.
- Formal human writing — academic prose, legal text — can score as AI.
- Non-native English — detectors of all kinds struggle here. Ours is no exception.
- No training data: the checker isn't trained on labeled samples, so it can't tell which LLM wrote a text. It only estimates "machine-like" vs "human-like" patterns.
How we test commercial tools
For our comparison reviews, we use three standard test texts:
- A 500-word ChatGPT essay on a neutral topic (renewable energy)
- A human-written personal blog post with informal tone
- The same ChatGPT essay passed through a humanizer tool
We record each tool's score and whether we'd call it correct, uncertain, or wrong. No tool gets all three texts right.