AIVerify · Accuracy & Methodology

Benchmark Results

Tested on a dataset of known AI-generated images and verified real photographs.

92%

AI Images Caught

Out of 200 known AI-generated images

8%

False Positive Rate

Real photos incorrectly flagged as AI

75%

Modern Models

Detection rate on Midjourney v6, DALL-E 3, and Flux (the hardest cases)

80%

Text Detection

AI-written content correctly identified

These numbers will change as AI models improve. We update this page when we run new tests. Last updated: June 2, 2026

Tested against: Midjourney v6, DALL-E 3, Stable Diffusion XL, Flux, real camera photographs from various devices
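
For readers who want to reproduce this kind of summary on their own test set, the two headline rates can be computed as in the sketch below. The 184 of 200 figure follows from the 92% stated above. The number of real photographs in the benchmark is not given on this page, so `real_total=100` and `real_flagged=8` are placeholder values chosen only to match the 8% rate. The class and function names are illustrative, not part of AIVerify.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkCounts:
    ai_total: int      # known AI-generated images tested
    ai_flagged: int    # of those, how many were flagged as AI
    real_total: int    # verified real photographs tested
    real_flagged: int  # of those, how many were wrongly flagged as AI


def detection_rate(c: BenchmarkCounts) -> float:
    """Share of AI-generated images that were caught."""
    return c.ai_flagged / c.ai_total


def false_positive_rate(c: BenchmarkCounts) -> float:
    """Share of real photographs incorrectly flagged as AI."""
    return c.real_flagged / c.real_total


# 184/200 reproduces the 92% above. The real-photo counts are PLACEHOLDERS:
# the page does not say how many real photographs were tested.
counts = BenchmarkCounts(ai_total=200, ai_flagged=184, real_total=100, real_flagged=8)

print(f"AI images caught:    {detection_rate(counts):.0%}")
print(f"False positive rate: {false_positive_rate(counts):.0%}")
```

Keeping the two rates separate matters: a detector that flags everything would catch every AI image but would also flag every real photo.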

How We Reach a Verdict

Three independent signals are combined into a single weighted confidence score.
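
As a rough illustration of what a weighted combination like this can look like, the sketch below uses the three weights stated in this section (55%, 25%, 20%). Everything else is an assumption: the function name, the 0-to-1 score scale, and especially the handling of an abstaining signal, which here has its weight redistributed proportionally across the remaining signals. This page does not describe how AIVerify actually treats abstention in the final score.

```python
from typing import Optional

# Weights as stated on this page.
WEIGHTS = {"visual": 0.55, "metadata": 0.25, "ela": 0.20}


def combine_signals(scores: dict[str, Optional[float]]) -> float:
    """Weighted average of the signals that did not abstain.

    Each score is assumed to lie in [0, 1], where higher means "more likely
    AI-generated". None means the signal abstained.
    ASSUMPTION: an abstaining signal's weight is shared out proportionally
    among the remaining signals (plain renormalization).
    """
    active = {name: s for name, s in scores.items() if s is not None}
    if not active:
        raise ValueError("every signal abstained; no verdict possible")
    total_weight = sum(WEIGHTS[name] for name in active)
    return sum(WEIGHTS[name] * s for name, s in active.items()) / total_weight


# All three signals present.
print(round(combine_signals({"visual": 0.9, "metadata": 0.5, "ela": 0.6}), 2))
# Metadata abstains (for example, a file with no EXIF block).
print(round(combine_signals({"visual": 0.9, "metadata": None, "ela": 0.6}), 2))
```

With renormalization, an image without metadata is scored only on the evidence that exists, rather than being pulled toward either verdict by a missing signal.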

55% weight

Visual Analysis

Claude AI examines the image for generation artifacts, unnatural patterns, diffusion model signatures, and logical inconsistencies. Contributes 55% of the combined score.

25% weight

Metadata (EXIF)

Checks for missing camera data, suspicious software markers, and date inconsistencies. If no metadata is present, this signal abstains entirely, so images without metadata are not penalized. Contributes 25% of the combined score.
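
The three checks and the abstain rule could be sketched roughly as follows. This is an illustration under stated assumptions, not AIVerify's implementation: the function works on EXIF tags that have already been extracted into a plain dictionary, the marker list is hypothetical, and the point values (0.3, 0.5, 0.2) are invented for the example. The tag names (`Make`, `Model`, `Software`, `DateTimeOriginal`) and the timestamp layout are standard EXIF.

```python
from datetime import datetime
from typing import Optional

# HYPOTHETICAL marker list, for illustration only.
GENERATOR_MARKERS = ("midjourney", "dall-e", "stable diffusion", "flux")
EXIF_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"  # standard EXIF timestamp layout


def metadata_score(exif: dict[str, str], now: Optional[datetime] = None) -> Optional[float]:
    """Return a suspicion score in [0, 1], or None to abstain."""
    if not exif:
        return None  # no metadata at all: abstain, do not penalize

    now = now or datetime.now()
    suspicion = 0.0  # the increments below are ASSUMED values

    # 1. Missing camera data.
    if not exif.get("Make") and not exif.get("Model"):
        suspicion += 0.3

    # 2. Suspicious software markers.
    software = exif.get("Software", "").lower()
    if any(marker in software for marker in GENERATOR_MARKERS):
        suspicion += 0.5

    # 3. Date inconsistencies: a capture time in the future, or unparseable.
    taken = exif.get("DateTimeOriginal")
    if taken:
        try:
            if datetime.strptime(taken, EXIF_DATE_FORMAT) > now:
                suspicion += 0.2
        except ValueError:
            suspicion += 0.2

    return min(suspicion, 1.0)


print(metadata_score({}))  # None: the signal abstains
print(metadata_score({"Software": "Midjourney v6"}))
```

Returning `None` instead of `0.0` keeps "no evidence" distinct from "evidence of a real camera", which is what allows the combined score to skip this signal cleanly.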

20% weight

Error Level Analysis

Re-compresses the image and measures where pixels changed. Edited or synthetically generated regions show abnormal error patterns. Contributes 20% of the combined score.
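
A minimal sketch of the general technique, assuming the third-party Pillow library is installed: the image is re-saved as JPEG at a fixed quality, the per-pixel difference from the original is taken, and the differences are averaged over small tiles so that unusual regions stand out. The JPEG quality of 90, the 8-pixel tile size, and both function names are assumptions made for this example. The page does not specify the parameters AIVerify uses.

```python
import io


def error_levels(path: str, quality: int = 90) -> tuple[list[int], int, int]:
    """Re-save the image as JPEG and return per-pixel absolute differences."""
    # Pillow is third-party (pip install Pillow). It is imported here so the
    # pure-Python helper below can be used and tested without it.
    from PIL import Image, ImageChops

    original = Image.open(path).convert("RGB")
    buffer = io.BytesIO()
    original.save(buffer, "JPEG", quality=quality)  # ASSUMED quality setting
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")

    # Grayscale difference image: one byte per pixel, row by row.
    diff = ImageChops.difference(original, resaved).convert("L")
    return list(diff.tobytes()), diff.width, diff.height


def block_means(levels: list[int], width: int, height: int, block: int = 8) -> list[list[float]]:
    """Average the error level inside each block x block tile."""
    rows = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [
                levels[y * width + x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            row.append(sum(tile) / len(tile))
        rows.append(row)
    return rows


# Tiny 4x2 example with made-up difference values: the right-hand tile
# changed far more under re-compression than the left-hand one.
print(block_means([0, 0, 10, 10, 0, 0, 30, 10], width=4, height=2, block=2))
```

On a real file, the two steps chain as `block_means(*error_levels("photo.jpg"))`, where `photo.jpg` is a placeholder path. Tiles whose mean error sits far from the rest of the image are the candidates for closer inspection.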

Where It Struggles

No detector is perfect. Here’s where AIVerify is most likely to have trouble, so you can weigh results accordingly.

Images from new models are harder to catch. Every new generation of AI image models produces cleaner outputs that defeat detectors trained on older data. We improve continuously, but we will never claim 100% accuracy.
Heavy JPEG compression can affect ELA results. Images that have been saved and re-saved multiple times may show false positives on the ELA signal.
PNG files usually carry no EXIF data. This is normal: our metadata checker abstains rather than penalizing clean PNG files.
Screenshots of AI images may score differently than the original file due to additional compression introduced by the screenshot.

We publish our limits.
No other tool does.
