AI Content Detection Comparisons

Which AI Writers Pass AI Content Detection?

Given all the discussion around AI writing and AI content detectors, we decided to put the most popular tools to the test.

We wanted to see whether AI writers can produce content that passes as human-written across all AI detectors.

Using a list of 20 random blog post topics, we:

Generated long-form articles in the most popular AI writers: ChatGPT, Jasper, Article Forge, Copy.ai, and Writesonic.
Searched each topic in Google and took the article in the first result (Google clearly loves this content so it will be our human control group).

We then ran each set of articles through the five most popular AI content detectors (Originality.ai, OpenAI, the Hugging Face detector, the Writer AI detector, and GPTZero) and aggregated the results below.
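To make the size of the test concrete, here is a minimal sketch of the test grid described above. The source and detector names come from this article; the function name and the data structure are assumptions made for illustration only, not the tooling we actually used.

```python
from itertools import product

# Names taken from the article; the structure below is only an illustration.
SOURCES = ["Human (Google first result)", "Article Forge", "Jasper",
           "Copy.ai", "ChatGPT", "Writesonic"]
DETECTORS = ["Originality.ai", "OpenAI", "Hugging Face", "Writer", "GPTZero"]
NUM_TOPICS = 20


def build_test_grid(sources, detectors, num_topics):
    """Return one (source, topic_index, detector) tuple per detector run."""
    return list(product(sources, range(num_topics), detectors))


grid = build_test_grid(SOURCES, DETECTORS, NUM_TOPICS)
num_articles = len(SOURCES) * NUM_TOPICS

print(num_articles)  # 120 articles: 6 sources x 20 topics
print(len(grid))     # 600 detector runs: 120 articles x 5 detectors
```

Six sources across 20 topics gives the 120 articles referenced in the results, and each article is checked by all five detectors.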

Results

You can view all 120 articles and how we created them here.

What do the scores for each category mean?

Originality.ai: The average probability that the content was "Original" (created by a human).

Hugging Face Detector: The average probability that the content is "Real" (not generated by GPT models).

Writer AI Detector: The average probability that the content was created by a human.

OpenAI Detector: The percent of articles that OpenAI did not consider to be AI-generated.

GPTZero: The percent of articles that GPTZero thought were most likely or entirely human written.
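The definitions above use two kinds of aggregate: an average probability (Originality.ai, Hugging Face, Writer) and a percent of articles that pass (OpenAI, GPTZero). The sketch below shows how each could be computed. The function names, the label strings, and the sample values are hypothetical and are not results from this study.

```python
from statistics import mean


def average_human_probability(probabilities):
    """Mean of per-article 'human' probabilities, each between 0 and 1."""
    return mean(probabilities)


def pass_rate(labels, passing_labels):
    """Fraction of articles whose detector label counts as human-written."""
    return sum(label in passing_labels for label in labels) / len(labels)


# Hypothetical values for four articles, NOT data from the study.
example_probabilities = [0.5, 0.75, 1.0, 0.25]
example_labels = ["human", "ai", "human", "human"]

print(average_human_probability(example_probabilities))     # 0.625
print(pass_rate(example_labels, passing_labels={"human"}))  # 0.75
```

Because the two kinds of score measure different things, a 75% average probability and a 75% pass rate are not directly comparable.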

Note: We do not recommend relying on GPTZero. Although it rated Article Forge content the most human, it also reported that 70% of the human articles we tested contained at least some AI-generated content, which suggests that GPTZero is currently too unreliable to be trusted.

Takeaways

Human-Quality: Article Forge

Across all AI content detectors, Article Forge content passes as human-written at the same rate as actual human-written content that ranks first on Google. So you can safely use Article Forge content without worrying that it will be detected as AI-generated.

Almost Undetectable: Jasper

Jasper is a close second. Its content passes all AI detectors as human-written at almost the same rate as actual human-written content on Google, falling only slightly short with Originality.ai and GPTZero. So you can mostly use Jasper without worrying about AI detectors flagging the content as AI-generated.

Detected as AI: All other AI writing tools

The other AI content generators (Copy.ai, ChatGPT, and Writesonic) were easily detected by every AI content detector.

Therefore, you cannot safely use content generated by these three tools on your website or for your clients unless you heavily edit it by hand or use an AI evasion tool.

Wrapping up

At the end of the day, people want content that is human-readable. That should be a no-brainer: as humans, we want content that flows naturally, is easy to understand, and provides value.

AI writing has the potential to make any content creation process more efficient, but only if the content is human-readable.

The results of this test show that Article Forge and Jasper can produce content that is not only readable to humans but also appears human-written to machines.

So as long as your content provides value to your readers and is indistinguishable from human-written content, you will have nothing to worry about!

And if you want us to test more AI writers or AI content detectors, send us a message!

Extended Methodology

To give more context around the content used for this case study, we outlined our content creation processes and included the content datasets we tested below.

Human Ranking Articles:

We searched the 20 topics on Google and took the content from the first search result for each. We did this because Google clearly considers this content high quality, and we determined that each article was almost certainly written by a human.
Download the human content we tested here.

Article Forge Articles:

We created 750-word articles for each of the 20 topics, and in most cases we used the instructions field. Most importantly, we had the "Avoid AI Detection" setting turned ON. Otherwise, we used the default settings.
Download the Article Forge content we tested here.

Jasper.ai Articles:

We used the Blog Post Workflow to create articles about each of the 20 topics. We used the default settings and accepted the first generation for each prompt.
Download the Jasper content we tested here.

Copy.ai Articles:

We used the Blog Post Wizard to create articles about each of the 20 topics. We used the default settings and accepted the first generation for each prompt.
Download the Copy.ai content we tested here.

ChatGPT Articles:

We entered "write an article about" before each of the 20 topics to generate the ChatGPT content.
Download the ChatGPT content we tested here.

Writesonic Articles:

We used the AI Article Writer 4.0 with "Premium" content quality to create articles about each of the 20 topics. We used the default settings and accepted the first generation for each prompt.
Download the Writesonic content we tested here.

AI Detector Limitations

While AI content detectors are generally accurate, they do have limitations. As seen in the study above, no type of content will pass every tool 100% of the time. Every AI detector had false positives, where it incorrectly classified human content as AI-generated.
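A false positive rate can be estimated from the human control group as the share of known-human articles that a detector flags. The sketch below uses a hypothetical helper name and plugs in the GPTZero figure reported earlier: 70% of the 20 human articles corresponds to 14 flagged articles.

```python
def false_positive_rate(human_articles_flagged, human_articles_total):
    """Share of known-human articles that a detector flagged as AI-generated."""
    if human_articles_total <= 0:
        raise ValueError("human_articles_total must be positive")
    return human_articles_flagged / human_articles_total


# 70% of the 20 human control articles is 14 flagged articles.
print(false_positive_rate(14, 20))  # 0.7
```

With only 20 control articles per detector, a rate like this is a rough estimate rather than a precise measurement.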

But given the way Article Forge and Jasper are structured, their content tends to be indistinguishable from content written by humans. So their content may occasionally be flagged as AI-generated, but overall it will follow the same patterns as human-written content.

And in any edge cases, we recommend prioritizing the quality of a piece of content over the results of an AI content detector.

Finally, to give you a sense of the different AI detectors' limitations, we included their messaging below:

GPTZero Warning: