Which AI Writers Pass AI Content Detection?
Given all the discussion around AI writing and AI content detectors, we decided to put the most popular tools to the test.
We wanted to see whether AI writers can produce content that passes as human-written across all AI detectors.
Using a list of 20 random blog post topics, we:
- Generated long-form articles with the most popular AI writers: ChatGPT, Jasper, Article Forge, Copy.ai, and Writesonic.
- Searched each topic in Google and took the article in the first result (Google clearly loves this content, so it serves as our human control group).
We then ran each set of articles through the five most popular AI content detectors (Originality.ai, OpenAI, the Hugging Face detector, the Writer AI detector, and GPTZero) and aggregated the results below.
What do the scores for each category mean?
Originality.ai: The average probability that the content was "Original" (created by a human).
Hugging Face Detector: The average probability that the content is "Real" (not generated by GPT models).
Writer AI Detector: The average probability that the content was created by a human.
OpenAI Detector: The percentage of articles that OpenAI did not consider to be AI-generated.
GPTZero: The percentage of articles that GPTZero thought were most likely or entirely human-written.
Note: We do not recommend relying on GPTZero. Even though Article Forge content was rated the most human, GPTZero reported that 70% of the human articles we tested contained at least some AI-generated content, which indicates that GPTZero is currently too unreliable to be trusted.
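To make the two kinds of scores above concrete, here is a minimal sketch of how they could be aggregated. The function names and the sample values are hypothetical placeholders for illustration, not the data or scripts used in this test.

```python
from statistics import mean


def average_human_probability(scores):
    """Mean per-article probability (0.0 to 1.0) that the text is human-written.

    Matches the "average probability" style of score described for
    Originality.ai, the Hugging Face detector, and the Writer AI detector.
    """
    return mean(scores)


def percent_passing(labels, passing_labels):
    """Percentage of articles whose detector label counts as human.

    Matches the "percent of articles" style of score described for
    the OpenAI detector and GPTZero.
    """
    passed = sum(1 for label in labels if label in passing_labels)
    return 100.0 * passed / len(labels)


# Placeholder values for illustration only, NOT results from this study.
example_scores = [0.75, 0.5, 0.25, 1.0]
example_labels = ["human", "ai", "human", "human"]

print(average_human_probability(example_scores))   # 0.625
print(percent_passing(example_labels, {"human"}))  # 75.0
```

Because the two score types measure different things (a mean probability versus a share of articles), they are best compared within a detector rather than across detectors.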
Human-Quality: Article Forge
Across all AI content detectors, Article Forge content passes as human-written at the same rate as actual human-written content that ranks first on Google. So you can use Article Forge content without worrying that it is more likely to be flagged as AI-generated than human-written content.
Almost Undetectable: Jasper
Jasper is a close second. Its content passes all AI detectors as human-written at almost the same rate as actual human-written content on Google, falling only slightly short with Originality.ai and GPTZero. So you can mostly use Jasper without worrying about AI detectors flagging the content as AI-generated.
Detected as AI: All other AI writing tools
All other AI content generators (Copy.ai, ChatGPT, and Writesonic) were easily detected by every AI content detector.
Therefore, you cannot safely use content generated by these three tools on your website or for your clients unless you heavily edit the content manually or use an AI evasion tool.
At the end of the day, people want content that is human readable. That should be a no-brainer: as humans, we want content that flows naturally, is easy to understand, and provides us value.
AI writing has the potential to make any content creation process more efficient, but that is only possible if the content is human readable.
The results of this test show that Article Forge and Jasper can produce content that is not only readable to humans but also appears human-written to machines.
So as long as your content provides value to your readers and is indistinguishable from human-written content, you will have nothing to worry about!
And if you want us to test more AI writers or AI content detectors, send us a message!
To give more context around the content used for this case study, we outlined our content creation processes and included the content datasets we tested below.
Human Ranking Articles:
Download the human content we tested here.
Article Forge Articles:
Download the Article Forge content we tested here.
Download the Jasper content we tested here.
Download the Copy.ai content we tested here.
Download the ChatGPT content we tested here.
Download the Writesonic content we tested here.
AI Detector Limitations
While AI content detectors are generally accurate, they do have limitations. As seen in the study above, no type of content will pass every tool 100% of the time. Every AI detector had false positives, where it incorrectly classified human content as AI-generated.
But given the way Article Forge and Jasper are structured, their content tends to be indistinguishable from content written by humans. So, at times, their content may be flagged as AI-generated, but overall, it will follow the same patterns as human-written content.
And in any edge cases, we recommend prioritizing the quality of a piece of content over the results of an AI content detector.
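The false positives described above can be expressed as a simple rate: the share of known human-written articles that a detector flags as AI. The sketch below is illustrative only; the function name and the sample flags are hypothetical and do not come from our datasets.

```python
def false_positive_rate(human_article_flags):
    """Share (0.0 to 1.0) of known human-written articles flagged as AI.

    Each entry is True if the detector flagged that human article as
    AI-generated, and False otherwise.
    """
    if not human_article_flags:
        raise ValueError("need at least one human-written article")
    # True counts as 1 and False as 0, so sum() gives the number of flags.
    return sum(human_article_flags) / len(human_article_flags)


# Hypothetical flags for four human-written articles, NOT study data.
flags = [True, False, False, True]
print(false_positive_rate(flags))  # 0.5
```

A detector with a high rate on the human control group, as in the GPTZero note earlier, gives little usable signal when it flags an AI writer's content.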
Finally, to give you a sense of the different AI detectors' limitations, we included their messaging below:
OpenAI Detector Limitations:
Originality.ai Description and Limitations: