
AI is growing and changing at such scale and speed that keeping up feels almost impossible. Every day, new tools are released to the market, and understanding which are reliable and worth using is a difficult and time-consuming process. It requires a touch of trial and error - something not everyone has the time or budget for.
Anyone working in on-page SEO in 2025 will no doubt be familiar with AI detection tools. Some people consider them a new essential to the content creation process, whilst others consider them to be an absolute farce and a waste of everyone's precious time.
This case study explores the authority and accuracy of AI content detection tools and evaluates how effectively they handle content generated by various GPT models—as well as by humans. Our key question is this: Are AI detectors worth using?
How AI Detectors Work
Before we dive in, it’s important to consider how AI detection tools work.
According to Scribbr:
“AI detectors try to find text that looks like it was generated by an AI writing tool. They do this by measuring specific characteristics of the text (perplexity and burstiness) – not by comparing it to a database.”
In short, these tools offer probabilities, not certainties, so bear that in mind as we examine the results.
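The two characteristics named in the quote can be illustrated with a toy calculation. This is a minimal sketch, not how any of the tested detectors work: real tools measure perplexity with a trained language model, whereas the hypothetical `unigram_perplexity` below only uses word frequencies from the text itself, and `burstiness` is assumed here to mean variation in sentence length.

```python
import math
import re
from collections import Counter


def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! or ? and return the word count of each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Toy burstiness: standard deviation of sentence length divided by the mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(variance) / mean


def unigram_perplexity(text: str) -> float:
    """Toy perplexity: exp of the average negative log-probability per word,
    using word frequencies from the text itself as a stand-in 'model'."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    log_prob = sum(math.log(counts[w] / total) for w in words)
    return math.exp(-log_prob / total)
```

Text with evenly sized sentences scores a burstiness of zero, while a mix of short and long sentences scores higher. Low values on both measures are the kind of signal a detector may read as machine-like.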
AI Detection Tool Comparison for 2025
In order to establish how good AI detectors are, we need to test a number of different detection tools against a number of different generators.
The AI detection tools we will be testing are:
- Sapling
- ZeroGPT
- Copyleaks
- Originality.ai
- ChatGPT (plug-in)
- Phrasly
- Writer
These tools were chosen for their popularity (they are widely considered some of the best AI detectors) as well as their usability and accessibility. Most are free to use or can be used on a small budget, and all are accessible by simply creating a standard account.
Originally, we also intended to test on-page.ai and Brandwell.ai. However, both had technical issues and limitations that made it difficult to compare them directly with the other results. We hope to include these tools when we repeat this test in the future.
The AI Generation tools we will be testing are:
- ChatGPT 4.5
- ChatGPT 4.0
- Gemini
- Notion
- AI Writer
- JasperAI
- DeepSeek
We felt it essential to test many different generation tools, as well as detection tools, as different AI tools can produce vastly different results from the same prompt. To get as accurate a result as possible, we needed to use a wider range of data than only one GPT tool could provide.
We will also be using articles written entirely by two different human writers (old school, we know). This is so we can confirm the results are legitimately accurate, and that the tools aren’t just flagging everything they scan as AI.
AI vs Human Content Generation
To confirm which is the most accurate AI detector, the first step of this test was to create the articles to be tested. We gave all of the generation tools listed above the same series of prompts, and asked two separate (human) writers to write the same articles.
Across the 6 AI tools and 2 human writers, we created 24 articles to test, focused around 3 subjects. These are:
- Your Ultimate Guide to Online Casinos in 2025
- Your Ultimate Guide to Sports Betting in 2025
- Your Ultimate Guide to Online Gaming in 2025
As this is the niche we work within, we chose to create articles focused on the iGaming space. We also chose this niche as there is a lot of iGaming content out there online, and so it’s an industry with a lot of potential data points for both LLM tools and detection tools to work from.
We also wanted to create one that was more generic, so we could confirm whether iGaming articles are more likely to flag, which is why we also went for a purely gaming focused article.
AI Article Generation
Once we had selected the AI platforms we wanted to test (both generation and detection tools), we then went through the long and tedious process of getting all of these articles created.
Prompting
How generic a prompt is can impact how generic and ‘obviously AI’ an article will come out. We chose to use a slightly more detailed prompt to make the articles less generic, and to make the AI Detection tools work harder.
Here is the prompt we used for generating our test articles:
“Write me a 1000 word article titled “[See article titles above]”. Use wording and language you don't normally use to keep it unique and sounding as 'human as possible'. Talk about the iGaming Industry from an expert perspective, but do not mention specific casinos or operators.”
We wanted to specify to the AI tools to use uncommon wording, as AI tools frequently use a series of generic AI terminology and wording that is easy for both detectors - and people - to spot. It was important to give the AI tools a chance to create something slightly different, so that we could determine the detection tools' accuracy better. It was also an interesting insight to see how well the AI tools took this instruction on board (Spoiler: Some tools did not.)
Human written articles
As we used simple prompts to generate our AI articles, we had to use a similar process for our human-written articles.
The task we created was as follows:
“We need a writer who can create 3 x 1000 word articles for a case study that are completely human written. To be clear:
AI cannot be used for any part of the process. Not research, not structure, not writing, not proofreading. 100% original human content.
Here are the titles:
Your Ultimate Guide to Online Casinos in 2025
Your Ultimate Guide to Sports Betting in 2025
Your Ultimate Guide to Online Gaming in 2025
Please do not mention any specific online casinos, sportsbooks or casino/sportsbook providers.”
We wanted to give the writers the freedom to hit this brief as they wish, very similar to how we have prompted our AI tools. Some basic guidelines, but free rein from there. We felt this was the only fair way to be able to directly compare the content.
We gave the writers one week to complete this project, to allow them time to research and produce quality work.
The Most Popular GPT Tools in 2025
We know that anyone reading this article is interested in using AI tools. So, before we go into the results of the tests and directly compare, we want to give some context regarding the AI generation tools we used, including how well they follow prompts and how easy they are to use.
We want to specify at this point that this case study has not been sponsored or requested by any external companies, including GPT companies or AI detectors. It was done purely out of curiosity and has no agenda behind it.
ChatGPT 4.0
Link: https://chatgpt.com/
ChatGPT 4.0 is the most commonly used GPT tool on the market. OpenAI brought LLMs into the mainstream, and version 4.0 is currently the most widely used tool out there.
- Cost: Free or Premium (depending on usage).
- Accessibility: Can trial with no log-in, but requires an online account for regular usage.
- Word Count: 250 - 350 over the requested amount.
- Tone of voice: Formal, informative, generic.
- Additional notes: The text was broken up nicely, though perhaps too many H2s. A lot of ‘—’ em dashes, which are very common in AI writing and tend to flag.
ChatGPT 4.5
Link: https://chatgpt.com/
The new and improved version of ChatGPT, currently only available through a pro account. We wanted to test this new version along with the original to see how directly they compare to each other both in terms of article quality and also detectability.
- Cost: GPT Plus or Pro account required ($20 - $200 per month depending on usage).
- Accessibility: Requires a logged-in account, still currently in beta stage.
- Word Count: 150 - 250 under the word count.
- Tone of voice: Formal, informative, friendly, descriptive.
- Additional notes: Nicely broken up, fewer H2s than version 4.0, which helps it read more naturally. Still a lot of ‘—’ em dashes, which will increase flagging potential.
Gemini 2.0
Link: https://gemini.google.com/app
Google’s own AI tool, available not just as your standard GPT tool but also as your own ‘AI Assistant’ if used via mobile. This tool has become increasingly popular as an alternative to ChatGPT, though the format of usage is very similar, making it an unchallenging transition for those looking to try out something new. We used version 2.0 which is available through a paid account.
- Cost: Basic free access, although there is a premium version at £18.99 per month for more advanced capabilities.
- Accessibility: Easy to use, link to your Gmail and you’re straight in.
- Word count: 200 - 300 under the word count.
- Tone of voice: Humorous, casual, personal.
- Additional notes: Speaks directly to the audience (you). Good article pacing overall. Mixed results in format: one article had no H2s at all. There were, however, a lot of colon-heavy H2s in the others (e.g. The Landscape: A Shifting Sands Situation) which, while they read well, also tend to flag high for AI.
AI Writer
Link: https://ai-writer.com/
As the name suggests, this is an AI tool specifically designed for writing. Whilst it looks more visually appealing and user-friendly (requiring less prompt engineering, for example), it’s hard to discern what makes this tool so different to ChatGPT or any other mainstream LLM.
- Cost: You can use this tool for free to create a small number of articles, but if you need longer word counts or more in-depth requirements, you’re better off going for a paid account. Payment plans start at $24/month.
- Accessibility: Whilst the homepage asks you for a prompt, you actually need to connect your email address to access the content you’ve asked it to generate.
- Word count: Two articles were significantly above the requested word count (around 300 words), whilst one was 200 less.
- Tone of voice: Formal, generic, informative.
- Additional notes: The language is very generic and ‘AI’, with a lot of buzzwords. Not many H2s, which means there are big blocks of text. However, it is informative and contains lots of relevant information.
Jasper AI
Link: https://www.jasper.ai/
JasperAI is more than a standard GPT tool. Within their site they have a great number of different ‘apps’ you can use (which are essentially online tools), for everything from content creation to AI image editors, social media support, and AI brief creators. This tool is a great option for those working in marketing, or creating campaigns within the digital space.
- Cost: You can try this on a trial for 7 days, but the minimum payment after that is $49 per month.
- Accessibility: You need an account and to provide payment details in order to access.
- Word count: 200 - 350 over requested word count.
- Tone of voice: Formal, personal, generic.
- Additional notes: Nice spacing throughout the article and quality H2s. However, some of the language used is very ‘AI’ and easy to distinguish from a human writer.
DeepSeek
Link: https://www.deepseek.com
One of the big new players on the market, DeepSeek has become a significant tool for those utilising AI. It’s easy to use and is free for basic use. You can trial it without logging in, but an account gives you access to come back to previous projects or chats.
Much like ChatGPT, you are given an almost immediate response to your request, and you can give it feedback to improve the work provided. It responded well to our prompts, although exporting the articles was surprisingly challenging. DeepSeek does not (currently) create downloadable documents, so you need to copy and paste the content onto a doc. This in itself isn’t an issue, but we found in some cases it was impossible to paste it over without breaking the article format, meaning we had to spend some time re-formatting the articles before we could test them.
- Cost: Unlike ChatGPT, DeepSeek charges based on the number of tokens you send and receive, with prices starting at $0.035 per million tokens. Costs vary depending on whether your input is cached, which model you use, and the time of day.
- Accessibility: You can trial the tool for free with or without an account. However, usage is limited without payment.
- Word count: Varied accuracy. 50 - 350 words below word count.
- Tone of voice: Friendly, short and snappy, personable but professional.
- Additional notes: It might be prompt-related, but it uses a lot of emojis. The text is broken up with many H2/3s and bullet points, sometimes to a point where it’s hard to read. We noticed that DeepSeek's outputs varied significantly, even though the prompts were nearly identical. This inconsistency is likely due to the model's temperature settings, which cannot be adjusted in the free version of the tool.
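For readers unfamiliar with temperature, the general idea can be sketched in a few lines. This is a generic illustration of temperature scaling, not DeepSeek's actual implementation; the function name `softmax_with_temperature` and the example scores are hypothetical.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw model scores into probabilities; a higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.5)  # low temperature: top word dominates
hot = softmax_with_temperature(logits, 2.0)   # high temperature: choices even out
```

At a low temperature the most likely word is picked almost every time, so repeated prompts give similar text. At a higher temperature the less likely words are chosen more often, which is one plausible reason near-identical prompts can produce noticeably different articles.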
AI Detection Tools Comparison
Let’s now assess each AI detection tool individually, evaluating them based on accuracy, ease of use, and consistency across both AI-generated and human-written articles. Tools are listed below from least to most effective based on our test results.
Last Place: Writer
Coming in last place we have Writer, a free online tool that requires no log-in.
Link: https://writer.com/ai-content-detector/
Writer is exceptionally easy to use. You can either copy and paste text in, or just put in a link to a webpage and it will scan all of the core text. It does not, however, scan online documents such as GDocs. Writer also has a 5000-word limit per article, which is fine for most use cases though might be limiting for more in-depth work.
It had the lowest accuracy of all of the tools we tested. It stated, with alarming confidence, that not a single one of our articles was AI-generated. As the AI detector tool is only one of many services that Writer provides, we suspect that this tool has perhaps simply had less investment and development than the other products available from them.
It’s also worth pointing out that while it might look like Writer did well at spotting human-written articles, that’s only because it called everything human, so you’ve got to take those ‘human’ results with a pretty big pile of salt.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 20% AI | 16% AI | 17% AI
ChatGPT 4.5 | 5% AI | 8% AI | 7% AI
Notion | 16% AI | 16% AI | 13% AI
Gemini 2.0 | 23% AI | 21% AI | 24% AI
AI-Writer.com | 16% AI | 10% AI | 4% AI
JasperAI | 7% AI | 15% AI | 6% AI
DeepSeek | 11% AI | 12% AI | 14% AI
Human | 0% AI | 0% AI | 0% AI
Human | 4% AI | 2% AI | 0% AI
Overall, Writer AI achieved an average AI detection accuracy of 48.25% across all tested tools and topics.
Analysis
- Extremely Easy to Use: No login required, supports pasted text and URLs, and handles up to 5,000 words per scan.
- Lacks Depth: Doesn’t support live documents like Google Docs, which limits flexibility for some users.
- Lowest Accuracy in the Study: Flagged all content, including AI-generated articles, as low AI, with scores ranging from just 4% to 24%.
- Likely Underdeveloped: As one of many services on the Writer platform, the AI detector appears under-invested compared to others.
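Why a detector that calls everything human can look good on human samples becomes clearer when the scores are turned into yes/no flags. The sketch below uses the scores from the Writer results table; the 50% cut-off is an assumption made for illustration, not a threshold stated by Writer or used in the study's own accuracy figure.

```python
# Scores copied from the Writer results table (percent AI per article).
WRITER_AI_SCORES = [20, 16, 17, 5, 8, 7, 16, 16, 13, 23, 21, 24,
                    16, 10, 4, 7, 15, 6, 11, 12, 14]
WRITER_HUMAN_SCORES = [0, 0, 0, 4, 2, 0]

THRESHOLD = 50  # assumption: a score of 50% or more counts as "flagged as AI"


def flag_rate(scores: list[float], threshold: float = THRESHOLD) -> float:
    """Share of articles whose score meets or exceeds the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


ai_caught = flag_rate(WRITER_AI_SCORES)          # true-positive rate
humans_flagged = flag_rate(WRITER_HUMAN_SCORES)  # false-positive rate
print(f"AI articles flagged: {ai_caught:.0%}, human articles flagged: {humans_flagged:.0%}")
```

Under that assumed cut-off, no human article is flagged, but no AI article is flagged either. A zero false-positive rate only means something when it is paired with a healthy true-positive rate.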
Sixth Place: Custom GPT Tool
There are so many custom GPTs now on the market that we knew we had to include one in our study. We found the ‘AI Content Detector by Ratan Jumar Sarkar’ to be a great option due to its sophisticated design and easy usability.
Link: https://chatgpt.com/g/g-NcgzdHmZc-ai-content-detector
Much like any GPT tool, it does require some basic prompting. However, because of this, it makes for a more flexible tool than the others. Instead of just receiving a set percentage with no basis, you can ask for the result you want. Now this could be a percentage, or a list of evidence, article breakdowns, or even a full report. You can also ask for advice on how to ‘humanise articles’ and reduce scores. You can also ask it to re-write sections to reduce the score, which seems counter-productive but actually seems to work in some cases.
The other additional bonus is that it’s also free providing you have a GPT Pro account! This makes it much more accessible for those who use AI but don’t have the budget to be consistently paying for credits.
It’s also good to remember that, like every other GPT tool, it works on a ‘token system’, meaning the longer one chain of conversation goes on, the more it will slowly lose comprehension. This means that you are wise to start a new chat every couple of detections, to reduce the chance of the AI getting confused or even hallucinating.
Despite all of this, however, we did not find the tool to be accurate. At first, we thought we were on the right track, as it flagged almost all of the AI articles as high. However, it also flagged all of our human articles as AI, meaning that, from the results we can see, it seems to flag most work provided to it as AI.
That being said, we still think this is a worthy tool, especially for those who have prompt engineering experience and can guide the tool to provide more accurate results.
It’s also worth noting again that whilst the table may make the tool look accurate at first, the human results flagged as much as the AI articles themselves. The tool struggles to differentiate between AI and human-written content, and therefore the results seem based on the assumption that everything fed into it must be at least partially AI-generated.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 87.2% AI | 85% AI | 87.2% AI
ChatGPT 4.5 | 92% AI | 85% AI | 90% AI
Notion | 85% AI | 92% AI | 94% AI
Gemini 2.0 | 89% AI | 93% AI | 91% AI
AI-Writer.com | 96% AI | 95% AI | 94% AI
JasperAI | 93% AI | 92% AI | 94% AI
DeepSeek | 88% AI | 91% AI | 89% AI
Human | 90% AI | 92% AI | 91% AI
Human | 83% AI | 85% AI | 87% AI
Overall, ChatGPT Plugin achieved an average AI detection accuracy of 55.50% across all tested tools and topics.
Analysis
- Highly Flexible: You can prompt it for percentages, breakdowns, rewrites, or full reports, which is ideal for users comfortable with prompt engineering.
- Free with GPT Pro: No extra cost beyond a GPT Pro subscription, making it an accessible option.
- Bias Toward AI: While it flagged most AI content correctly, it also misclassified both human writers as 90% and 83% AI, suggesting it leans heavily toward AI detection.
- Best Used with Guidance: Accuracy is questionable, but the tool can still be useful when directed with specific, well-structured prompts.
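One simple way to see the bias toward AI is to compare the average score given to AI articles with the average score given to human articles. The sketch below uses the scores from the custom GPT results table; the `separation` helper is a hypothetical illustration and is not the accuracy measure used in this study.

```python
from statistics import mean

# Scores copied from the custom GPT results table (percent AI per article).
GPT_AI_SCORES = [87.2, 85, 87.2, 92, 85, 90, 85, 92, 94, 89, 93, 91,
                 96, 95, 94, 93, 92, 94, 88, 91, 89]
GPT_HUMAN_SCORES = [90, 92, 91, 83, 85, 87]


def separation(ai_scores: list[float], human_scores: list[float]) -> float:
    """Mean AI score minus mean human score, in percentage points."""
    return mean(ai_scores) - mean(human_scores)


gap = separation(GPT_AI_SCORES, GPT_HUMAN_SCORES)
print(f"Mean AI: {mean(GPT_AI_SCORES):.1f}, "
      f"mean human: {mean(GPT_HUMAN_SCORES):.1f}, gap: {gap:.1f}")
```

The gap comes out at roughly 2.6 percentage points, so the tool scores human writing almost exactly as it scores AI writing. A detector that separates the two well would show a gap close to 100.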
Fifth Place: Sapling
Link: https://sapling.ai/ai-content-detector
Sapling’s AI detection tool is not its key product, and so it’s understandable that it might not be the most accurate on the market. The tool likely hasn’t been invested in as heavily as its competitors. Instead, Sapling is an API/SDK toolkit designed to help teams integrate language model-powered functionality into their business applications. This means that the AI tool is only one of a number of tools they offer for the market.
Whilst the ‘jack of all trades, master of none’ service they provide is perfect for companies looking to integrate AI at scale, for the simple process of AI detection, it unfortunately doesn't quite do the job.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 100% AI | 100% AI | 100% AI
ChatGPT 4.5 | 100% AI | 100% AI | 100% AI
Notion | 100% AI | 65% AI | 65.1% AI
Gemini 2.0 | 100% AI | 68.1% AI | 61.3% AI
AI-Writer.com | 100% AI | 100% AI | 100% AI
JasperAI | 99% AI | 98.9% AI | 100% AI
DeepSeek | 75.5% AI | 22.1% AI | 100% AI
Human | 45% AI | 47.7% AI | 99.5% AI
Human | 39.8% AI | 24.1% AI | 64.7% AI
Overall, Sapling achieved an average AI detection accuracy of 72.00% across all tested tools and topics.
Analysis
- Lower Accuracy: With a 72.00% average detection accuracy, Sapling underperforms compared to the higher-ranked tools in this study.
- Notably Higher False Positives: Human articles scored between 24.1% and 99.5% AI, among the highest in this sample set.
- Weaker Detection on Some AI Tools: Several Notion, Gemini 2.0 and DeepSeek articles scored well below 100% AI, with one DeepSeek article dropping to 22.1%.
- Caution Advised: Results suggest that Sapling may struggle with borderline or hybrid content; manual verification is recommended.
- Built for Integration, Not Detection: As part of a broader API offering, AI detection doesn’t appear to be Sapling’s core focus.
Fourth Place: Phrasly
Just missing out on the top three, Phrasly, an easy-to-use but lesser-known tool, provided mixed results in our study.
Link: https://phrasly.ai/ai-detector
Whilst the website might claim 99.8% accuracy, our results show otherwise. Here we have yet another free tool, which is easily accessible, although it seems less reliable in terms of accurate results. One benefit, however, is that Phrasly allows you to check up to 2000 words per scan for free.
Another interesting quirk of Phrasly.ai is that it seems to partially base its results on a combination of other AI detectors! At the bottom of your scan you are given ‘third party AI Scores’ to help back up the results.
"Phrasly uses AI technology to predict how likely other top AI detectors are to think a piece of text was written by AI. We train our models on similar text used by leading detection platforms, so our predictions closely match their results."
What’s interesting about this is that they claim on their website to ‘boast the highest accuracy rate in the industry’, and yet then go on to say that they use other top AI detectors to predict the accuracy of their own tool.
Whilst we think it’s a smart idea to pull numerous data together from other tools, it’s still maybe not a wise statement to claim to be the top in the industry, whilst then admitting to using competitors to back up your work. Another interesting quirk is that the result states ‘Your article is likely written by AI’ on all results that aren’t 0% AI, which may confuse or mislead some users.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 95% AI | 92% AI | 89% AI
ChatGPT 4.5 | 93% AI | 87% AI | 95% AI
Notion | 79% AI | 54% AI | 15% AI
Gemini 2.0 | 100% AI | 91% AI | 94% AI
AI-Writer.com | 90% AI | 92% AI | 85% AI
JasperAI | 70% AI | 91% AI | 94% AI
DeepSeek | 46% AI | 61% AI | 66% AI
Human | 70% AI | 91% AI | 74% AI
Human | 0% AI | 0% AI | 0% AI
Overall, Phrasly achieved an average AI detection accuracy of 82.05% across all tested tools and topics.
Analysis
- Mixed Accuracy: Phrasly scored most AI-generated articles at 85% AI or higher, but averaged 82.05% detection accuracy overall.
- Uneven False Positives: One human writer scored 0% AI on every article, while the other scored between 70% and 91% AI.
- Inconsistent Across Generators: Notion articles ranged from 15% to 79% AI, and DeepSeek articles from 46% to 66% AI.
- Model Independence Questioned: While they claim the highest accuracy in the industry, they also state their model is trained on outputs from other leading detectors, adding confusion about authenticity in their branding.
Third Place: ZeroGPT
Another free tool makes it to the top! ZeroGPT secures its place in the top three with its (mostly) accurate results and easy-to-use design.
Link: https://www.zerogpt.com/
Similar to the previous tools, ZeroGPT gives you a percentage of confidence in its results and highlights flagging sections to help you amend articles where necessary. There is a limit of 15000 credits per month without setting up a paying account, which is significant when you consider we were able to scan all 24 of our articles before hitting the limit.
If you want extra credits you can set up a paid account, which starts from as little as $7.99, making it a highly accurate and affordable tool.
ZeroGPT also offers other services for free, such as a plagiarism checker, a translator, a grammar check and a summariser, although these have not been tested as part of this study.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 94.6% AI | 92.09% AI | 89.36% AI
ChatGPT 4.5 | 93.93% AI | 87.01% AI | 95.42% AI
Notion | 78.49% AI | 54.75% AI | 15.97% AI
Gemini 2.0 | 100% AI | 91.61% AI | 94.58% AI
AI-Writer.com | 90.94% AI | 92.91% AI | 85.57% AI
JasperAI | 70.71% AI | 91.93% AI | 74.87% AI
DeepSeek | 46.8% AI | 61.41% AI | 66.25% AI
Human | 10.07% AI | 0% AI | 13.58% AI
Human | 4.98% AI | 0% AI | 0% AI
Overall, ZeroGPT achieved an average AI detection accuracy of 94.78% across all tested tools and topics, with a small number of false positives in human-written samples.
Analysis
- Moderate Accuracy: At 94.78% average accuracy, ZeroGPT is slightly less precise than the top performers.
- Minor False Positives: Human content scored up to 13.58% AI, which may raise usability concerns for publishers.
- Inconsistent Precision: AI detection fluctuated between generators and topics (Notion articles ranged from 15.97% to 78.49% AI), suggesting room for improvement in topic-specific models.
- Decent Overall Utility: Suitable for general AI detection, but may not be ideal where minimising false flags is critical.
Second Place: Originality.ai
Coming in at a very close second, Originality.ai once again wins the prize for being considered one of the most accurate tools on the market.
Link: https://originality.ai/
The tool delivered confident ‘100% AI’ results in nearly every case, showing strong reliability. It clearly distinguished between AI-generated and human-written content, with no misclassifications on the human samples and unwavering confidence in flagging AI.
Another great thing about Originality.ai is that you can flag false positives in AI detection and give the team direct feedback to help them improve the tool. Originality offers light, turbo and multi-language options for its AI detection. Finally, Originality also offers a Chrome Extension version, which makes using the tool as you work easier than ever.
It’s important to remember that even the most accurate tools can never be 100% accurate, and sometimes false results can occur. It’s important to take these tools as guidelines and not complete guarantees.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 100% AI | 100% AI | 100% AI
ChatGPT 4.5 | 100% AI | 100% AI | 100% AI
Notion | 100% AI | 100% AI | 100% AI
Gemini 2.0 | 100% AI | 100% AI | 100% AI
AI-Writer.com | 100% AI | 100% AI | 100% AI
JasperAI | 100% AI | 99% AI | 100% AI
DeepSeek | 100% AI | 100% AI | 100% AI
Human | 0% AI | 0% AI | 0% AI
Human | 3% AI | 2% AI | 1% AI
Overall, Originality.ai achieved an average AI detection accuracy of 99.50% across all tested tools and topics, making it one of the most accurate tools that we tested.
Analysis
- Accuracy: Originality.ai consistently flagged AI-generated content with near-perfect accuracy.
- Consistency: All seven AI generators scored 99–100% AI across all articles, suggesting Originality.ai is highly aggressive in its detection.
- No False Negatives: There were no instances of Originality.ai incorrectly marking AI content as human.
- Potential Over-flagging: While effective for AI detection, the tool may raise concerns about false positives if applied to advanced human writers or hybrid content.
First Place: Copyleaks
Coming in first place is Copyleaks, a partially-free tool that offers surprisingly accurate results.
Link: https://copyleaks.com/ai-content-detector
If you're surprised, I won't lie - so were we! Copyleaks gives confident percentage results and was able to distinguish between our human writers and the AI-generated articles. Sometimes Copyleaks will also provide sources to help support the results, either from their own website or from external sources online. Copyleaks also has a plagiarism detector, though that has not been tested for this study.
There are, however, limits in how often you can use the tool without logging in and paying for the tool. You can of course have a paid account which offers more advanced features and higher usage limits. Their minimum subscription starts at $13.99 per month.
Along with the AI detector, Copyleaks also offers a plagiarism checker and a writing assistant, and once again there is a browser option for those who want to AI check as they work. This combination of services may make Copyleaks the best free AI detection tool currently available on the market.
Content Tool | Online Casinos | Sports Betting | Online Gaming
ChatGPT 4.0 | 100% AI | 98.1% AI | 100% AI
ChatGPT 4.5 | 100% AI | 100% AI | 98.5% AI
Notion | 100% AI | 98.3% AI | 100% AI
Gemini 2.0 | 100% AI | 100% AI | 98% AI
AI-Writer.com | 100% AI | 100% AI | 100% AI
JasperAI | 100% AI | 100% AI | 100% AI
DeepSeek | 98.5% AI | 100% AI | 100% AI
Human | 0% AI | 0% AI | 0% AI
Human | 0% AI | 0% AI | 0% AI
Overall, Copyleaks achieved an average AI detection accuracy of 99.52% across all tested tools and topics, with no false positives in human-written samples.
Analysis
- Accuracy: Copyleaks achieved an impressive 99.52% average accuracy, correctly flagging nearly all AI-generated content.
- Low False Negatives: AI content was consistently detected, with only slight dips below 100%.
- No False Positives: Human-written content was flagged 0% AI across the board, showcasing the tool's strong precision.
- Stable Across Topics: Performance was consistent across online casinos, sports betting, and online gaming.
- Sharp Yet Sensible: A highly accurate tool with minimal overreach, reliable for AI detection with rare need for manual review.
The Top AI Detectors in 2025
And so at last we see the results in full. Our tools gave vastly different results, but it’s only when laid out next to each other that we can see the impact of this in full.
Our tests revealed clear differences in AI detection performance. Copyleaks and Originality.ai stood out, providing exceptional accuracy with minimal errors. ZeroGPT and Phrasly followed behind, offering mostly reliable results but with occasional false positives.
In contrast, tools like Sapling, Custom GPT, and especially Writer, struggled significantly, often misclassifying human-written content or failing to detect AI-generated texts effectively.
These results highlight the importance of carefully selecting AI detection tools and reinforce their role as valuable guides rather than definitive judges of content authenticity.
What Can Impact AI Detection Accuracy?
Before concluding, we want to acknowledge that there may be many additional factors that could be impacting the result of an AI detector. It's essential to recognise that AI detection is a constantly evolving field, influenced by numerous external and contextual factors, some of which can significantly alter outcomes. Understanding and accounting for these variables can further refine the accuracy and reliability of AI detection tools. Below we have listed a few examples for consideration.
Do VPNs Impact AI Detection Results?
There have been some theories that using a VPN can impact the results of an AI detection score; however, we could not find any reliable sources on this.
We then tested this theory and found that a VPN had no impact on any of our detection results.
Can Scanning on Different Days Impact AI Detectors?
Throughout our experience of working with AI detectors regularly, we have found on occasion that some tools tend to give different results on different days that we test them.
A theory is that as these tools update they may gain more (or less!) accuracy. If an AI detection company updates their tool between your different scans, you are likely to get a different result. However, this is not the only time results can change. Thankfully, when we spot checked for this test we did not see any impact on results despite testing on several different dates.
We did some further research on this theory and found that some people had experienced similar issues with inconsistent AI detection results, though no concrete specifications outside of tool updates have been confirmed.
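A spot check like this can be automated by recording each article's score per scan date and reporting any article whose score moves by more than a chosen tolerance. This is a minimal sketch: the `find_drift` helper, the 5-point tolerance and the example scores are all hypothetical and are not results from this study.

```python
def find_drift(scans: dict[str, dict[str, float]], tolerance: float = 5.0) -> dict[str, float]:
    """Return articles whose highest and lowest score differ by more than `tolerance` points.

    `scans` maps a scan date to {article name: percent-AI score}.
    """
    by_article: dict[str, list[float]] = {}
    for scores in scans.values():
        for article, score in scores.items():
            by_article.setdefault(article, []).append(score)
    return {
        article: max(values) - min(values)
        for article, values in by_article.items()
        if max(values) - min(values) > tolerance
    }


# Hypothetical scores for illustration only; these are not results from this study.
example_scans = {
    "day 1": {"article A": 92.0, "article B": 10.0},
    "day 2": {"article A": 93.5, "article B": 31.0},
}
drifted = find_drift(example_scans)
print(drifted)
```

In the made-up example, article A moves by 1.5 points and is ignored, while article B moves by 21 points and is reported for a manual look.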
Can Templates and Spacing Impact AI Detectors?
Whilst not tested in this study, we have found that article templates and irregular spacing can have some impact on AI detection results. We did not test this because our AI articles were so widely varied and ‘untemplated’.
However, from our many years of experience working in SEO content, we know that repetitive templates commonly found online can affect a result, as the detection tool may flag content as very similar to material it has already seen online. Keep your content varied and unique to avoid this.
Spacing can also occasionally have a slight impact on AI results. When pasting articles into tools, it’s good practice to remove or add spaces so that they fall where they naturally belong, which reduces this potential impact. Please note that we have only seen this consistently with Originality.ai. We hope to test this more thoroughly in the future.
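As a rough illustration of the spacing clean-up described above, the Python sketch below tidies an article before it is pasted into a detector. The function name `normalise_spacing` and the specific rules (collapsing repeated spaces, trimming lines, limiting blank lines) are our own assumptions about what "organic" spacing means. We have not tested whether this exact clean-up changes any detector's score.

```python
import re


def normalise_spacing(text: str) -> str:
    """Tidy whitespace in an article before pasting it into a detector."""
    # Use one kind of line ending throughout.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Turn non-breaking spaces and tabs into ordinary spaces.
    text = text.replace("\u00a0", " ").replace("\t", " ")
    # Collapse runs of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    # Remove stray whitespace at the start and end of every line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Allow at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    messy = "Hello   world \n\n\n\nNext\u00a0line\t here\r\n"
    print(normalise_spacing(messy))
```

The clean-up only touches whitespace, so the wording of the article is left exactly as written.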
Does Subject Matter Impact AI Detection Results?
We intentionally chose three subjects to test: two within our iGaming niche and one general gaming article, to see whether the industry tie had any impact on the results.
Unsurprisingly, the articles focused on iGaming generally flagged higher than the gaming articles, though not in every case.
Here are the results for comparison:
Subject | Avg. AI Detection Accuracy | Avg. Human False-Positive Rate |
Online Casinos | 80.6% | 25.0% |
Sports Betting | 78% | 24.6% |
Online Gaming | 77.8% | 30.8% |
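For readers who want to produce per-subject figures of this kind from their own scans, here is a minimal Python sketch. It assumes each scan is reduced to a yes/no verdict using a 50% cut-off. That cut-off, the function name `summarise_by_subject` and the example rows are all hypothetical, and the averages in the table above may have been calculated differently (for example, by averaging the percentage scores directly).

```python
from collections import defaultdict


def summarise_by_subject(results, threshold=50.0):
    """Summarise detector verdicts per subject.

    `results` is an iterable of (subject, author, ai_score) tuples, where
    author is "ai" or "human" and ai_score is the detector's AI percentage.
    A scan counts as "flagged" when ai_score >= threshold (assumed cut-off).
    """
    counts = defaultdict(
        lambda: {"ai_total": 0, "ai_flagged": 0, "human_total": 0, "human_flagged": 0}
    )
    for subject, author, ai_score in results:
        bucket = counts[subject]
        bucket[f"{author}_total"] += 1
        bucket[f"{author}_flagged"] += int(ai_score >= threshold)

    summary = {}
    for subject, c in counts.items():
        summary[subject] = {
            # Share of AI-written articles correctly flagged as AI.
            "detection_accuracy": (
                100 * c["ai_flagged"] / c["ai_total"] if c["ai_total"] else None
            ),
            # Share of human-written articles wrongly flagged as AI.
            "false_positive_rate": (
                100 * c["human_flagged"] / c["human_total"] if c["human_total"] else None
            ),
        }
    return summary


# Hypothetical scan results (illustrative only, not data from this study).
example_rows = [
    ("Example subject", "ai", 95.0),
    ("Example subject", "ai", 40.0),
    ("Example subject", "human", 10.0),
    ("Example subject", "human", 60.0),
    ("Example subject", "human", 5.0),
    ("Example subject", "human", 0.0),
]

if __name__ == "__main__":
    print(summarise_by_subject(example_rows))
```

Separating the two figures matters because a detector can score well on AI content while still wrongly flagging a large share of human writing.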
Does subject matter?
Based on our results - yes, but only for certain detectors.
- Specialised detector tools such as Copyleaks and Originality.ai were essentially topic-agnostic, flagging near-100% AI no matter the subject.
- Lightweight or pattern-based tools (Notion, Phrasly, Sapling’s DeepSeek, WriterAI’s native checker) showed significant swings depending on subject vocabulary and style.
So while all three subjects had fairly high detection rates for AI content, ‘Online Gaming’ confused detectors the most when written by humans.
Why did iGaming articles flag higher?
It’s becoming increasingly clear that a variety of different aspects can impact AI detection results. Here are some potential factors to take into consideration:
- Predictable structure: A lot of iGaming articles stick to familiar formats (such as explaining odds or promoting bonus offers), which can feel formulaic. That kind of repetition is something AI detectors are trained to pick up on.
- Heavy use of jargon: Industry-specific terms like RNG, RTP, or wagering requirements pop up frequently in this niche. While they’re essential to the content, they might also raise red flags in detection systems that are scanning for overly patterned or robotic language.
- Similar tone across the board: iGaming content often has a polished, formulaic style. Think of staples such as bonus breakdowns, wagering rules, or payout odds. Even when written by a human, it can sound like it’s following a template, which is exactly what detection tools are trained to look out for.
Which AI tool passes AI Detection best?
As well as considering AI detection accuracy, it’s also important to consider which AI tool creates content that flags the least. For those who need to create content quickly but don’t want to lose potential SEO ranking, these are the tools that best slipped under the AI detection radar.
Rank | AI Generator | Avg. AI Flagging |
1 | Notion | 68.6% |
2 | DeepSeek | 69% |
3 | JasperAI | 80.8% |
There are a great number of reasons a tool’s content may or may not flag, but by comparing all of the articles created for this study, we have found some stand-out features that suggest why these tools may have flagged lower than their competitors.
Notion
- The content opens with conversational hooks and parenthetical asides, sounding like a human thinking aloud.
- It varies sentence length dramatically, from punchy one-liners to flowing descriptions.
- It also uses technical terms sparingly and leans towards more everyday language.
DeepSeek
- The content breaks into mini-narratives under engaging subheads.
- It uses bulleted lists and numbered highlights that mirror human-edited posts.
- Jargon is always followed by clear, simple English explanations.
JasperAI
- Content tends to start with vivid, story-driven introductions that set a scene.
- It mixes rhetorical questions, metaphors, and imagined dialogue to boost a sense of 'naturalness'.
- It uses quick synonym swaps and parenthetical tweaks that break up repetitive content.
Are AI Detectors Worth It for SEO in 2025?
While AI detection tools have clearly demonstrated their value and reliability, their accuracy varies significantly depending on the platform chosen. Tools like Originality.ai and Copyleaks proved themselves highly reliable, consistently distinguishing AI-generated content from human-written articles with precision.
However, not all tools delivered on their promises, with some like Writer and Custom GPT notably struggling to accurately differentiate between AI and human-generated text.
Ultimately, the value of AI detection tools lies in their ability to serve as guidelines rather than infallible indicators. The findings suggest that, although these tools are incredibly useful, relying solely on their results without additional context or human oversight could lead to potential inaccuracies.
Therefore, SEO professionals and content creators should consider integrating these tools as supportive measures within their broader content validation strategies.
Sources:
Google Search Guidance about AI-generated content
Scribblr guide to how AI detection tools work
AI tools:
ChatGPT 4.5 - https://chatgpt.com/
ChatGPT 4.0 - https://chatgpt.com/
Gemini - https://gemini.google.com/app
AI Writer - https://ai-writer.com/
JasperAI - https://www.jasper.ai/
DeepSeek - https://www.deepseek.com
Detection tools:
Sapling - https://sapling.ai/ai-content-detector
ZeroGPT - https://www.zerogpt.com/
Copyleaks - https://copyleaks.com/ai-content-detector
Originality - https://originality.ai/
ChatGPT (plug-in) - https://chatgpt.com/g/g-NcgzdHmZc-ai-content-detector
Phrasely - https://phrasly.ai/ai-detector