GPTBot vs OAI-SearchBot: the real difference
The short answer
One line in robots.txt can cost you ChatGPT search traffic if you block the wrong bot.
GPTBot and OAI-SearchBot do different jobs. GPTBot is OpenAI's training crawler. Disallowing it tells OpenAI not to crawl your pages for model training. OAI-SearchBot is the crawler for ChatGPT search. Disallowing it can remove your pages from ChatGPT search results and answers.
That is the core difference: GPTBot is a training opt-out. OAI-SearchBot is search visibility. If your goal is to appear in AI answers, check whether OAI-SearchBot can crawl and read the page. The same review should include Claude-SearchBot for Claude, PerplexityBot for Perplexity, Googlebot for Google AI Overviews, and Applebot for Apple Intelligence. You can test this with the free AI visibility checker.
- Allow OAI-SearchBot when you want ChatGPT search visibility.
- Disallow GPTBot when your policy says GPTBot should not crawl your content for model training.
- Treat each crawler line as a separate choice. Do not assume one OpenAI rule covers all OpenAI uses.
Which bots affect AI answers
AI search has split into two groups: crawlers that help answers find your site, and controls that limit training use.
The crawlers that decide whether you appear in AI answers are OAI-SearchBot for ChatGPT search, Claude-SearchBot for Claude, PerplexityBot for Perplexity, Googlebot for Google AI Overviews, and Applebot for Apple Intelligence. Disallowing any one of them in robots.txt removes your pages from that engine's crawl path.
By contrast, GPTBot, ClaudeBot, CCBot, Google-Extended, and Applebot-Extended are training or opt-out controls. Blocking them does not affect live AI-search visibility. Google-Extended and Applebot-Extended are robots-only control tokens with no separate crawl user-agent.
robots.txt is a stated policy. It is not proof of what a bot did. Server logs, verified IP ranges, and vendor docs are the evidence. Perplexity-User and Bytespider have been reported to ignore robots.txt, so do not use a robots file as proof that either one behaved a certain way.
- For AI answer visibility, audit OAI-SearchBot, Claude-SearchBot, PerplexityBot, Googlebot, and Applebot.
- For training opt-out, audit GPTBot, ClaudeBot, CCBot, Google-Extended, and Applebot-Extended.
- For evidence, compare robots.txt with logs. A policy line and a crawl event are different things.
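To make the policy-versus-evidence point concrete, here is a minimal Python sketch that counts crawler hits in access log lines. It assumes the combined log format, where the user agent is the last quoted field. The sample lines, IP addresses, and user-agent strings are invented for illustration and are not the vendors' exact strings. A user-agent match alone can be spoofed, so treat these counts as a starting point and confirm against the vendor's published IP ranges.

```python
import re
from collections import Counter

# Crawler tokens named in this guide. Matching is a case-insensitive substring test.
CRAWLER_TOKENS = [
    "OAI-SearchBot", "GPTBot", "Claude-SearchBot",
    "ClaudeBot", "PerplexityBot", "CCBot",
]

# Assumption: combined log format, so the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

def count_crawler_hits(log_lines):
    """Return a Counter of requests per crawler token found in the user agent."""
    hits = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
                break
    return hits

# Hypothetical log lines (documentation-range IPs, simplified user agents).
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '203.0.113.9 - - [01/Jan/2025:10:05:00 +0000] "GET /blog HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2025:10:06:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"',
]

hits = count_crawler_hits(SAMPLE_LOG)
for token, count in hits.items():
    print(token, count)
```

If a crawler you disallowed still shows up in these counts, the log entry is the evidence, and the robots.txt line is only the policy it should have followed.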
What to put in robots.txt
Start with your business rule. Then write the smallest robots rule that matches it.
If you want ChatGPT search visibility but want to opt out of GPTBot training crawls, allow OAI-SearchBot and disallow GPTBot. If you want neither OpenAI crawler to fetch your pages, for search or for training, disallow both. If you want maximum AI search reach, also allow Claude-SearchBot, PerplexityBot, Googlebot, and Applebot.
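The first setup above (allow OAI-SearchBot, disallow GPTBot) can be sketched and sanity-checked with Python's standard-library robots parser. The robots.txt body and the example.com URL are illustrative. Python's parser is only an approximation of how each vendor's crawler reads the file, so use it to catch obvious mistakes, not as proof of crawler behavior.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: search crawler allowed, training crawler disallowed.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""

def crawl_policy(robots_txt, agents, url):
    """Return {agent: allowed?} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

policy = crawl_policy(
    ROBOTS_TXT,
    ["OAI-SearchBot", "GPTBot", "PerplexityBot"],
    "https://example.com/pricing",
)
for agent, allowed in policy.items():
    print(agent, "allowed" if allowed else "disallowed")
```

PerplexityBot has no group in this file and there is no `User-agent: *` group, so the parser treats it as allowed. Each crawler gets its own group, which matches the advice to treat each crawler line as a separate choice.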
Do not hide important copy behind client-side rendering alone. Google documents JavaScript rendering for Googlebot. The other AI search crawlers may or may not render pages the same way, so treat client-side-only content as a risk. Put the page title, answer, body copy, canonical link, and key internal links in the first HTML response when you can.
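One way to check what a non-rendering crawler would see is to inspect the raw HTML of the first response without running any JavaScript. The sketch below uses Python's standard-library HTML parser on two made-up pages. The class name, the `/guides` path, and the sample markup are hypothetical, and counting text characters is a rough signal, not a measure of content quality.

```python
from html.parser import HTMLParser

class FirstResponseAudit(HTMLParser):
    """Collect title, canonical link, internal links, and visible text length."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.internal_links = []
        self.text_chars = 0
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "a" and (attrs.get("href") or "").startswith("/"):
            self.internal_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.text_chars += len(data.strip())

def audit(html):
    """Parse raw HTML (no JavaScript executed) and summarize what is present."""
    parser = FirstResponseAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "canonical": parser.canonical,
        "internal_links": parser.internal_links,
        "text_chars": parser.text_chars,
    }

# Hypothetical server-rendered page: content is in the first response.
SERVER_RENDERED = """<html><head><title>GPTBot vs OAI-SearchBot</title>
<link rel="canonical" href="https://example.com/guide">
</head><body><h1>The short answer</h1>
<p>GPTBot is a training opt-out. OAI-SearchBot is search visibility.</p>
<a href="/guides">More guides</a></body></html>"""

# Hypothetical client-only page: an empty shell until JavaScript runs.
CLIENT_ONLY = """<html><head><title></title></head>
<body><div id="root"></div><script>renderApp()</script></body></html>"""

server_report = audit(SERVER_RENDERED)
client_report = audit(CLIENT_ONLY)
print(server_report)
print(client_report)
```

A page whose report looks like the client-only one (no title, no canonical, zero text characters) depends entirely on the crawler rendering JavaScript.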
Use official sources when the stakes are high: OpenAI, Anthropic, Perplexity, Google, Apple, and Common Crawl crawler docs. For email rules, use RFC 7208 for SPF, RFC 6376 for DKIM, RFC 7489 for DMARC, and the Google and Microsoft sender guidelines. A stale SEO list can be worse than no list.
- Keep search crawlers allowed when you want citations and answer visibility.
- Keep training crawlers blocked when your content policy requires it.
- Make sure the page returns a 200 status, indexable HTML, a canonical URL, and useful text without a login.
- Link the topic from related pages so crawlers can discover it from normal site paths, such as your guide hub at /a.
Email deliverability still matters
AI visibility brings people to the site. Email deliverability decides whether your replies, trials, and sales notes reach the inbox.
For InboxRadar users, the same domain hygiene work often touches both search and email. A clean site can still lose deals if Gmail or Outlook sends your follow-up to spam. Mailbox providers look at authentication, alignment, reputation, complaint rate, sending patterns, content, and blocklists.
SPF is a DNS TXT record that lists the systems allowed to send mail for your domain. Keep it to one SPF record, watch the 10-DNS-lookup limit, and choose the ending with care. ~all is a soft fail, often used while testing. -all is a hard fail, used when you are sure every valid sender is listed.
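As a rough illustration of the lookup limit and the `all` ending, the Python sketch below inspects a single SPF record string. The record and its hostnames are made up. The list of lookup-triggering terms follows RFC 7208 (`include`, `a`, `mx`, `ptr`, `exists`, and the `redirect` modifier). The function counts only the terms in the record you pass in, so lookups inside nested includes are not counted and the real total can be higher.

```python
import re

# Terms that cause a DNS lookup and count toward the limit of 10 (RFC 7208).
LOOKUP_TERMS = ("include", "a", "mx", "ptr", "exists", "redirect")

def summarize_spf(record):
    """Count top-level lookup terms and report the qualifier on 'all'."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record: missing v=spf1")
    lookups = 0
    all_qualifier = None
    for term in terms[1:]:
        term = term.lower()
        qualifier = "+"  # default qualifier when none is written
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        # Strip any ':domain', '=domain', or '/cidr' suffix to get the term name.
        name = re.split(r"[:=/]", term, maxsplit=1)[0]
        if name in LOOKUP_TERMS:
            lookups += 1
        elif name == "all":
            all_qualifier = qualifier
    return {"top_level_lookups": lookups, "all": all_qualifier}

# Hypothetical record for illustration only.
EXAMPLE_SPF = "v=spf1 include:_spf.example.com include:mail.example.net ip4:192.0.2.10 mx ~all"

spf_summary = summarize_spf(EXAMPLE_SPF)
print(spf_summary)
```

Here the two `include` terms and `mx` count as three top-level lookups, `ip4` costs none, and the record ends in the soft-fail `~all`.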
DKIM adds a cryptographic signature to mail. Your sender signs with a private key, and the receiving mailbox checks the public key at a selector record in DNS. A broken selector, missing key, or unsigned message can weaken trust even when SPF passes.
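The selector lookup can be sketched in a few lines of Python. Per RFC 6376, the public key lives in a TXT record at `<selector>._domainkey.<domain>`, and an empty `p=` value means the key has been revoked. The selector `s1`, the domain, and the key value below are placeholders, and this sketch only parses a record string. It does not query DNS or verify a signature.

```python
def dkim_record_name(selector, domain):
    """DNS name where the receiver looks up the DKIM public key."""
    return f"{selector}._domainkey.{domain}"

def parse_dkim_key(record):
    """Split a DKIM key record into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            # Key data may be folded with whitespace; remove it.
            tags[key.strip()] = "".join(value.split())
    return tags

def key_is_published(tags):
    """An absent or empty p= tag means no usable key (empty means revoked)."""
    return bool(tags.get("p"))

# Placeholder values for illustration; the p= value is not a real key.
record_name = dkim_record_name("s1", "example.com")
EXAMPLE_DKIM = "v=DKIM1; k=rsa; p=PLACEHOLDERBASE64KEY="

dkim_tags = parse_dkim_key(EXAMPLE_DKIM)
print(record_name, key_is_published(dkim_tags))
```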
DMARC ties SPF and DKIM to the visible From domain through alignment. Start with p=none to collect reports, then move to p=quarantine or p=reject when legitimate senders pass. Add rua so aggregate reports come back to an address you monitor. The free DMARC report reader helps turn those XML reports into plain patterns.
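The rollout described above can be checked against a record with a small Python sketch. It parses a DMARC TXT value into tags and notes whether the policy is still monitoring-only and whether `rua` is set. The records and the report address are made up, the function names are hypothetical, and the notes are a simplified reading of this section's advice, not a full DMARC validator.

```python
def parse_dmarc(record):
    """Split a DMARC record into lowercase tag names and their values."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("record does not declare v=DMARC1")
    return tags

def review(tags):
    """Return plain-language notes on policy stage and reporting."""
    notes = []
    policy = tags.get("p", "").lower()
    if policy == "none":
        notes.append("monitoring only: move to quarantine or reject once legitimate senders pass")
    elif policy not in ("quarantine", "reject"):
        notes.append("missing or unknown p= policy")
    if "rua" not in tags:
        notes.append("no rua address: aggregate reports are not being collected")
    return notes

# Hypothetical records for illustration only.
STARTING_RECORD = "v=DMARC1; p=none"
ENFORCED_RECORD = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"

starting_notes = review(parse_dmarc(STARTING_RECORD))
enforced_notes = review(parse_dmarc(ENFORCED_RECORD))
print(starting_notes)
print(enforced_notes)
```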
Also check MX records, because they tell the internet where inbound mail for your domain should go. Check blocklists if mail suddenly drops, but do not treat one listing as the whole diagnosis. Blocklists are one signal. Broken SPF, DKIM, or DMARC can be enough to push mail toward spam even with no listing.
A quick domain scorecard can catch the common DNS issues before you blame copy, timing, or a sales tool.
- SPF: one record, valid includes, under 10 lookups, correct ~all or -all choice.
- DKIM: active selector, valid public key, messages actually signed by each sending service.
- DMARC: aligned SPF or DKIM, useful policy, and working rua reporting.
- MX: present and pointed at the mailbox provider you really use.
- Reputation: steady volume, low complaints, low bounces, and no surprise blocklist hit.
FAQ
Does blocking GPTBot block ChatGPT search?
No. GPTBot is the training crawler. For ChatGPT search visibility, check OAI-SearchBot. If OAI-SearchBot is disallowed, ChatGPT search loses that documented crawl path to your page.
Should I allow OAI-SearchBot and block GPTBot?
That is the common setup when a site wants visibility in ChatGPT search but does not want GPTBot to crawl its content for model training. The right policy depends on your content rights and business goals.
Does Google have a separate AI Overviews bot?
No. Google AI Overviews draw on the normal Google Search index. Googlebot is the crawl control for Search access. Google-Extended is a training and opt-out control for some other Google AI systems, and it does not replace Googlebot.
Can AI crawlers read JavaScript pages?
Google documents JavaScript rendering for Googlebot. For other AI crawlers, treat client-side-only content as a risk unless the vendor documents support. Server-rendered HTML is the safer default for content you want cited.
Why is email authentication in a crawler guide?
Many teams fix AI visibility to get found, then lose the reply because their domain fails SPF, DKIM, or DMARC. Search visibility and inbox placement are separate systems, but both depend on clean domain records.