Does ChatGPT See My Website? Real Crawl Checks

The short answer

A public page can still be invisible to ChatGPT search.

For ChatGPT search, the crawler to check is OAI-SearchBot. If that bot is blocked by robots.txt, blocked by your firewall, sent through a bot challenge, or served a 403, a 5xx error, a login page, a noindex header, or an empty page shell, ChatGPT search has no reliable way to use the page as a source.

Start with the page that matters most. Check that it returns a clean 200 status, that the canonical URL points to itself or the right page, that the title and main copy are present in the first HTML response, and that robots.txt allows OAI-SearchBot. The free AI visibility checker is built for that first pass.
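As a rough illustration of the "first HTML response" part of that check, here is a minimal Python sketch. It assumes you have already fetched the raw HTML yourself, without running JavaScript. The names FirstResponseCheck and check_html and the 200-character threshold are made up for this example, not part of any tool.

```python
from html.parser import HTMLParser


class FirstResponseCheck(HTMLParser):
    """Collects the title, canonical link, and visible text from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.text_chars = 0
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip_depth:
            # Count only text a reader would see, not script or style bodies.
            self.text_chars += len(data.strip())


def check_html(html, page_url, min_text_chars=200):
    """Report whether the raw HTML carries a title, a self canonical, and real copy."""
    parser = FirstResponseCheck()
    parser.feed(html)
    parser.close()
    return {
        "has_title": bool(parser.title),
        "canonical": parser.canonical,
        "canonical_is_self": parser.canonical == page_url,
        "has_main_copy": parser.text_chars >= min_text_chars,
    }
```

A page that ships only an empty app shell would come back with has_main_copy set to False, even though it looks complete in a browser. A canonical that points somewhere else is not always wrong, so read the canonical value rather than relying on the flag alone.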

The AI crawlers that matter

Crawler names are easy to mix up. Some affect live answers. Some are training controls.

  • OAI-SearchBot is the crawler for ChatGPT search. Disallowing OAI-SearchBot in robots.txt keeps your pages out of ChatGPT search answers.
  • Claude-SearchBot is the crawler for Claude search. Disallowing Claude-SearchBot removes you from Claude search.
  • PerplexityBot is the crawler for Perplexity. Disallowing PerplexityBot removes you from Perplexity.
  • Googlebot feeds Google Search, including AI Overviews. Google AI Overviews ride the normal Search index. There is no separate opt-out crawler for them.
  • Applebot is the crawler tied to Apple search and Apple Intelligence. Disallowing Applebot removes you from that Apple path.
  • GPTBot, ClaudeBot, CCBot, Google-Extended, and Applebot-Extended are training or opt-out controls. Blocking them does not block live AI-search visibility. Google-Extended and Applebot-Extended are robots.txt control tokens, not separate HTTP request user agents.

Robots.txt states your policy. It does not prove what a crawler did. Use server logs when you need proof of requests. Public reports about how Perplexity-User and Bytespider treat robots.txt are one reason not to claim that a robots.txt rule proves a bot stayed away.
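To show what "use server logs" can look like in practice, here is a minimal Python sketch that counts requests per crawler and status code. It assumes the common "combined" access log format. The function name crawler_hits and the token list are choices made for this example, so adjust both for your own server.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; other formats need a different pattern.
LOG_LINE = re.compile(
    r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Crawlers that send HTTP requests. Google-Extended and Applebot-Extended are
# left out because they are robots.txt tokens, not request user agents.
CRAWLER_TOKENS = (
    "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot",
    "Googlebot", "Applebot", "GPTBot", "ClaudeBot", "CCBot",
)


def crawler_hits(lines):
    """Count log lines per (crawler token, HTTP status) pair."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[(token, int(match.group("status")))] += 1
                break
    return hits
```

A count such as ("OAI-SearchBot", 403) tells you the bot asked for pages and was refused, which robots.txt alone would never show. User-agent strings can be spoofed, so treat a match as a lead and confirm against the operator's published IP ranges where they exist.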

Fix the pages ChatGPT should read

Give AI search crawlers a plain, reachable page.

  • Allow the search crawlers you care about: OAI-SearchBot, Claude-SearchBot, PerplexityBot, Googlebot, and Applebot.
  • Do not block CSS, JavaScript, images, or API routes needed to show the main content. Googlebot documents JavaScript rendering in detail, and Apple says Applebot may render pages in a browser. For the other AI search crawlers, treat client-side-only content as an undocumented risk.
  • Put the key copy in server-rendered HTML when the page matters for search. Headings, product facts, pricing ranges, contact details, and proof should be readable without a click.
  • Remove accidental noindex tags and X-Robots-Tag headers from public pages.
  • Return normal status codes. A login wall, bot challenge, 403, soft 404, 5xx error, or redirect loop can stop a crawler from using an otherwise useful page.
  • Keep a clear sitemap and internal links. Crawlers still need a route to the page.
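Two items in the list above, status codes and stray X-Robots-Tag headers, are easy to check mechanically. The Python sketch below assumes you already have the status code and response headers from your own HTTP client. The function name crawl_blockers is invented for this example.

```python
import re


def crawl_blockers(status, headers):
    """Return reasons a response may be unusable as a search source."""
    problems = []
    if status != 200:
        problems.append(f"status {status}")

    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    raw = normalized.get("x-robots-tag", "")

    # Split on commas and colons so "googlebot: noindex" is also caught.
    # This sketch flags a bot-scoped directive even if it names another bot.
    directives = {part.strip().lower() for part in re.split(r"[,:]", raw)}
    if directives & {"noindex", "none"}:
        problems.append("X-Robots-Tag blocks indexing")
    return problems
```

An empty list means these two checks passed, nothing more. A soft 404 or a bot challenge that answers with status 200 will slip through, so it is still worth reading the returned HTML.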

After the crawl path is open, the engine still decides what to show. Strong source pages answer one narrow question, name the product or company, and make claims that a crawler can quote or summarize without guessing.

Email trust is a separate check

A domain can be readable by bots and still have mail land in spam.

Inbox placement depends on DNS and sending behavior. SPF is a TXT record that lists allowed senders for a domain. Publish one SPF record, include only the services that send for you, and stay within the limit of 10 DNS lookups set by RFC 7208. Use ~all while testing if you are still finding senders. Move to -all only after every legitimate sender is covered. Avoid +all, because it allows any server to send as your domain.
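The SPF rules above can be sanity-checked against the record text. This Python sketch assumes you paste in the TXT value yourself. The function name review_spf is made up for this example. It counts only the lookups visible in the top-level record, and lookups inside included records also count toward the limit of 10, so the real total can be higher.

```python
# Terms that trigger a DNS lookup under RFC 7208; "redirect" is handled below.
LOOKUP_MECHANISMS = ("include", "a", "mx", "ptr", "exists")


def review_spf(record):
    """Count top-level DNS-lookup terms and report the qualifier on "all"."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")

    lookups = 0
    all_qualifier = None
    for term in terms[1:]:
        term = term.lower()
        if term.startswith("redirect="):
            lookups += 1
            continue
        # A mechanism with no qualifier defaults to "+" (pass).
        qualifier = term[0] if term[0] in "+-~?" else "+"
        name = term.lstrip("+-~?").split(":")[0].split("/")[0]
        if name == "all":
            all_qualifier = qualifier
        elif name in LOOKUP_MECHANISMS:
            lookups += 1

    return {
        "top_level_lookups": lookups,
        "all": all_qualifier,
        "allows_anyone": all_qualifier == "+",
    }
```

For a record such as "v=spf1 include:_spf.example.com mx ~all", the sketch reports two top-level lookups and a soft-fail "all". A full audit would resolve each include and add up the nested lookups as well.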

DKIM signs each message and names a selector, such as selector1, that tells receivers where to find the public key in DNS (selector1._domainkey under your domain). The private key signs mail at your provider, and the public key in DNS lets Gmail, Outlook, and other receivers verify it. DMARC is a TXT record at _dmarc under your domain. It checks whether SPF or DKIM aligns with the visible From domain, can send aggregate reports with rua, and sets a requested policy: p=none to monitor, p=quarantine to ask receivers to treat failures as suspicious, or p=reject to ask receivers to reject failures.
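A DMARC record is a short list of tag=value pairs, which makes the policy easy to read programmatically. The Python sketch below assumes you already have the TXT value from _dmarc. The function name parse_dmarc and the sample address are placeholders for this example.

```python
def parse_dmarc(record):
    """Split a DMARC TXT value into a dict of tags."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


# What each requested policy asks receivers to do, as described above.
POLICY_MEANING = {
    "none": "monitor only",
    "quarantine": "treat failures as suspicious",
    "reject": "reject failures",
}

tags = parse_dmarc("v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com")
print(tags["p"], "->", POLICY_MEANING[tags["p"]])
```

This reads the requested policy and report address only. It does not check alignment, which depends on the SPF and DKIM results of real messages.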

MX records tell the world where to deliver inbound mail. Even with SPF, DKIM, and DMARC in place, blocklists, high complaint rates, bad lists, weak link reputation, and sudden volume spikes can still push mail to spam. Gmail and Outlook weigh authentication, reputation, user engagement, and abuse signals together. If mail from this domain matters, check the live DNS with the free domain scorecard.

Common questions

Can ChatGPT see my website if I block GPTBot?

Yes, for search visibility. GPTBot is a training control. ChatGPT search uses OAI-SearchBot. If you want ChatGPT search to reach your pages, allow OAI-SearchBot.

Can I block AI training but stay visible in AI search?

Yes. Block training controls such as GPTBot, ClaudeBot, CCBot, Google-Extended, or Applebot-Extended, but allow the search crawlers you care about.
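One way to confirm that a policy like this behaves as intended is to test it locally before publishing. The sketch below uses Python's standard urllib.robotparser against a sample robots.txt. The policy text and the helper name allowed_agents are illustrative assumptions, and real crawlers may match user-agent names slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Sample policy: block training controls, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, url, agents):
    """Map each user-agent token to whether the policy lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


search = ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "Googlebot", "Applebot"]
training = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Applebot-Extended"]
result = allowed_agents(ROBOTS_TXT, "https://example.com/pricing", search + training)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this sample policy, the five search crawlers come back allowed and the five training tokens come back blocked. This checks the policy text only. Firewall rules and bot challenges sit outside robots.txt and need their own test.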

Do Google AI Overviews use a separate crawler?

No. Google AI Overviews use the normal Google Search index. Googlebot is the crawler to check. Blocking Googlebot can remove you from Google Search as a whole.

Is robots.txt enough to prove a bot visited or stayed away?

No. Robots.txt states your policy. Server logs show requests. Use both when you need to audit what happened.

Why mention SPF, DKIM, and DMARC on a website visibility page?

Because the same domain often has both jobs. AI crawlers need readable public pages. Mailbox providers need authenticated mail and a domain with a clean reputation.
