How to Allow ChatGPT and Other AI Crawlers to Scan Your Website

How to Block or Allow ChatGPT Crawlers to Scan Your Website

In October 2024, OpenAI added a web search feature to ChatGPT. You can trigger a search yourself by clicking the search icon, or ChatGPT can decide to search on its own based on the query.

[Screenshot: ChatGPT's "Search the web" feature]

How does the engine work?

Like Google’s search engine, ChatGPT needs “permission” to access a site, crawl it, and learn from it. For Google, we grant that permission in a robots.txt file, where we define where crawlers can go and where they can’t. The same file controls OpenAI’s crawlers.

What records are available?

OpenAI documents three main user agents (crawlers) that we can reference in our robots.txt file:

OAI-SearchBot

This crawler works like a classic search engine crawler. Allowing it lets ChatGPT access the site, go through the content, and index it. That way, when someone asks a question, ChatGPT can pull the answer from data it has already indexed.

If you want to allow the crawler, add the following to your robots.txt:

User-agent: OAI-SearchBot
Allow: /

If you want to block the crawler, add this:

User-agent: OAI-SearchBot
Disallow: /
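One way to sanity-check a rule like this before publishing it is to run it through Python's standard-library urllib.robotparser. The sketch below is only an illustration: example.com is a placeholder domain, and the standard-library parser uses simpler matching than production crawlers, so treat the result as a rough check rather than a guarantee of how OpenAI's crawler behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


BLOCK_RULES = """\
User-agent: OAI-SearchBot
Disallow: /
"""

ALLOW_RULES = """\
User-agent: OAI-SearchBot
Allow: /
"""

# example.com is a placeholder domain used only for illustration.
PAGE = "https://example.com/blog/post"

if __name__ == "__main__":
    print(is_allowed(BLOCK_RULES, "OAI-SearchBot", PAGE))  # False
    print(is_allowed(ALLOW_RULES, "OAI-SearchBot", PAGE))  # True
```

A crawler that is not named in the file (and has no `User-agent: *` group to fall back on) is treated as allowed, which is why blocking one crawler does not affect the others.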

ChatGPT-User

This crawler handles live search. When a user asks a question and ChatGPT doesn’t have the answer in its index, it runs a quick web search and visits the sites it finds. Blocking this crawler means your pages can’t be fetched for those live answers.

If you want to allow the crawler, add the following to your robots.txt:

User-agent: ChatGPT-User
Allow: /

If you want to block it, add this:

User-agent: ChatGPT-User
Disallow: /

GPTBot

This crawler collects data for training OpenAI’s models. AI needs a lot of data to improve its responses, and GPTBot gathers that data from the sites it is allowed to visit. OpenAI states that blocking this crawler will not affect search results for the site. Personally, I think it’s worth allowing this crawler to visit the site; at worst, it will have no impact.

To allow this crawler, add the following to your robots.txt:

User-agent: GPTBot
Allow: /

If you want to block it, add this:

User-agent: GPTBot
Disallow: /
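If you manage these three groups for several sites, it can help to generate them from a single policy instead of editing them by hand. The following is a minimal sketch under stated assumptions: `build_groups` and `POLICY` are hypothetical names, and the policy shown (search and live browsing allowed, training blocked) is just one possible choice, not a recommendation.

```python
def build_groups(policy: dict[str, bool]) -> str:
    """Build one robots.txt group per crawler: True allows it, False blocks it."""
    groups = []
    for user_agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {user_agent}\n{rule}")
    # A blank line between groups keeps the file readable.
    return "\n\n".join(groups) + "\n"


# Hypothetical policy used only for illustration.
POLICY = {
    "OAI-SearchBot": True,
    "ChatGPT-User": True,
    "GPTBot": False,
}

if __name__ == "__main__":
    print(build_groups(POLICY), end="")
```

The output can be pasted into robots.txt as-is; flipping a value in `POLICY` switches that crawler between `Allow: /` and `Disallow: /`.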

Additional Crawlers

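You can also allow or block crawlers from other AI companies, such as Anthropic and Perplexity. The example below allows them; to block one instead, replace Allow with Disallow: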

User-agent: anthropic-ai
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
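With this many crawler names, it is easy to forget one. As a rough aid, the sketch below lists which crawlers from this article have no group of their own in a given robots.txt. The helper names and the sample file are hypothetical; a crawler reported as missing is not blocked or broken, it simply falls back to the `User-agent: *` group if one exists.

```python
# Crawler tokens mentioned in this article.
AI_CRAWLERS = [
    "OAI-SearchBot",
    "ChatGPT-User",
    "GPTBot",
    "anthropic-ai",
    "Claude-Web",
    "ClaudeBot",
    "PerplexityBot",
]


def named_user_agents(robots_txt: str) -> set[str]:
    """Collect every User-agent token in the file, lowercased."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def missing_crawlers(robots_txt: str, crawlers=AI_CRAWLERS) -> list[str]:
    """Return the crawlers that have no explicit group in the file."""
    named = named_user_agents(robots_txt)
    return [c for c in crawlers if c.lower() not in named]


# Hypothetical sample file for illustration.
SAMPLE = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /wp-admin
"""

if __name__ == "__main__":
    print(missing_crawlers(SAMPLE))
```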

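Note! If you have folders or URLs you don’t want the crawlers to access, make the robots.txt more detailed. Crawlers that follow the robots.txt standard apply the most specific matching rule, so a longer Disallow path takes precedence over Allow: /. Here’s an example of what the file might look like after adding everything: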

robots.txt for our site:

User-agent: OAI-SearchBot
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: ChatGPT-User
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: GPTBot
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: anthropic-ai
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: Claude-Web
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: ClaudeBot
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: PerplexityBot
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

User-agent: *
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ #block access to plugins
Disallow: /wp-login.php #block access to management
Disallow: /feed #block feeds
Disallow: /search/ #block internal search results
Disallow: /?s= #block access to internal search result pages
Disallow: /?p= #block access to pages for which permalinks fail
Disallow: /*&p= #block access to pages for which permalinks fail
Disallow: /*&preview= #block preview
Disallow: /tag/ #block tags
Disallow: /author/ #blocking author pages

Sitemap: https://golevelplus.com/sitemap_index.xml
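Every group in the file above starts with `Allow: /` and then lists narrower Disallow paths. That works because the robots.txt standard (RFC 9309) tells crawlers to use the longest matching rule, with Allow winning a tie. The sketch below is a simplified, hand-rolled version of that precedence for a single group. The function names and the sample rules are hypothetical, and whether every AI crawler implements the standard identically is an assumption. It is hand-rolled because Python's built-in urllib.robotparser applies rules in file order, so it would let the leading `Allow: /` win.

```python
import re


def _pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence to one group's (kind, pattern) rules.

    kind is "allow" or "disallow". The longest matching pattern wins,
    a tie goes to "allow", and a path that matches nothing is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if _pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


# A shortened, hypothetical version of one group from the file above.
RULES = [
    ("allow", "/"),
    ("disallow", "/wp-admin"),
    ("disallow", "/tag/"),
    ("disallow", "/?s="),
]

if __name__ == "__main__":
    for path in ["/blog/my-post", "/wp-admin/options.php", "/tag/seo/", "/?s=robots"]:
        print(path, is_allowed(RULES, path))
```

Only `/blog/my-post` comes back as allowed here: it matches nothing longer than `Allow: /`, while the other three paths each match a longer Disallow pattern.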

In Conclusion

The AI revolution in organic SEO is gaining momentum, and we’re witnessing the changes unfold before our eyes. Personally, I recommend hopping on this train as soon as possible. Those who plant the seeds now will reap the rewards later.
