In October 2024, OpenAI introduced a feature in ChatGPT that lets it search the internet. You can trigger a web search yourself by clicking the search icon, or ChatGPT can decide to search on its own based on the query and its understanding of it.
How does the engine work?
Like Google’s search engine, ChatGPT needs “permission” to access our site, crawl it, and learn from it. For Google, we grant this through a robots.txt file, where we define where crawlers can go and where they can’t. The same file is used for OpenAI’s crawlers.
Which crawlers are available?
OpenAI provides three main user agents (crawlers) that we can address in our robots.txt file:
OAI-SearchBot
This is the classic search crawler. Allowing it lets ChatGPT access our site, go through the content, and index it. When someone asks a question, ChatGPT can then pull the answer from the data it has already indexed.
If you want to allow the crawler, add the following to your robots.txt:
User-agent: OAI-SearchBot
Allow: /
If you want to block the crawler, add this:
User-agent: OAI-SearchBot
Disallow: /
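Before publishing a rule like this, it can help to check how a parser reads it. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name can_crawl and the example.com URL are assumptions made for illustration, not part of OpenAI's tooling.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The "block" variant shown above for OAI-SearchBot.
BLOCK_SEARCHBOT = """\
User-agent: OAI-SearchBot
Disallow: /
"""

# example.com is a placeholder domain used only for this check.
print(can_crawl(BLOCK_SEARCHBOT, "OAI-SearchBot", "https://example.com/blog/"))
print(can_crawl(BLOCK_SEARCHBOT, "GPTBot", "https://example.com/blog/"))
```

The first call reports that OAI-SearchBot is blocked, while the second shows that a user agent with no matching group (here GPTBot) is still allowed, because the rule only names OAI-SearchBot.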
ChatGPT-User
This user agent handles live search. When a user asks a question and ChatGPT doesn’t have the answer in its index, it runs a quick web search and visits the pages it finds. Blocking this user agent means your site won’t appear in ChatGPT’s live search results.
If you want to allow the crawler, add the following to your robots.txt:
User-agent: ChatGPT-User
Allow: /
If you want to block it, add this:
User-agent: ChatGPT-User
Disallow: /
GPTBot
This crawler collects data for training OpenAI’s models. AI needs a lot of data to improve its responses, and GPTBot gathers that data from the sites it visits so the models can keep improving. OpenAI states that blocking this crawler will not affect search results for the site. Personally, I think it’s worth allowing this crawler to visit the site; at worst, it won’t have an impact.
To allow this crawler, add the following to your robots.txt:
User-agent: GPTBot
Allow: /
If you want to block it, add this:
User-agent: GPTBot
Disallow: /
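Because the three user agents are controlled independently, a site can stay visible in search while opting out of training. The sketch below generates such a file from a small policy table and checks the result with Python's standard urllib.robotparser; the function name build_robots_txt, the POLICY values, and the example.com URL are assumptions chosen for illustration, not a recommendation from OpenAI.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(policy: dict[str, bool]) -> str:
    """Render one robots.txt group per user agent: True allows all, False blocks all."""
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}\n")
    # Groups are separated by a blank line.
    return "\n".join(groups)


# Hypothetical policy: keep both search user agents, opt out of training.
POLICY = {"OAI-SearchBot": True, "ChatGPT-User": True, "GPTBot": False}

robots_txt = build_robots_txt(POLICY)
print(robots_txt)

# Sanity check with the standard-library parser (example.com is a placeholder).
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
for agent in POLICY:
    print(agent, parser.can_fetch(agent, "https://example.com/"))
```

Flipping a value in POLICY is enough to switch a crawler between the "allow" and "block" variants shown in this article.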
Additional Crawlers
You can also allow or block other crawlers, such as:
User-agent: anthropic-ai
Allow: /
User-agent: Claude-Web
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
Note! If you have folders or URLs you don’t want the crawlers to access, it’s a good idea to make the robots.txt more detailed. Here’s an example of what it might look like after adding everything:
robots.txt for our site:
# Several user agents that share the same rules can be listed in one group
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: GPTBot
User-agent: anthropic-ai
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ # block access to plugins
Disallow: /wp-login.php # block access to the login page
Disallow: /feed # block feeds
Disallow: /search/ # block internal search results
Disallow: /?s= # block internal search result pages
Disallow: /?p= # block pages for which permalinks fail
Disallow: /&p= # block pages for which permalinks fail
Disallow: /&preview= # block previews
Disallow: /tag/ # block tag pages
Disallow: /author/ # block author pages
User-agent: *
Allow: /
Disallow: /wp-admin
Disallow: /wp-content/plugins/ # block access to plugins
Disallow: /wp-login.php # block access to the login page
Disallow: /feed # block feeds
Disallow: /search/ # block internal search results
Disallow: /?s= # block internal search result pages
Disallow: /?p= # block pages for which permalinks fail
Disallow: /&p= # block pages for which permalinks fail
Disallow: /&preview= # block previews
Disallow: /tag/ # block tag pages
Disallow: /author/ # block author pages
Sitemap: https://golevelplus.com/sitemap_index.xml
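Each group above starts with "Allow: /" and then lists narrower Disallow rules, so it is worth seeing how the two interact. Under the Robots Exclusion Protocol (RFC 9309), the most specific (longest) matching rule wins, and Allow wins a tie. The following is a minimal sketch of that evaluation for plain prefix rules; it ignores the * and $ wildcards, the function name is_allowed is hypothetical, and the tested paths are made-up examples rather than real URLs on the site.

```python
# A subset of the rules from one group of the example robots.txt above.
RULES = [
    ("Allow", "/"),
    ("Disallow", "/wp-admin"),
    ("Disallow", "/wp-login.php"),
    ("Disallow", "/?s="),
    ("Disallow", "/tag/"),
    ("Disallow", "/author/"),
]


def is_allowed(rules, path):
    """Longest-match evaluation: the longest matching prefix wins, Allow wins ties."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Hypothetical paths (path plus query string) used only to exercise the rules.
for path in ["/blog/chatgpt-search/", "/wp-admin/options.php", "/?s=robots", "/tag/seo/"]:
    print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

With these rules, the blog path is allowed and the other three are blocked. Not every parser follows longest-match: Python's standard urllib.robotparser, for instance, applies the first matching rule in file order, so it would read the leading "Allow: /" as allowing everything. If a tool behaves that way, placing the Disallow lines before "Allow: /" is one way to get consistent results.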
In Conclusion
The AI revolution in organic SEO is gaining momentum, and we’re witnessing the changes unfold before our eyes. Personally, I recommend hopping on this train as soon as possible. Those who plant the seeds now will reap the rewards later.