chatgpt.pt
robots.txt

Robots Exclusion Standard data for chatgpt.pt

Resource Scan

Scan Details

Site Domain chatgpt.pt
Base Domain chatgpt.pt
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-12-21T09:33:46+00:00
Next Scan 2026-02-19T09:33:46+00:00

Last Successful Scan

Scanned 2025-10-14T23:20:07+00:00
URL https://chatgpt.pt/robots.txt
Redirect https://chatgpt.com/robots.txt
Redirect Domain chatgpt.com
Redirect Base chatgpt.com
Domain IPs 212.53.160.102
Redirect IPs 104.18.32.47, 172.64.155.209, 2a06:98c1:3100::6812:202f, 2a06:98c1:310b::ac40:9bd1
Response IP 104.18.32.47
Found Yes
Hash 11ed1c362ae046ca8cfca5549b7504b682129f19e62d6fc9d6560f6ee3ec219d
SimHash 3654cb90c6b4

Groups

ccbot

Rule Path
Disallow /

img2dataset

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity‑user

Rule Path
Disallow /

*

Rule Path
Allow /$
Allow /?*
Allow /api/share/og/*
Allow /g/*
Allow /s/*
Allow /gg/v/*
Allow /share/*
Allow /canvas/shared/*
Allow /images/*
Allow /auth/*
Allow /gpts$
Allow /codex$
Allow /search$
Allow /backend-anon/*
Allow /public-api/*
Allow /sitemap.xml
Allow /100chats
Allow /api/public_content/*
Allow /backend-api/public_content/*
Allow /?ref=dotcom
Allow /overview
Allow /*/overview
Allow /features*
Allow /*/features*
Allow /use-cases*
Allow /*/use-cases*
Allow /learn*
Allow /*/learn*
Allow /business*
Allow /*/business*
Allow /pricing
Allow /*/pricing
Allow /download
Allow /*/download
Allow /students
Allow /contact-sales
Allow /*/contact-sales
Allow /100chats-project
Allow /*/100chats-project
Allow /merchants
Allow /*/merchants
Allow /parent-resources
Allow /*/parent-resources
Allow /am/$
Allow /ar/$
Allow /bg-BG/$
Allow /bn-BD/$
Allow /bs-BA/$
Allow /ca-ES/$
Allow /cs-CZ/$
Allow /da-DK/$
Allow /de-DE/$
Allow /el-GR/$
Allow /es-ES/$
Allow /es-419/$
Allow /et-EE/$
Allow /fi-FI/$
Allow /fr-FR/$
Allow /fr-CA/$
Allow /gu-IN/$
Allow /hi-IN/$
Allow /hr-HR/$
Allow /hu-HU/$
Allow /hy-AM/$
Allow /id-ID/$
Allow /is-IS/$
Allow /it-IT/$
Allow /ja-JP/$
Allow /ka-GE/$
Allow /kk/$
Allow /kn-IN/$
Allow /ko-KR/$
Allow /lt/$
Allow /lv-LV/$
Allow /mk-MK/$
Allow /ml/$
Allow /mn/$
Allow /mr-IN/$
Allow /ms-MY/$
Allow /my-MM/$
Allow /nb-NO/$
Allow /nl-NL/$
Allow /pa/$
Allow /pl-PL/$
Allow /pt-BR/$
Allow /pt-PT/$
Allow /ro-RO/$
Allow /ru-RU/$
Allow /sk-SK/$
Allow /sl-SI/$
Allow /so-SO/$
Allow /sq-AL/$
Allow /sr-RS/$
Allow /sv-SE/$
Allow /sw-TZ/$
Allow /ta-IN/$
Allow /te-IN/$
Allow /th-TH/$
Allow /tl/$
Allow /tr-TR/$
Allow /uk-UA/$
Allow /ur/$
Allow /vi-VN/$
Allow /zh-CN/$
Allow /zh-TW/$
Allow /zh-HK/$
Disallow /
Disallow /auth/logout
Disallow /auth/login?*
Disallow /backend-anon/sentinel/*
Disallow /backend-anon/conversation$
Disallow /account-link/*
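
The groups above can be read as a two-step decision: pick the group whose user-agent token matches the crawler (falling back to `*`), then apply the longest matching rule in that group, with Allow winning a tie. The following is a minimal sketch of that evaluation under the matching rules described in RFC 9309, not the scanner's or any crawler's actual implementation. `GROUPS` is an abbreviated copy of the data above, and the names `pattern_matches`, `is_allowed`, and the `ExampleBot` token are hypothetical.

```python
import re

# Abbreviated copy of the groups listed above (assumption: only a few
# rules from the "*" group are reproduced, to keep the sketch short).
GROUPS = {
    "ccbot": [("disallow", "/")],
    "perplexitybot": [("disallow", "/")],
    "*": [
        ("allow", "/$"),
        ("allow", "/auth/*"),
        ("allow", "/pricing"),
        ("allow", "/backend-anon/*"),
        ("disallow", "/"),
        ("disallow", "/auth/logout"),
        ("disallow", "/backend-anon/conversation$"),
    ],
}


def pattern_matches(pattern: str, path: str) -> bool:
    """Prefix match where '*' matches any run of characters and a
    trailing '$' anchors the pattern to the end of the path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


def is_allowed(product_token: str, path: str, groups=GROUPS) -> bool:
    """Select the crawler's group case-insensitively (else '*'), then let
    the longest matching pattern decide; Allow wins a tie, and a path
    that matches no rule is allowed."""
    rules = groups.get(product_token.lower(), groups["*"])
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern_matches(pattern, path):
            continue
        is_allow = directive == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for token, path in [
        ("ExampleBot", "/"),
        ("ExampleBot", "/pricing"),
        ("ExampleBot", "/auth/logout"),
        ("CCBot", "/"),
    ]:
        print(token, path, is_allowed(token, path))
```

Under these assumptions a generic crawler may fetch `/` and `/pricing` but not `/auth/logout`, because the 12-character `Disallow /auth/logout` outranks the 7-character `Allow /auth/*`, while `ccbot` is blocked everywhere.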

Other Records

Field Value
sitemap https://chatgpt.com/sitemap.xml

Comments

  • https://www.robotstxt.org/robotstxt.html
  • General rules for all other bots
  • Place allows first to avoid bots skipping after Disallow: /
  • Allow exactly the homepage
  • Allow the homepage with any query parameters
  • Static Landing Pages
  • Exact locale specific homepages
  • Now block everything else
  • Specific disallows (redundant for some bots, but still useful for those that respect precedence)
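
The ordering comments above distinguish parsers that stop at the first matching rule from parsers that rank rules by precedence. The sketch below contrasts the two strategies on three rules taken from the `*` group, kept in file order. It is illustrative only: `first_match` models a hypothetical order-sensitive parser, `longest_match` follows the longest-pattern rule from RFC 9309, and all function names are assumptions.

```python
import re

# Three rules from the "*" group, in file order (allows before disallows).
RULES = [
    ("allow", "/auth/*"),
    ("disallow", "/"),
    ("disallow", "/auth/logout"),
]


def matches(pattern: str, path: str) -> bool:
    """Prefix match with '*' as a wildcard and a trailing '$' as end anchor."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(map(re.escape, body.split("*")))
    return re.match(regex + (r"\Z" if anchored else ""), path) is not None


def first_match(path: str) -> str:
    """Hypothetical order-sensitive parser: the first matching rule decides."""
    for directive, pattern in RULES:
        if matches(pattern, path):
            return directive
    return "allow"


def longest_match(path: str) -> str:
    """Precedence-aware parser: the longest pattern wins, Allow wins a tie."""
    hits = [(len(p), d == "allow", d) for d, p in RULES if matches(p, path)]
    return max(hits)[2] if hits else "allow"


if __name__ == "__main__":
    for path in ["/auth/callback", "/auth/logout", "/c/abc"]:
        print(path, first_match(path), longest_match(path))
```

For `/auth/logout` the first-match parser stops at `Allow /auth/*` and never reaches the later disallow, whereas the precedence-aware parser picks `Disallow /auth/logout`. This is why the specific disallows only take effect for bots that respect precedence, and why the allows are placed before `Disallow /`.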