Robots Exclusion Standard data for gpf-comics.com

Resource Scan

Scan Details

Site Domain gpf-comics.com
Base Domain gpf-comics.com
Scan Status Ok
Last Scan 2024-10-13T15:53:24+00:00
Next Scan 2024-10-20T15:53:24+00:00

Last Scan

Scanned 2024-10-13T15:53:24+00:00
URL https://gpf-comics.com/robots.txt
Redirect https://www.gpf-comics.com/robots.txt
Redirect Domain www.gpf-comics.com
Redirect Base gpf-comics.com
Domain IPs 172.104.24.93, 2600:3c03::f03c:91ff:fe9c:bc4
Redirect IPs 172.104.24.93, 2600:3c03::f03c:91ff:fe9c:bc4
Response IP 172.104.24.93
Found Yes
Hash 5c6bc520f53b22421bfaabc89d184b68a15dc29c90c69548c05636d90e229e4f
SimHash 6917c9015573
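
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so treat that as an assumption. A minimal sketch of how such a fingerprint could be computed to detect changes between scans (sha256_hex and sample_body are hypothetical names, and the sample bytes are not the real file):

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Illustrative bytes only; NOT the actual robots.txt served by the site.
sample_body = b"User-agent: *\nDisallow: /admin/\n"
digest = sha256_hex(sample_body)
print(digest)
```

Comparing the digest from one scan with the digest from the previous scan is enough to tell whether the file changed at all. It says nothing about how much changed, which is presumably what the separate SimHash field is for.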

Groups

mediapartners-google

Rule Path
Disallow (empty path; nothing is disallowed)

googlebot-image

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

psbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /premium/
Disallow /advertise/
Disallow /admin/
Disallow /comics/
Disallow /hidefcomics/
Disallow /forum/
Disallow /wikix/
Disallow /wiki/Special%3ASearch
Disallow /wiki/Special%3ARandom
Disallow /flashback/
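
To see how the groups above would be applied to a given crawler, here is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT text is an approximate, partial reconstruction from the groups listed in this report, not the verbatim file, and "ExampleCrawler" is a hypothetical user agent standing in for any bot without its own group.

```python
from urllib.robotparser import RobotFileParser

# Approximate, partial reconstruction from the groups in this report.
# The real file's exact text, casing, and ordering may differ.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /premium/
Disallow: /comics/
Disallow: /forum/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleCrawler" is hypothetical; it matches no named group and so
# falls through to the "*" group.
checks = [
    ("Mediapartners-Google", "/comics/"),
    ("GPTBot", "/"),
    ("ExampleCrawler", "/comics/page1.html"),
    ("ExampleCrawler", "/index.html"),
]
for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent:22} {path:22} {verdict}")
```

A crawler with its own group uses only that group, so Mediapartners-Google (empty Disallow) may fetch /comics/ even though the "*" group blocks it for everyone else.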

Other Records

Field Value
sitemap http://www.gpf-comics.com/sitemap_index.xml

Comments

  • /robots.txt file for http://www.gpf-comics.com/
  • mail jeff@gpf-comics.com for constructive criticism
  • Allow Google AdSense to crawl everywhere:
  • Disallow Google Image searches:
  • Disallow Google generative AI crawlers:
  • Disallow Picsearch searches:
  • Disallow Majestic-12, a known suspicious bot:
  • Disallow Baidu, which has become abusive with many invalid requests:
  • Disallow OpenAI's GPTBot:
  • Disallow Common Crawl's AI scraper bot:
  • Disallow Apple's AI scraper bot:
  • Block everyone else from select parts of the site:
  • Define our sitemap file: