xerpihan.com
robots.txt

Robots Exclusion Standard data for xerpihan.com

Resource Scan

Scan Details

Site Domain xerpihan.com
Base Domain xerpihan.com
Scan Status Ok
Last Scan 2025-11-12T01:48:45+00:00
Next Scan 2025-11-19T01:48:45+00:00

Last Scan

Scanned 2025-11-12T01:48:45+00:00
URL https://xerpihan.com/robots.txt
Redirect https://www.xerpihan.com/robots.txt
Redirect Domain www.xerpihan.com
Redirect Base xerpihan.com
Domain IPs 216.150.1.1
Redirect IPs 216.150.1.1
Response IP 216.150.1.1
Found Yes
Hash e545bb569a49b84bbbd7bcc614a6f6f6384fd1d85611cd8483c3f443237961b6
SimHash 4cbb58f0e546
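
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The sketch below shows how such a fingerprint could be used to detect a changed robots.txt between scans. It assumes the scanner hashes the raw response bytes with SHA-256, which the report does not confirm. The function name body_fingerprint and the sample bodies are hypothetical, and the SimHash value is not reproduced here.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Made-up bodies for illustration only; these are not the real file contents.
previous_body = b"User-agent: *\nDisallow: /api/\n"
current_body = b"User-agent: *\nDisallow: /api/\nDisallow: /tmp/\n"

# A differing digest signals that the file changed since the last scan.
changed = body_fingerprint(previous_body) != body_fingerprint(current_body)
print("changed since last scan:", changed)
```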

Groups

*

Rule Path
Allow /
Allow /blog
Allow /blog/*
Disallow /api/
Disallow /dashboard/
Disallow /admin/
Disallow /_next/
Disallow /tmp/
Disallow /functions/
Disallow /node_modules/
Disallow *.json$
Disallow /web-xerpihan/
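
As a rough illustration of how the rules in this group interact, the sketch below applies the precedence described in RFC 9309: the longest matching path wins, and Allow wins a tie. The helper names pattern_to_regex and is_allowed are hypothetical, and individual crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the "*" group above, as (rule, path) pairs.
STAR_GROUP = [
    ("allow", "/"),
    ("allow", "/blog"),
    ("allow", "/blog/*"),
    ("disallow", "/api/"),
    ("disallow", "/dashboard/"),
    ("disallow", "/admin/"),
    ("disallow", "/_next/"),
    ("disallow", "/tmp/"),
    ("disallow", "/functions/"),
    ("disallow", "/node_modules/"),
    ("disallow", "*.json$"),
    ("disallow", "/web-xerpihan/"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path: '*' matches any run, a trailing '$' anchors the end."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, url_path):
    """Longest matching rule wins; Allow wins ties; no match means allowed."""
    best_len, best_allow = -1, True
    for kind, path in rules:
        # re.match anchors at the start, so plain paths behave as prefixes.
        if pattern_to_regex(path).match(url_path):
            allow = kind == "allow"
            if len(path) > best_len or (len(path) == best_len and allow):
                best_len, best_allow = len(path), allow
    return best_allow


print(is_allowed(STAR_GROUP, "/blog/post"))           # True
print(is_allowed(STAR_GROUP, "/api/users"))           # False
print(is_allowed(STAR_GROUP, "/data/manifest.json"))  # False
```

Under this reading, the broad "Allow /" does not override the more specific Disallow entries, because each of them is a longer match.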

Other Records

Field Value
crawl-delay 1
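
The crawl-delay field is not part of RFC 9309, and support varies by crawler. The sketch below shows one way a crawler could honor it, assuming the value means a minimum of 1 second between requests (the common reading, but an assumption here). The Throttle class is hypothetical. The clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least `delay` seconds have passed since the last call."""
        if self.last is not None:
            remaining = self.delay - (self.clock() - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()


# Usage: call wait() before each request to the site.
throttle = Throttle(delay_seconds=1)  # value taken from "crawl-delay 1"
throttle.wait()  # the first call returns immediately
```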

googlebot

Rule Path
Allow /
Allow /blog
Disallow /api/
Disallow /dashboard/
Disallow /admin/

bingbot

Rule Path
Allow /
Allow /blog
Disallow /api/
Disallow /dashboard/
Disallow /admin/

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /
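
Under RFC 9309, a crawler obeys only the group that names it and falls back to the "*" group when none does. A named group is not merged with "*". The sketch below illustrates that selection using the Disallow paths from the groups listed above. The function select_group is hypothetical, and exact case-insensitive token matching is a simplification of what real crawlers do.

```python
# Disallow paths per group, copied from the groups listed above.
COMMON = ["/api/", "/dashboard/", "/admin/"]
GROUPS = {
    "*": COMMON + ["/_next/", "/tmp/", "/functions/", "/node_modules/",
                   "*.json$", "/web-xerpihan/"],
    "googlebot": list(COMMON),
    "bingbot": list(COMMON),
    "gptbot": ["/"],
    "chatgpt-user": ["/"],
    "ccbot": ["/"],
    "anthropic-ai": ["/"],
    "claude-web": ["/"],
}


def select_group(groups, product_token):
    """Return the group name that applies to a crawler's product token."""
    token = product_token.lower()
    return token if token in groups else "*"


for bot in ("Googlebot", "GPTBot", "ExampleBot"):
    name = select_group(GROUPS, bot)
    print(bot, "->", name, GROUPS[name])
```

One consequence of this reading is that googlebot and bingbot are not subject to the /_next/, /tmp/, or *.json$ entries, or to the crawl-delay record, since those appear only in the "*" group.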

Other Records

Field Value
sitemap https://www.xerpihan.com/sitemap.xml

Comments

  • Crawl-delay for better server performance
  • Sitemap location for international users
  • Specific rules for major search engines
  • Block AI training crawlers (2025 best practice)