khalsaaid.org
robots.txt

Robots Exclusion Standard data for khalsaaid.org

Resource Scan

Scan Details

Site Domain khalsaaid.org
Base Domain khalsaaid.org
Scan Status Ok
Last Scan 2025-11-04T12:43:43+00:00
Next Scan 2025-12-04T12:43:43+00:00

Last Scan

Scanned 2025-11-04T12:43:43+00:00
URL https://khalsaaid.org/robots.txt
Domain IPs 2600:9000:2721:1800:b:29ff:fc00:93a1, 2600:9000:2721:4000:b:29ff:fc00:93a1, 2600:9000:2721:4200:b:29ff:fc00:93a1, 2600:9000:2721:4800:b:29ff:fc00:93a1, 2600:9000:2721:6000:b:29ff:fc00:93a1, 2600:9000:2721:7200:b:29ff:fc00:93a1, 2600:9000:2721:9200:b:29ff:fc00:93a1, 2600:9000:2721:f600:b:29ff:fc00:93a1, 3.165.102.14, 3.165.102.66, 3.165.102.74, 3.165.102.93
Response IP 3.165.102.74
Found Yes
Hash 6495e0536a8888a370e05fcf6b57b82e08e581c75f830e66083d09ad33e68272
SimHash 84151b5151c3

Groups

*

Rule Path
Allow /

adsbot-google

Rule Path
Allow /
Disallow /*.js$
Disallow /*.ts$
Disallow /*.jsx$
Disallow /*.tsx$
Disallow /*.json
Disallow /analytics.json
Disallow /graphql
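
The adsbot-google rules above mix a blanket Allow with wildcard Disallow patterns, so it may help to see how a crawler could resolve them. The following is a minimal sketch, not the crawler's actual implementation. It assumes the matching semantics described in RFC 9309 and Google's robots.txt documentation: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The names `ADSBOT_RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and percent-encoding normalization is left out.

```python
import re

# Rules for the adsbot-google group, copied from the scan above.
ADSBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/*.js$"),
    ("disallow", "/*.ts$"),
    ("disallow", "/*.jsx$"),
    ("disallow", "/*.tsx$"),
    ("disallow", "/*.json"),
    ("disallow", "/analytics.json"),
    ("disallow", "/graphql"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' is a wildcard and a trailing '$' anchors the end;
    without '$' the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(path, rules):
    """Return True if the longest matching rule allows the path.

    Paths that match no rule are treated as allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on equal length, Allow takes precedence.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for path in ["/about", "/static/app.js", "/static/app.js?v=2",
                 "/data/config.json?x=1", "/graphql/query"]:
        print(path, is_allowed(path, ADSBOT_RULES))
```

Under these assumptions, `/*.js$` blocks only URLs that end in `.js`, so a script URL with a query string would still be crawlable, whereas `/*.json` has no end anchor and also covers `.json` URLs with a query string. The separate `/analytics.json` rule is already covered by `/*.json`.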

Other Records

Field Value
sitemap https://khalsaaid.org/sitemap.xml

Comments

  • https://www.robotstxt.org/robotstxt.html
  • Google adsbot ignores robots.txt unless specifically named!
  • Stop JS and JSON from being crawled