Robots Exclusion Standard data for cit.org.in

Resource Scan

Scan Details

Site Domain cit.org.in
Base Domain cit.org.in
Scan Status Ok
Last Scan 2026-01-25T21:59:04+00:00
Next Scan 2026-02-01T21:59:04+00:00
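
The gap between the Last Scan and Next Scan timestamps above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative, and treating the gap as a fixed rescan interval is an assumption the report does not state.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details block (ISO 8601, UTC offset).
last_scan = datetime.fromisoformat("2026-01-25T21:59:04+00:00")
next_scan = datetime.fromisoformat("2026-02-01T21:59:04+00:00")

# Difference between the two scans as a timedelta.
interval = next_scan - last_scan
print(interval)  # 7 days, 0:00:00
```

For this scan the two timestamps are exactly one week apart.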

Last Scan

Scanned 2026-01-25T21:59:04+00:00
URL https://www.cit.org.in/robots.txt
Domain IPs 151.101.131.52, 151.101.195.52, 151.101.3.52, 151.101.67.52, 2a04:4e42:200::820, 2a04:4e42:400::820, 2a04:4e42:600::820
Response IP 151.101.195.52
Found Yes
Hash dd6e6b07feaa0bbc899fbfde48d31d7d94a829739d188cdd9a8da632528a6f2b
SimHash 481059706971
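
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not say which algorithm is used or exactly which bytes are hashed, so the following is a minimal sketch under the assumption that it is SHA-256 over the raw response body. The helper name `robots_digest` and the sample body are hypothetical.

```python
import hashlib

def robots_digest(body: bytes) -> str:
    """Return a hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical sample body, not the real file served by the site.
sample = b"User-agent: *\nAllow: /\n"
digest = robots_digest(sample)

# Hash value as listed in the report, for a length comparison only.
reported = "dd6e6b07feaa0bbc899fbfde48d31d7d94a829739d188cdd9a8da632528a6f2b"
print(len(digest), len(reported))  # 64 64
```

Comparing a freshly computed digest against the stored one is a cheap way to detect that the file changed between scans, if the assumption about the algorithm holds.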

Groups

*

Rule Path
Allow /
Disallow /api/
Disallow /_next/
Disallow /admin/
Disallow /error/
Disallow /product/edit/
Disallow /shopdetail/
Disallow /wp-admin/
Disallow /wp-content/
Disallow /wp-includes/
Disallow /manager/
Disallow /phpmyadmin/
Disallow /config/
Disallow /backup/
Disallow /old/
Disallow /temp/
Disallow /tmp/
Disallow /test/
Disallow /debug/
Disallow /*.json
Disallow /*?*utm_*
Disallow /*?*session*
Disallow /*?*sid=*
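
Because this group combines `Allow /` with many narrower `Disallow` paths, the outcome for a given URL depends on rule precedence. The sketch below follows the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and `Allow` wins a tie), with `*` treated as a wildcard and a trailing `$` as an end anchor. Individual crawlers may behave differently. The function names and the `SAMPLE_RULES` subset are illustrative, not part of the scan.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Apply longest-match precedence; no matching rule means allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as prefix rules require.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, best_allow = length, kind == "allow"
    return best_allow

# A subset of the '*' group listed above.
SAMPLE_RULES = [
    ("allow", "/"),
    ("disallow", "/api/"),
    ("disallow", "/*.json"),
    ("disallow", "/*?*utm_*"),
]

print(is_allowed(SAMPLE_RULES, "/services/"))           # True
print(is_allowed(SAMPLE_RULES, "/api/v1/users"))        # False
print(is_allowed(SAMPLE_RULES, "/data/feed.json"))      # False
print(is_allowed(SAMPLE_RULES, "/blog/?utm_source=x"))  # False
```

Note that a first-match parser would stop at `Allow /` and permit every path, which is why the precedence model matters for a group ordered like this one.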

googlebot

Rule Path
Allow /
Allow /en/
Allow /hi/
Allow /es/
Allow /ar/
Allow /zh/
Allow /fr/
Allow /pt-BR/
Allow /services/
Allow /blog/
Allow /locations/
Allow /about/
Allow /contact/
Disallow /api/
Disallow /_next/
Disallow /admin/
Disallow /error/

bingbot

Rule Path
Allow /
Allow /en/
Allow /hi/
Allow /es/
Allow /ar/
Allow /zh/
Allow /fr/
Allow /pt-BR/
Disallow /api/
Disallow /_next/
Disallow /admin/

gptbot
chatgpt-user
ccbot
anthropic-ai
claude-web
google-extended
cohere-ai
omgilibot
bytespider

Rule Path
Disallow /
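
The groups above are keyed by user-agent token, so a crawler first has to pick the group that applies to it. The following is a minimal sketch of that selection, assuming case-insensitive matching on the product token with a fallback to the `*` group, as RFC 9309 describes. It simplifies by returning the first matching group rather than merging several. The `GROUPS` data is abbreviated from the report, and `select_group` is a hypothetical helper.

```python
def select_group(groups, product_token: str):
    """Return the rules of the group naming this crawler, else the '*' group."""
    token = product_token.lower()
    for agents, rules in groups:
        if token in (agent.lower() for agent in agents):
            return rules
    # No specific group matched: fall back to the wildcard group, if any.
    for agents, rules in groups:
        if "*" in agents:
            return rules
    return []

# Abbreviated from the report; each entry is (user agents, rules).
GROUPS = [
    (["*"], [("allow", "/"), ("disallow", "/api/")]),
    (["googlebot"], [("allow", "/"), ("disallow", "/api/")]),
    (["gptbot", "chatgpt-user", "ccbot", "anthropic-ai", "claude-web",
      "google-extended", "cohere-ai", "omgilibot", "bytespider"],
     [("disallow", "/")]),
]

print(select_group(GROUPS, "GPTBot"))       # [('disallow', '/')]
print(select_group(GROUPS, "DuckDuckBot"))  # falls back to the '*' rules
```

Under this model the nine listed crawlers are blocked from the entire site, while any crawler not named in a group inherits the `*` rules.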

Other Records

Field Value
sitemap https://www.cit.org.in/sitemap-index.xml

Warnings

  • `host` is not a known field.