cniitmash.com
robots.txt

Robots Exclusion Standard data for cniitmash.com

Resource Scan

Scan Details

Site Domain cniitmash.com
Base Domain cniitmash.com
Scan Status Ok
Last Scan 2024-11-09T10:02:51+00:00
Next Scan 2024-12-09T10:02:51+00:00

Last Scan

Scanned 2024-11-09T10:02:51+00:00
URL https://cniitmash.com/robots.txt
Domain IPs 2a00:15f8:a000:5:1:11:5:f613, 2a00:15f8:a000:5:1:12:5:f613, 2a00:15f8:a000:5:1:13:5:f613, 2a00:15f8:a000:5:1:14:5:f613, 90.156.201.111, 90.156.201.13, 90.156.201.34, 90.156.201.56
Response IP 90.156.201.111
Found Yes
Hash 45653574abf6df6668aac666ba474fab409293d23f5933a2c84f46a7267ebfd5
SimHash 421889439f93

Groups

gptbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

*

Rule Path
Disallow /

yandexwebmaster

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /assets/components
Disallow /connectors
Disallow /core
Disallow /manager
Allow /assets/images/
Allow /assets/files/
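
The effect of these groups on a given crawler can be checked with a short script. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a hypothetical reconstruction of three of the groups listed above (`gptbot`, `*`, `yandexbot`), not the verbatim file, and `allowed` and `examplebot` are illustrative names. Python's parser applies the first matching rule in a group rather than the longest match, which gives the same answers for the paths tested here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction from the groups in this report; the real
# file's exact text, casing, and ordering may differ.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /

User-agent: YandexBot
Disallow: /assets/components
Disallow: /connectors
Disallow: /core
Disallow: /manager
Allow: /assets/images/
Allow: /assets/files/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://cniitmash.com" + path)


for agent, path in [
    ("gptbot", "/"),
    ("examplebot", "/"),  # hypothetical crawler, falls back to the * group
    ("yandexbot", "/manager/"),
    ("yandexbot", "/assets/images/logo.png"),
]:
    print(agent, path, allowed(agent, path))
```

Under these assumptions, only `yandexbot` is permitted to fetch anything: the named AI crawlers and the `*` fallback are blocked from the whole site, while `yandexbot` is blocked only from the listed directories.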

Other Records

Field Value
sitemap https://cniitmash.com/sitemap.xml

Warnings

  • `host` is not a known field.