guruin.cn
robots.txt

Robots Exclusion Standard data for guruin.cn

Resource Scan

Scan Details

Site Domain guruin.cn
Base Domain guruin.cn
Scan Status Ok
Last Scan 2024-11-15T08:56:33+00:00
Next Scan 2024-11-22T08:56:33+00:00

Last Scan

Scanned 2024-11-15T08:56:33+00:00
URL http://www.guruin.cn/robots.txt
Domain IPs 47.96.153.18
Response IP 47.96.153.18
Found Yes
Hash 495ebb6c7f04bf7105502272d05c650e143ece2718ef51b47a43c5df837696cf
SimHash aa0d8d97f740

Groups

googlebot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /
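
The four named groups above each carry a single Disallow / rule, which closes the whole site to those crawlers. A minimal sketch with Python's standard urllib.robotparser can confirm that reading. The robots.txt text in it is rebuilt from the scan tables rather than copied from the live file, so its exact layout is an assumption. The * group is shortened to a single Allow: / for brevity, and ExampleBot is a hypothetical crawler name used only for contrast.

```python
from urllib.robotparser import RobotFileParser

# Rebuilt from the scan tables; the live file's layout may differ (assumption).
# The `*` group is shortened to "Allow: /" for this demo.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: semrushbot
Disallow: /

User-agent: semrushbot-sa
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url):
    """Return the agents from `agents` that may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# "ExampleBot" is a hypothetical crawler that falls through to the `*` group.
AGENTS = ["Googlebot", "AhrefsBot", "SemrushBot", "SemrushBot-SA", "ExampleBot"]
print(blocked_agents(ROBOTS_TXT, AGENTS, "https://www.guruin.cn/"))
```

Under these assumptions the four named crawlers are reported as blocked, while any other agent falls through to the * group.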

*

Rule Path
Disallow /401.html
Disallow /403.html
Disallow /404.html
Disallow /422.html
Disallow /500.html
Disallow /online.html
Disallow /offline.html
Disallow /error.html
Disallow /*.png$
Disallow /*.jpg$
Disallow /*.gif$
Disallow /cdn-cgi/
Disallow /open/
Disallow /api/
Disallow /embed/
Disallow /qrcode
Disallow /bdmail
Disallow /csmail
Disallow /db/attachments/
Allow /
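
The * group mixes plain path prefixes with patterns that use * and a trailing $, and it ends with a catch-all Allow /. Python's urllib.robotparser treats those wildcard characters literally, so the sketch below hand-rolls the matching instead. It assumes the precedence described in RFC 9309: the longest matching pattern wins, and Allow wins a tie. Only a subset of the rules is included, and the function names are illustrative, not part of any library.

```python
import re

# Subset of the `*` group from the scan above, as (rule, path pattern) pairs.
STAR_RULES = [
    ("disallow", "/404.html"),
    ("disallow", "/*.png$"),
    ("disallow", "/api/"),
    ("disallow", "/qrcode"),
    ("allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt pattern: `*` is any run of characters,
    a trailing `$` anchors the match at the end of the path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=STAR_RULES):
    """Longest matching pattern wins; on a tie, allow wins.
    A path that matches no rule is allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


for path in ["/about", "/404.html", "/img/logo.png",
             "/img/logo.png?v=2", "/api/v1/items", "/qrcode/abc"]:
    print(path, is_allowed(path))
```

In this sketch /img/logo.png is blocked but /img/logo.png?v=2 is not, because the $ anchor requires the path to end in .png. Ordinary pages such as /about match only the final Allow / and stay crawlable.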

Other Records

Field Value
sitemap https://www.guruin.cn/system/cn/sitemap.xml.gz
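
The sitemap record points at a gzip-compressed XML file. As a rough sketch, such a file could be read with Python's standard gzip and xml.etree.ElementTree modules. The scan does not include the sitemap's contents, so the sample document below is hypothetical, as is the /example-page URL in it. The helper collects every loc element, so it would also cope with a sitemap index.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(gz_bytes):
    """Decompress a gzipped sitemap (or sitemap index) and return its <loc> values."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


# Hypothetical sample standing in for the real file, whose contents
# are not part of the scan report.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.guruin.cn/</loc></url>
  <url><loc>https://www.guruin.cn/example-page</loc></url>
</urlset>"""

print(sitemap_locs(gzip.compress(SAMPLE_XML)))
```

To use it on the real file, the bytes fetched from the sitemap URL would be passed in place of the compressed sample.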

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines: