cailianpress.com
robots.txt

Robots Exclusion Standard data for cailianpress.com

Resource Scan

Scan Details

Site Domain cailianpress.com
Base Domain cailianpress.com
Scan Status Ok
Last Scan 2024-10-26T09:33:53+00:00
Next Scan 2024-11-25T09:33:53+00:00

Last Scan

Scanned 2024-10-26T09:33:53+00:00
URL http://cailianpress.com/robots.txt
Redirect https://www.cls.cn/robots.txt
Redirect Domain www.cls.cn
Redirect Base cls.cn
Domain IPs 140.207.177.207, 180.163.28.105
Redirect IPs 103.143.19.17, 240e:940:e009:1e0::18f
Response IP 103.143.19.17
Found Yes
Hash fb65f0409dcd8486d15dc87eb42220657fb6c102dd6ff40d12c913aaf543fceb
SimHash a838d4c00990
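
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not say how it is computed, so the following is only a sketch under the assumption that it is SHA-256 over the raw bytes of the fetched robots.txt. The function name robots_fingerprint and the sample body are hypothetical.

```python
import hashlib

def robots_fingerprint(body: bytes) -> str:
    """Return a hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes; the
    report itself does not document the algorithm or the input.
    """
    return hashlib.sha256(body).hexdigest()

# Hypothetical sample body, not the real file served by www.cls.cn.
sample = b"User-agent: *\nDisallow: /static/\n"
digest = robots_fingerprint(sample)
print(len(digest), digest)
```

Comparing such a digest between scans would show whether the file changed byte for byte. The shorter SimHash value is presumably a similarity fingerprint, which this sketch does not attempt to reproduce.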

Groups

*

Rule Path
Disallow /hwwebscan_verify.html
Disallow /static/
Disallow .jpg$
Disallow .jpeg$
Disallow .gif$
Disallow .png$
Disallow .bmp$
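
How the rules above apply to a given URL path can be sketched as follows. This is a minimal illustration, assuming the common interpretation in which a pattern is matched from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end of the path. Individual crawlers may interpret patterns differently, and the helper names rule_to_regex and is_disallowed are hypothetical. Under this interpretation the image rules, which begin with neither `/` nor `*`, match no real path; a form such as `/*.jpg$` is shown for comparison and is not part of the scanned file.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

# Disallow rules of the '*' group, copied from the listing above.
DISALLOW = [
    "/hwwebscan_verify.html",
    "/static/",
    ".jpg$", ".jpeg$", ".gif$", ".png$", ".bmp$",
]

def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the URL path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

print(is_disallowed("/static/app.js"))                # matches /static/
print(is_disallowed("/images/logo.jpg"))              # '.jpg$' does not match from the path start
print(is_disallowed("/images/logo.jpg", ["/*.jpg$"])) # hypothetical wildcard form
```

The sketch checks only Disallow rules, since the group lists no Allow rules; a fuller matcher would also weigh Allow rules by longest match.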

Other Records

Field Value
sitemap https://cls.cn/map.xml

Comments

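  • User-agent:* // specifies which spider the rules apply to; '*' stands for all search engines
  • Disallow: / (forbids spiders from crawling all directories of the site; "/" denotes the root directory)
  • Allow: (defines the pages or subdirectories that spiders are allowed to crawl)
  • Sitemap: tells spiders where the XML sitemap is located.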

Warnings

  • 1 invalid line.