archives.com
robots.txt

Robots Exclusion Standard data for archives.com

Resource Scan

Scan Details

Site Domain archives.com
Base Domain archives.com
Scan Status Ok
Last Scan 2024-11-16T10:31:42+00:00
Next Scan 2024-11-23T10:31:42+00:00

Last Scan

Scanned 2024-11-16T10:31:42+00:00
URL https://archives.com/robots.txt
Redirect https://www.archives.com/robots.txt
Redirect Domain www.archives.com
Redirect Base archives.com
Domain IPs 104.18.0.50, 104.18.1.50
Redirect IPs 104.18.33.62, 172.64.154.194
Response IP 172.64.154.194
Found Yes
Hash 0c73897a9e70287aa21ab1bbf245f336f448547b3d6b370ccfa3623b9e61a50a
SimHash a159d8066373
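
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the scan does not state which algorithm was used or which bytes were hashed. As a minimal sketch under that assumption, a fetched robots.txt body could be compared against the reported value as follows. The function names `sha256_hex` and `matches_reported` are illustrative, not part of any scanner API.

```python
import hashlib

# Hash value reported in the scan above.
REPORTED_HASH = "0c73897a9e70287aa21ab1bbf245f336f448547b3d6b370ccfa3623b9e61a50a"


def sha256_hex(body: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the raw response body."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Check a body against the reported hash.

    Assumption: the scanner hashed the unmodified response bytes with SHA-256.
    If it normalized the content first, this comparison will not match.
    """
    return sha256_hex(body) == reported.lower()
```

A mismatch would not by itself show that the file changed, since the hashing input is an assumption here.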

Groups

*

Rule Path
Disallow /terms/2
Disallow /terms/1

mozilla/4.0 (compatible; msie 7.0; windows nt 5.1; sv1)

Rule Path
Disallow /

mozilla/4.0 (compatible; msie 8.0; windows nt 6.1; trident/4.0; .net clr 1.1.4322; .net4.0c; .net4.0e; .net clr 2.0.50727; .net clr 3.0.4506.2152; .net clr 3.5.30729)

Rule Path
Disallow /

mozilla/5.0 (compatible; baiduspider/2.0; +http://www.baidu.com/search/spider.html)

Rule Path
Disallow /

mozilla/5.0 (windows nt 10.0; wow64) applewebkit/537.36 (khtml, like gecko) chrome/51.0.2704.103 safari/537.36

Rule Path
Disallow /

mozilla/5.0 (macintosh; intel mac os x 10_11_6) applewebkit/601.7.7 (khtml, like gecko) version/9.1.2 safari/601.7.7

Rule Path
Disallow /
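
The groups above can be read as a lookup from user-agent string to disallowed path prefixes. The sketch below is one hedged way to evaluate them. It assumes a group applies only when the full user-agent string matches exactly (case-insensitively) and that every other agent falls back to the `*` group. Real crawlers usually match on a product token rather than the full string, so this is a simplification, and `is_disallowed` is a hypothetical helper name.

```python
# Groups as listed in the scan above (user-agent strings are lowercased there).
GROUPS = {
    "*": ["/terms/2", "/terms/1"],
    "mozilla/4.0 (compatible; msie 7.0; windows nt 5.1; sv1)": ["/"],
    "mozilla/4.0 (compatible; msie 8.0; windows nt 6.1; trident/4.0; "
    ".net clr 1.1.4322; .net4.0c; .net4.0e; .net clr 2.0.50727; "
    ".net clr 3.0.4506.2152; .net clr 3.5.30729)": ["/"],
    "mozilla/5.0 (compatible; baiduspider/2.0; "
    "+http://www.baidu.com/search/spider.html)": ["/"],
    "mozilla/5.0 (windows nt 10.0; wow64) applewebkit/537.36 "
    "(khtml, like gecko) chrome/51.0.2704.103 safari/537.36": ["/"],
    "mozilla/5.0 (macintosh; intel mac os x 10_11_6) applewebkit/601.7.7 "
    "(khtml, like gecko) version/9.1.2 safari/601.7.7": ["/"],
}


def is_disallowed(user_agent: str, path: str, groups=GROUPS) -> bool:
    """Return True if `path` starts with a disallowed prefix for `user_agent`.

    Assumption: exact, case-insensitive match on the full user-agent string,
    with "*" as the fallback group.
    """
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    return any(path.startswith(prefix) for prefix in rules)
```

Under these assumptions, an unlisted agent is blocked only from paths beginning with `/terms/1` or `/terms/2`, while each listed agent string is blocked from the whole site.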