Robots Exclusion Standard data for theword.net

Resource Scan

Scan Details

Site Domain theword.net
Base Domain theword.net
Scan Status Ok
Last Scan 5/2/2025, 4:13:05 AM
Next Scan 6/1/2025, 4:13:05 AM

Last Scan

Scanned 5/2/2025, 4:13:05 AM
URL https://theword.net/robots.txt
Domain IPs 104.21.66.247, 172.67.166.136, 2606:4700:3030::ac43:a688, 2606:4700:3032::6815:42f7
Response IP 172.67.166.136
Found Yes
Hash 52aa0951fb8ec626574e792ae1b8cf10c75dd858a5793739b60bb5ef9d603304
SimHash b83ed86bc7ca

Groups

ia_archiver

Rule Path
Disallow /

*

Rule Path
Disallow /bin/

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

dotbot

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /tmp/

youdaobot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

linkscrawler

Rule Path
Disallow /

easouspider

Rule Path
Disallow /
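The report lists three separate groups for the * user agent. Under RFC 9309, groups that match the same user agent are combined into one, so a generic crawler would be subject to the union of their Disallow paths. The following is a minimal sketch of that merge, not the scanner's own logic: the GROUPS list is transcribed by hand from the report above, merge_groups and is_allowed are hypothetical helper names, and user-agent matching is simplified to a case-insensitive exact match with a fallback to *. Only Disallow rules appear in the report, so plain prefix matching is used and Allow precedence is not modeled.

```python
from collections import defaultdict

# Groups transcribed from the scan report, in report order.
# Each entry is (user agent, list of Disallow path prefixes).
GROUPS = [
    ("ia_archiver", ["/"]),
    ("*", ["/bin/"]),
    ("*", []),  # group with no rules; the report shows only crawl-delay 60 here
    ("dotbot", ["/"]),
    ("*", ["/cgi-bin/", "/tmp/"]),
    ("youdaobot", ["/"]),
    ("sogou spider", ["/"]),
    ("yisouspider", ["/"]),
    ("linkscrawler", ["/"]),
    ("easouspider", ["/"]),
]


def merge_groups(groups):
    """Combine the Disallow paths of all groups that share a user agent."""
    merged = defaultdict(list)
    for agent, paths in groups:
        merged[agent.lower()].extend(paths)
    return dict(merged)


def is_allowed(merged, agent, path):
    """Return True if no Disallow prefix for this agent matches the path.

    Falls back to the "*" group when the agent has no group of its own.
    """
    rules = merged.get(agent.lower(), merged.get("*", []))
    return not any(path.startswith(prefix) for prefix in rules)


if __name__ == "__main__":
    merged = merge_groups(GROUPS)
    print(merged["*"])
    print(is_allowed(merged, "dotbot", "/index.html"))
    print(is_allowed(merged, "examplebot", "/tmp/cache"))
```

Here "examplebot" is a made-up crawler name used to exercise the * fallback. Some parsers honor only the first * group they encounter instead of merging, so results from a specific library may differ from this sketch.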

Warnings

  • 2 invalid lines.