twisave.com
robots.txt

Robots Exclusion Standard data for twisave.com

Resource Scan

Scan Details

Site Domain twisave.com
Base Domain twisave.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Request timed out.
Last Scan 2024-11-08T04:55:43+00:00
Next Scan 2025-02-06T04:55:43+00:00

Last Successful Scan

Scanned 2024-04-13T03:35:01+00:00
URL https://twisave.com/robots.txt
Domain IPs 133.130.118.187
Response IP 133.130.118.187
Found Yes
Hash e15dcfc54ff6c4777c9857aeae3efd5585c40e27b6dc3a181baa1355dbf855fe
SimHash 228e4e2d27e0

Groups

mj12bot

Rule Path
Disallow /

riddler

Rule Path
Disallow /

mlbot

Rule Path
Disallow /

just-crawler

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

*

Rule Path
Disallow /auth/
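
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from the groups listed here, not the original file: the ordering, the blank lines, and the placement of `crawl-delay` inside the bingbot group are assumptions, and `SomeBot` is a hypothetical agent name used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's "Groups" listing; the real file's
# layout and the position of the crawl-delay record are assumptions.
ROBOTS_TXT = """\
User-agent: mj12bot
Disallow: /

User-agent: riddler
Disallow: /

User-agent: mlbot
Disallow: /

User-agent: just-crawler
Disallow: /

User-agent: baiduspider
Disallow: /

User-agent: grapeshot
Disallow: /

User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow: /auth/
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())

# "SomeBot" is a hypothetical crawler that matches only the "*" group.
for agent, path in [
    ("mj12bot", "/"),
    ("bingbot", "/auth/login"),
    ("SomeBot", "/auth/login"),
    ("SomeBot", "/index.html"),
]:
    print(agent, path, parser.can_fetch(agent, path))

print("bingbot crawl delay:", parser.crawl_delay("bingbot"))
```

Under this parser a crawler is matched against its own named group first, so bingbot is not subject to the `/auth/` rule in the `*` group; only agents without a named group fall through to it.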

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-Agent: *
  • Disallow: /