waste.org
robots.txt

Robots Exclusion Standard data for waste.org

Resource Scan

Scan Details

Site Domain waste.org
Base Domain waste.org
Scan Status Ok
Last Scan 2025-10-13T19:24:04+00:00
Next Scan 2025-11-12T19:24:04+00:00

Last Scan

Scanned 2025-10-13T19:24:04+00:00
URL https://waste.org/robots.txt
Domain IPs 172.104.6.113, 2600:3c03::f03c:91ff:fe8a:8b1e
Response IP 172.104.6.113
Found Yes
Hash 36b271e54947b26a4d7a92adbb56acc158b00ce664a99b829639ad530b499e0e
SimHash f151d6e04195

Groups

*

Rule Path
Disallow /pub
Disallow /cgi-bin
Disallow /users
Disallow /local
Disallow /mail
Disallow /regveg
Disallow /sci-veg
Disallow /~oxymoron/art
Disallow /~camera/photos
Disallow /~beetle/photos
Disallow /~velouria/photos
Disallow /~velouria/prvphotos
Disallow /sensor/photos
Disallow /~oxymoron/photos
Disallow /~unbroken/photos
Disallow /~apricot/photos
Disallow /~
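
To illustrate how a crawler might apply the rules listed above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from the table, so its exact layout is an assumption, and the agent name "ExampleBot" and the helper names are hypothetical. Rules are matched as path prefixes, which means the final /~ rule already covers every more specific /~user/... entry in the list.

```python
from urllib.robotparser import RobotFileParser

# Paths copied from the rule table above. The original file's exact layout
# (comments, blank lines, casing) is not shown in the scan, so this
# reconstruction is an assumption.
DISALLOWED = [
    "/pub", "/cgi-bin", "/users", "/local", "/mail", "/regveg", "/sci-veg",
    "/~oxymoron/art", "/~camera/photos", "/~beetle/photos",
    "/~velouria/photos", "/~velouria/prvphotos", "/sensor/photos",
    "/~oxymoron/photos", "/~unbroken/photos", "/~apricot/photos", "/~",
]


def build_parser(disallowed):
    """Build a parser for a single '*' group with the given Disallow paths."""
    lines = ["User-agent: *"] + [f"Disallow: {path}" for path in disallowed]
    parser = RobotFileParser()
    parser.parse(lines)  # parse from memory; no network request is made
    return parser


PARSER = build_parser(DISALLOWED)


def allowed(path, agent="ExampleBot"):
    """Return True if the hypothetical agent may fetch the given path."""
    return PARSER.can_fetch(agent, "https://waste.org" + path)


if __name__ == "__main__":
    for path in ["/", "/pub/file.txt", "/~anyone/index.html",
                 "/sensor/", "/sensor/photos/1.jpg"]:
        print(path, allowed(path))
```

With these rules the site root and /sensor/ remain fetchable, while anything under /pub, /sensor/photos, or any /~ home directory is refused.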

Other Records

Field Value
crawl-delay 100
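
The crawl-delay field is a non-standard extension that some crawlers read as a minimum number of seconds between requests. As a rough sketch of how a client could honor the value of 100, the example below reads it with Python's urllib.robotparser and computes a request schedule. Placing the record inside the * group is an assumption (the scan lists it separately under Other Records), and "ExampleBot" and fetch_offsets are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumed placement: crawl-delay inside the '*' group. The scan reports it
# under "Other Records", so its real position in the file is unknown.
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 100",
    "Disallow: /pub",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse from memory; no network request is made

# crawl_delay() returns the integer delay for the agent, or None if unset.
delay = parser.crawl_delay("ExampleBot")


def fetch_offsets(n_requests, delay_seconds):
    """Seconds after the first request at which each request may be sent."""
    return [i * delay_seconds for i in range(n_requests)]


if __name__ == "__main__":
    print(delay)                    # 100
    print(fetch_offsets(3, delay))  # [0, 100, 200]
```

At this rate a crawler that respects the field would make at most 864 requests per day (86400 seconds / 100).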