worldwidearchive.org
robots.txt
Robots Exclusion Standard data for worldwidearchive.org
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | worldwidearchive.org |
Base Domain | worldwidearchive.org |
Scan Status | Ok |
Last Scan | 2025-08-19T06:01:00+00:00 |
Next Scan | 2025-08-26T06:01:00+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-08-19T06:01:00+00:00 |
URL | https://worldwidearchive.org/robots.txt |
Domain IPs | 104.21.59.196, 172.67.182.241, 2606:4700:3033::6815:3bc4, 2606:4700:3037::ac43:b6f1 |
Response IP | 104.21.59.196 |
Found | Yes |
Hash | 2fa526570acba59d29971e10359f4f666f8c3c4226017e750a33e20439d1a1c4 |
SimHash | 501d0142f209 |
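
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not say which algorithm or byte normalization the scanner uses, so the following is only a sketch that assumes SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_report` are hypothetical and are not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "2fa526570acba59d29971e10359f4f666f8c3c4226017e750a33e20439d1a1c4"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a fetched robots.txt body against the reported hash.

    Assumption: the scanner hashes the unmodified response bytes with SHA-256.
    """
    return sha256_hex(body) == reported.lower()
```

If a freshly fetched copy of the file does not match, the file may have changed since the scan, or the scanner may hash the content differently than assumed here.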
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /search? |
Disallow | /edit/ |
Disallow | /cdn-cgi/ |
Disallow | /dynjs/ |
Disallow | /dyn/actions/ |
Disallow | /en/search? |
Disallow | /fr/search? |
Disallow | /ta/search? |
Disallow | /de/search? |
Allow | / |
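
To illustrate how a crawler might evaluate these rules, here is a minimal sketch of longest-match evaluation as described in RFC 9309: the longest matching path prefix decides, and Allow wins a tie. The `RULES` list is transcribed from the table above. The function name `is_allowed` is hypothetical, and the sketch deliberately ignores the `*` and `$` wildcards because none of the listed rules use them.

```python
# Rules transcribed from the "*" group: (directive, path prefix).
RULES = [
    ("disallow", "/search?"),
    ("disallow", "/edit/"),
    ("disallow", "/cdn-cgi/"),
    ("disallow", "/dynjs/"),
    ("disallow", "/dyn/actions/"),
    ("disallow", "/en/search?"),
    ("disallow", "/fr/search?"),
    ("disallow", "/ta/search?"),
    ("disallow", "/de/search?"),
    ("allow", "/"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path (including any query string) may be crawled."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed by default
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # The longest matching prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed
```

Under this reading, `/search?q=maps` is blocked because `/search?` is a longer match than `/`, while the bare path `/search` stays crawlable because only `Allow: /` matches it.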
Other Records
Field | Value |
---|---|
sitemap | https://worldwidearchive.org/sitemaps/sitemap_index.xml |
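
The sitemap record sits outside any user-agent group, so it can be read without evaluating the rules. As a rough sketch, the function below pulls `Sitemap` fields out of robots.txt text. The `SAMPLE` text is an abbreviated reconstruction from this report, not the actual file contents, and `sitemap_urls` is a hypothetical helper name.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the values of all Sitemap records in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "field: value" on the first colon.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Abbreviated reconstruction based on the report; the real file may differ.
SAMPLE = """\
User-agent: *
Disallow: /search?
Allow: /
Sitemap: https://worldwidearchive.org/sitemaps/sitemap_index.xml
"""
```

Field names are matched case-insensitively here, so `sitemap:` and `Sitemap:` are treated the same.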