itsubuntu.com
robots.txt
Robots Exclusion Standard data for itsubuntu.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | itsubuntu.com |
Base Domain | itsubuntu.com |
Scan Status | Ok |
Last Scan | 2024-11-16T23:53:22+00:00 |
Next Scan | 2024-11-23T23:53:22+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-16T23:53:22+00:00 |
URL | https://itsubuntu.com/robots.txt |
Domain IPs | 104.21.22.206, 172.67.206.237, 2606:4700:3031::ac43:ceed, 2606:4700:3033::6815:16ce |
Response IP | 172.67.206.237 |
Found | Yes |
Hash | e4bf3e740f0f9e9d5fd8baed6dbbd980c1b4db9ac38f333eed6321d0bf5ad66a |
SimHash | cf495cc6ef41 |
Groups
*
Rule | Path | Comment |
---|---|---|
Disallow | /wp-admin/ | block access to admin section |
Disallow | /wp-login.php | block access to admin section |
Disallow | *%26preview%3D* | block access to preview pages |
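The third rule uses `*` wildcards, and its literal text is the percent-encoded form of `&preview=`. As a rough illustration of how a wildcard-aware crawler might evaluate these paths, here is a minimal sketch. It assumes `*` matches any run of characters, a trailing `$` anchors the end of the URL, and the comparison is literal with no percent-decoding. The helper names `rule_to_regex` and `rule_matches` are hypothetical, and real crawlers may normalize URLs differently.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is compared literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern, path_and_query):
    """Return True if the rule matches from the start of the path."""
    return rule_to_regex(pattern).match(path_and_query) is not None


# Rule paths copied from the table above.
ADMIN_RULE = "/wp-admin/"
LOGIN_RULE = "/wp-login.php"
PREVIEW_RULE = "*%26preview%3D*"
```

Under these assumptions, `rule_matches(PREVIEW_RULE, "/?p=123%26preview%3Dtrue")` is true, while the same URL written with a literal `&preview=` is not matched, because the rule text contains the encoded form.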
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
Other Records
Field | Value |
---|---|
sitemap | https://itsubuntu.com/sitemap_index.xml |
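To show how a client might consume these records, the sketch below feeds an approximate reconstruction of the file to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is rebuilt from the tables above and is an assumption: the original comments, ordering, and whitespace are not reproduced, so it will not match the recorded hash. Note that this parser matches rules by path prefix, so the wildcard preview rule is not evaluated as a pattern here.

```python
from urllib.robotparser import RobotFileParser

# Approximate reconstruction from the scan tables (assumption: the real
# file's comments and exact layout differ).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: *%26preview%3D*
Crawl-delay: 10

Sitemap: https://itsubuntu.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical user agent; it falls under the "*" group.
AGENT = "ExampleBot"

admin_allowed = parser.can_fetch(AGENT, "https://itsubuntu.com/wp-admin/")
login_allowed = parser.can_fetch(AGENT, "https://itsubuntu.com/wp-login.php")
home_allowed = parser.can_fetch(AGENT, "https://itsubuntu.com/")

delay_seconds = parser.crawl_delay(AGENT)  # seconds to wait between requests
sitemaps = parser.site_maps()              # requires Python 3.8 or later

print(admin_allowed, login_allowed, home_allowed, delay_seconds, sitemaps)
```

A polite crawler would typically sleep for `delay_seconds` between requests and use the sitemap URL to discover pages instead of crawling blindly.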