Robots Exclusion Standard data for newsru.ca
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | newsru.ca |
Base Domain | newsru.ca |
Scan Status | Failed |
Failure Stage | Fetching resource. |
Failure Reason | Server returned a client error. |
Last Scan | 2024-09-07T09:27:33+00:00 |
Next Scan | 2024-12-06T09:27:33+00:00 |
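The report does not state the rescan interval, but it can be derived from the Last Scan and Next Scan timestamps above. A minimal sketch, assuming only that both values are ISO 8601 timestamps as shown (the variable names are illustrative, not part of the report):

```python
from datetime import datetime

# Timestamps copied verbatim from the Scan Details table.
last_scan = datetime.fromisoformat("2024-09-07T09:27:33+00:00")
next_scan = datetime.fromisoformat("2024-12-06T09:27:33+00:00")

# Difference between the scheduled next scan and the last (failed) scan.
rescan_interval = next_scan - last_scan
print(rescan_interval.days)
```

For these two timestamps the difference is exactly 90 days.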
Last Successful Scan
Field | Value |
---|---|
Scanned | 2022-11-10T20:53:50+00:00 |
URL | https://newsru.ca/robots.txt |
Response IP | 172.67.170.121, 104.21.95.146 |
Found | Yes |
Hash | fad591f50d8030b4c05616f41ec2346621962ed3c0b608ef5514b57259d1dea1 |
SimHash | 7305dc332393 |
Groups
User-agent: *
Rule | Path |
---|---|
Allow | */*.css |
Allow | */*.js* |
Allow | /wp-admin/admin-ajax.php |
Disallow | /wp-admin |
Disallow | */*wp-json/ |
Disallow | /wp-login.php |
Disallow | /wp-register.php |
Disallow | */feed* |
Disallow | /cgi-bin |
Disallow | /xmlrpc.php |
Disallow | */*comments |
Disallow | */*trackback/ |
Disallow | */embed* |
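The rules above rely on `*` wildcards, so a plain prefix check is not enough to tell whether a path is blocked. The sketch below shows one way to evaluate them, assuming the longest-match precedence described in RFC 9309 (the most specific pattern wins, and a tie goes to Allow). The report does not say how any particular crawler interprets this file; `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical names introduced for illustration.

```python
import re

# Rules for the "*" group, copied from the table above.
RULES = [
    ("allow", "*/*.css"),
    ("allow", "*/*.js*"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-admin"),
    ("disallow", "*/*wp-json/"),
    ("disallow", "/wp-login.php"),
    ("disallow", "/wp-register.php"),
    ("disallow", "*/feed*"),
    ("disallow", "/cgi-bin"),
    ("disallow", "/xmlrpc.php"),
    ("disallow", "*/*comments"),
    ("disallow", "*/*trackback/"),
    ("disallow", "*/embed*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best = None  # (pattern length, is_allow); longer wins, Allow wins ties
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is allowed by default.
    return True if best is None else best[1]


for sample in ["/wp-admin/admin-ajax.php", "/wp-admin/options.php",
               "/2022/11/post/feed/", "/news/some-article/"]:
    print(sample, is_allowed(sample))
```

Under this interpretation `/wp-admin/admin-ajax.php` stays fetchable because its 24-character Allow pattern is longer than the 9-character `/wp-admin` Disallow, while other paths under `/wp-admin` remain blocked.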
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
sitemap | https://newsru.ca/sitemap.xml |
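The `crawl-delay` and `sitemap` records can be read back with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: `ROBOTS_TXT` is a hand-reconstructed fragment rather than the original file, and placing `Crawl-delay` inside the `*` group is a guess, since the report does not show where the line sat in the file.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment (assumption): only the records from the table above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10

Sitemap: https://newsru.ca/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay, in seconds, that the "*" group requests between fetches.
delay = parser.crawl_delay("*")
# List of sitemap URLs declared in the file (Python 3.8+).
sitemaps = parser.site_maps()

print(delay)
print(sitemaps)
```

Note that `urllib.robotparser` does not expand `*` wildcards in paths, so it is suitable for reading these two records but not for evaluating the wildcard rules in the group above.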