circaworks.com
robots.txt
Robots Exclusion Standard data for circaworks.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | circaworks.com |
Base Domain | circaworks.com |
Scan Status | Ok |
Last Scan | 2024-09-22T13:26:52+00:00 |
Next Scan | 2024-10-22T13:26:52+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-09-22T13:26:52+00:00 |
URL | https://circaworks.com/robots.txt |
Domain IPs | 141.193.213.20, 141.193.213.21 |
Response IP | 141.193.213.20 |
Found | Yes |
Hash | 206b427bc9b90b8415ff44c284e0830e77300efd57110bef0352bc50dd7bbd63 |
SimHash | 5151ec01cdd1 |
Groups
*
Rule | Path | Comment |
---|---|---|
Disallow | /wp-admin/ | - |
Allow | /wp-admin/admin-ajax.php | - |
Disallow | /pdfs/ | Block the /pdfs/ directory. |
Disallow | *.pdf$ | Block pdf files from all bots. Albeit non-standard, it works for major search engines. |
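The interaction between these four rules (in particular the `Allow` exception under `/wp-admin/` and the wildcard `*.pdf$` pattern) can be illustrated with a small matcher. This is a minimal sketch, not the logic any particular crawler uses: it assumes longest-match precedence with `Allow` winning ties, and treats `*` as "any run of characters" and a trailing `$` as an end-of-path anchor, which is how major search engines are generally documented to behave. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and the sample paths are made up for illustration.

```python
import re

# Rules copied from the "*" group in the table above, as (directive, pattern).
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/pdfs/"),
    ("disallow", "*.pdf$"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be crawled.

    Assumption: the longest matching pattern wins, and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        # re.match anchors at the start of the path, like a prefix rule.
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from the scan.
    for sample in [
        "/wp-admin/",
        "/wp-admin/admin-ajax.php",
        "/pdfs/guide.html",
        "/docs/report.pdf",
        "/docs/report.pdf?x=1",
        "/",
    ]:
        print(sample, is_allowed(sample))
```

Under these assumptions, `/wp-admin/admin-ajax.php` stays crawlable because the `Allow` pattern is longer than the `Disallow` prefix, while a URL such as `/docs/report.pdf?x=1` is not caught by `*.pdf$` because the `$` anchor requires the path to end in `.pdf`.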
Other Records
Field | Value |
---|---|
sitemap | https://circaworks.com/sitemap_index.xml |