thearcaeproject.com
robots.txt

Robots Exclusion Standard data for thearcaeproject.com

Resource Scan

Scan Details

Site Domain thearcaeproject.com
Base Domain thearcaeproject.com
Scan Status Ok
Last Scan 2026-01-10T04:34:32+00:00
Next Scan 2026-02-09T04:34:32+00:00
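
The two timestamps above are exactly 30 days apart. The report does not state its rescan policy, so a fixed 30-day interval is only an inference from these two values. A minimal sketch of the arithmetic, with hypothetical variable names:

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2026-01-10T04:34:32+00:00")
next_scan = datetime.fromisoformat("2026-02-09T04:34:32+00:00")

# Difference between the scheduled next scan and the last scan.
# Treating this as a fixed rescan interval is an assumption.
interval = next_scan - last_scan
print(interval.days)
```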

Last Scan

Scanned 2026-01-10T04:34:32+00:00
URL https://thearcaeproject.com/robots.txt
Domain IPs 104.21.22.25, 172.67.202.13, 2606:4700:3032::6815:1619, 2606:4700:3036::ac43:ca0d
Response IP 172.67.202.13
Found Yes
Hash 4c47b9503f7120007eb41e815353b542f2bf501bd4a482eb078a51e5f30871a8
SimHash 2530bb7307b0
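
The Hash value is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the response body. The report does not name the algorithm, so that is an assumption. Under that assumption, a locally fetched copy of the file could be compared against the recorded value roughly as follows (`body_digest` and `matches_recorded` are hypothetical helper names):

```python
import hashlib

# Hash value copied from the Last Scan table above.
RECORDED_HASH = "4c47b9503f7120007eb41e815353b542f2bf501bd4a482eb078a51e5f30871a8"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded(body: bytes) -> bool:
    """True if the body hashes to the value recorded by the scan."""
    return body_digest(body) == RECORDED_HASH
```

A mismatch would not prove the file changed. The scanner may hash a normalized form of the body, or use a different algorithm entirely.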

Groups

*

Rule Path
Disallow /cgi-bin
Disallow /?
Disallow /wp-
Disallow /wp/
Disallow *?s=
Disallow *%26s%3D
Disallow /search/
Disallow /author/
Disallow /users/
Disallow */trackback
Disallow */feed
Disallow */rss
Disallow */embed
Disallow */wlwmanifest.xml
Disallow /xmlrpc.php
Disallow *openstat%3D
Disallow *?page=*
Allow */uploads
Allow /wp-*.png
Allow /wp-*.jpg
Allow /wp-*.jpeg
Allow /wp-*.gif
Allow /wp-admin/admin-ajax.php
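
Several of the rules above overlap. For example, `/wp-` is disallowed while `/wp-admin/admin-ajax.php` is allowed. Crawlers that follow RFC 9309 resolve such overlaps by picking the longest matching pattern, with Allow winning a tie. The sketch below applies that precedence to the rules listed for the `*` group. It is a simplified illustration, not the logic of any particular crawler. It treats `*` as a wildcard and a trailing `$` as an end anchor, and it does not normalize percent-encoding. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical.

```python
import re

# (rule type, path pattern) pairs copied from the "*" group above.
RULES = [
    ("disallow", "/cgi-bin"), ("disallow", "/?"), ("disallow", "/wp-"),
    ("disallow", "/wp/"), ("disallow", "*?s="), ("disallow", "*%26s%3D"),
    ("disallow", "/search/"), ("disallow", "/author/"), ("disallow", "/users/"),
    ("disallow", "*/trackback"), ("disallow", "*/feed"), ("disallow", "*/rss"),
    ("disallow", "*/embed"), ("disallow", "*/wlwmanifest.xml"),
    ("disallow", "/xmlrpc.php"), ("disallow", "*openstat%3D"),
    ("disallow", "*?page=*"), ("allow", "*/uploads"), ("allow", "/wp-*.png"),
    ("allow", "/wp-*.jpg"), ("allow", "/wp-*.jpeg"), ("allow", "/wp-*.gif"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; a path matching no rule is allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # A longer pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


for path in ["/wp-admin/", "/wp-admin/admin-ajax.php",
             "/wp-content/uploads/logo.png", "/?s=test", "/about/"]:
    print(path, is_allowed(path))
```

Under this sketch, `/wp-content/uploads/logo.png` is permitted because `*/uploads` and `/wp-*.png` are both longer than `/wp-`, while `/wp-admin/` stays blocked because only `/wp-` matches it.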

Other Records

Field Value
sitemap https://thearcaeproject.com/sitemap.xml

Warnings

  • `host` is not a known field.