incpas.org
robots.txt

Robots Exclusion Standard data for incpas.org

Resource Scan

Scan Details

Site Domain incpas.org
Base Domain incpas.org
Scan Status Ok
Last Scan 2025-09-28T23:42:30+00:00
Next Scan 2025-10-28T23:42:30+00:00

Last Scan

Scanned 2025-09-28T23:42:30+00:00
URL https://incpas.org/robots.txt
Redirect https://www.incpas.org:443/robots.txt
Redirect Domain www.incpas.org
Redirect Base incpas.org
Domain IPs 3.220.41.178
Redirect IPs 18.211.157.40, 52.23.137.179
Response IP 52.23.137.179
Found Yes
Hash e172c5d0c96794b87d86de955e3574e02b6d0cc1820712300809711a2da3c0ae
SimHash 691dd9754ff0

Groups

*

Rule Path
Disallow /Sitefinity
Disallow /sandbox
Disallow /search-results
Disallow /docs/default-source/not-indexed
Disallow /list-pages/article-list

Other Records

Field Value
crawl-delay 120

Other Records

Field Value
sitemap https://www.incpas.org/sitemap/sitemap-index.xml

Comments

  • Do not delete /Sitefinity. Never any reason to allow indexing here
  • The same goes for sandbox
  • Also disallow search. We already have it set to "noindex", but keep getting googlebot hits
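
To see how a crawler might interpret the records listed in this report, the following minimal sketch feeds them to Python's standard-library `urllib.robotparser`. The `ROBOTS_LINES` list is reconstructed from the scan data above, so its exact layout is an assumption and may not match the live file line for line. The user agent name `ExampleBot`, the helper `build_parser`, and the sample paths are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan data in this report. The ordering and exact
# formatting of the real file are assumptions.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /Sitefinity",
    "Disallow: /sandbox",
    "Disallow: /search-results",
    "Disallow: /docs/default-source/not-indexed",
    "Disallow: /list-pages/article-list",
    "Crawl-delay: 120",
    "Sitemap: https://www.incpas.org/sitemap/sitemap-index.xml",
]


def build_parser(lines):
    """Return a RobotFileParser loaded from in-memory lines (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
AGENT = "ExampleBot"

# Paths under a disallowed prefix are blocked; other paths are allowed.
blocked = parser.can_fetch(AGENT, "https://www.incpas.org/sandbox/test-page")
allowed = parser.can_fetch(AGENT, "https://www.incpas.org/about")

# Crawl delay (in seconds) and sitemap URLs as the parser sees them.
# site_maps() requires Python 3.8 or later.
delay = parser.crawl_delay(AGENT)
sitemaps = parser.site_maps()

print(blocked, allowed, delay, sitemaps)
```

Matching in `urllib.robotparser` is a case-sensitive prefix comparison, so this sketch says nothing about how a given crawler treats differently cased variants of `/Sitefinity`. Support for `crawl-delay` also varies between crawlers, so the 120-second value is advisory for any bot that chooses to honor it.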