alma-cac.org
robots.txt

Robots Exclusion Standard data for alma-cac.org

Resource Scan

Scan Details

Site Domain alma-cac.org
Base Domain alma-cac.org
Scan Status Ok
Last Scan 2025-11-23T03:48:31+00:00
Next Scan 2025-12-23T03:48:31+00:00

Last Scan

Scanned 2025-11-23T03:48:31+00:00
URL https://alma-cac.org/robots.txt
Redirect https://www.alma-cac.org/robots.txt
Redirect Domain www.alma-cac.org
Redirect Base alma-cac.org
Domain IPs 199.34.228.42
Redirect IPs 199.34.228.42
Response IP 199.34.228.42
Found Yes
Hash cf6210cf415a5c211ebf0118139a7b952d7792ac81be50d5e0930d081f6107cd
SimHash 3155d83c27a3
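
The Hash field above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan record does not name the algorithm or say exactly which bytes were hashed. As a rough sketch under that assumption, a later fetch of the file could be compared against a stored digest as follows. The function names body_digest and has_changed are hypothetical and are not part of the scan tool.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return a hex digest of the raw response body.

    Assumption: SHA-256 over the unmodified bytes. The scan record does
    not state the algorithm; 64 hex characters merely fits SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Report whether a freshly fetched body differs from a stored digest."""
    return body_digest(body) != previous_digest.lower()
```

An exact digest like this changes on any byte-level edit, so it can only signal that the file differs, not by how much.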

Groups

nerdybot

Rule Path
Disallow /

dotbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

*

Rule Path
Disallow /ajax/
Disallow /apps/
Disallow /financialcopy.html
Disallow /workshops.html
Disallow /monthly-classes.html
Disallow /special-classes.html
Disallow /donations.html
Disallow /assets.html
Disallow /homeold.html

Other Records

Field Value
sitemap https://www.alma-cac.org/sitemap.xml
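
To illustrate how a crawler might interpret the groups listed above, the following sketch rebuilds an approximate robots.txt from the scan output and evaluates it with Python's standard urllib.robotparser module. The reconstructed file text is an assumption: the scan shows only the parsed groups, not the original casing, ordering, or comments. The agent name "examplebot" and the helper allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Approximate reconstruction from the parsed groups in the scan.
# The real file's exact layout is not shown, so treat this as a sketch.
ROBOTS_TXT = """\
User-agent: nerdybot
Disallow: /

User-agent: dotbot
Crawl-delay: 10

User-agent: *
Disallow: /ajax/
Disallow: /apps/
Disallow: /financialcopy.html
Disallow: /workshops.html
Disallow: /monthly-classes.html
Disallow: /special-classes.html
Disallow: /donations.html
Disallow: /assets.html
Disallow: /homeold.html

Sitemap: https://www.alma-cac.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.alma-cac.org"


def allowed(agent: str, path: str) -> bool:
    """Check whether the given user agent may fetch a path on the site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "examplebot" is a made-up agent that falls through to the "*" group.
    for agent, path in [
        ("nerdybot", "/"),
        ("dotbot", "/donations.html"),
        ("examplebot", "/donations.html"),
        ("examplebot", "/index.html"),
    ]:
        print(agent, path, allowed(agent, path))
    print("dotbot crawl delay:", parser.crawl_delay("dotbot"))
```

Under this reconstruction, nerdybot is blocked from the whole site, dotbot has no path restrictions but carries a crawl delay of 10, and every other agent is governed by the "*" group. Because dotbot has its own group, it does not inherit the "*" disallow rules.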