mediatethurston.org
robots.txt

Robots Exclusion Standard data for mediatethurston.org

Resource Scan

Scan Details

Site Domain mediatethurston.org
Base Domain mediatethurston.org
Scan Status Ok
Last Scan 2025-11-06T04:21:52+00:00
Next Scan 2025-12-06T04:21:52+00:00

Last Scan

Scanned 2025-11-06T04:21:52+00:00
URL https://mediatethurston.org/robots.txt
Redirect https://www.mediatethurston.org/robots.txt
Redirect Domain www.mediatethurston.org
Redirect Base mediatethurston.org
Domain IPs 199.34.228.75
Redirect IPs 199.34.228.75
Response IP 199.34.228.75
Found Yes
Hash b1a110f9c412c3fa90d9920dc9d29d8e995e20686620a97f19c09b5f54b08049
SimHash 084dbc862393

Groups

nerdybot

Rule Path
Disallow /

dotbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

*

Rule Path
Disallow /ajax/
Disallow /apps/
Disallow /spscc-legl-220.html
Disallow /sobre-nosotros.html
Disallow /board-portal-archive.html
Disallow /editmysite.html
Disallow /404.html
Disallow /thank-you.html
Disallow /files.html
Disallow /volunteer-portal.html
Disallow /drc-40-hr-lab.html
Disallow /drc-lab-odr.html
Disallow /drc-library.html

Other Records

Field Value
sitemap https://www.mediatethurston.org/sitemap.xml
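
As a rough illustration of how a crawler might interpret the groups listed in this report, the sketch below rebuilds a robots.txt from the scanned rules and queries it with Python's standard urllib.robotparser module. The file text is a reconstruction from this report, not the original bytes: the ordering, capitalization, and blank lines are assumptions. `examplebot` is a hypothetical user agent standing in for any crawler that falls under the `*` group, and `site_maps()` assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report above; the exact ordering, casing,
# and spacing of the real file are assumptions.
ROBOTS_TXT = """\
User-agent: nerdybot
Disallow: /

User-agent: dotbot
Crawl-delay: 10

User-agent: *
Disallow: /ajax/
Disallow: /apps/
Disallow: /spscc-legl-220.html
Disallow: /sobre-nosotros.html
Disallow: /board-portal-archive.html
Disallow: /editmysite.html
Disallow: /404.html
Disallow: /thank-you.html
Disallow: /files.html
Disallow: /volunteer-portal.html
Disallow: /drc-40-hr-lab.html
Disallow: /drc-lab-odr.html
Disallow: /drc-library.html

Sitemap: https://www.mediatethurston.org/sitemap.xml
"""

BASE = "https://www.mediatethurston.org"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


# nerdybot is blocked from the whole site.
nerdybot_home = allowed("nerdybot", "/")

# dotbot has no Disallow rules, only a crawl delay.
dotbot_files = allowed("dotbot", "/files.html")
dotbot_delay = parser.crawl_delay("dotbot")

# Any other crawler (hypothetical "examplebot") falls under the * group.
other_ajax = allowed("examplebot", "/ajax/anything")
other_files = allowed("examplebot", "/files.html")
other_home = allowed("examplebot", "/")

sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("nerdybot /            :", nerdybot_home)
    print("dotbot /files.html    :", dotbot_files, "delay:", dotbot_delay)
    print("examplebot /ajax/...  :", other_ajax)
    print("examplebot /files.html:", other_files)
    print("examplebot /          :", other_home)
    print("sitemaps              :", sitemaps)
```

Under this reading, dotbot may fetch any path but is asked to wait 10 seconds between requests, while crawlers without a group of their own are kept out of the thirteen listed paths only. Note that crawl-delay is a non-standard field, so individual crawlers may ignore it.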