crewct.org
robots.txt

Robots Exclusion Standard data for crewct.org

Resource Scan

Scan Details

Site Domain crewct.org
Base Domain crewct.org
Scan Status Ok
Last Scan 2024-09-28T14:13:13+00:00
Next Scan 2024-10-28T14:13:13+00:00

Last Scan

Scanned 2024-09-28T14:13:13+00:00
URL https://crewct.org/robots.txt
Redirect https://connecticut.crewnetwork.org/robots.txt
Redirect Domain connecticut.crewnetwork.org
Redirect Base crewnetwork.org
Domain IPs 20.88.56.175
Redirect IPs 76.76.21.241, 76.76.21.9
Response IP 76.76.21.123
Found Yes
Hash 147f5757d6465951ba1a67b27a950664cdfa058a5533165fb3721b6bca1284c1
SimHash d345cec2bab2

Groups

*

Rule Path
Disallow /api/*
Disallow /special-pages/*
Disallow /deployment-monitor

mj12bot

Rule Path
Disallow /

buck

Rule Path
Disallow /

*

Rule Path
Disallow /getmedia/*
Disallow /CrewNetwork/media/*
Disallow /wp-login.php
Disallow /wp-content/*
Disallow /wp-admin/*
Disallow /files/*
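
The groups above can be turned into a quick allow/deny check. The following is a minimal sketch, not the scanner's own logic. It assumes that the two "*" groups are combined into one (as RFC 9309 asks crawlers to do for groups naming the same user agent), that "*" in a path matches any run of characters, and that a group applies when its token appears anywhere in the crawler's user-agent string, which is a simplification. The names GROUPS, pattern_to_regex, and is_allowed are hypothetical and chosen for illustration only.

```python
import re

# Rules transcribed from the Groups tables; the two "*" groups are merged
# (assumption: same-agent groups are combined, per RFC 9309).
GROUPS = {
    "*": [
        "/api/*", "/special-pages/*", "/deployment-monitor",
        "/getmedia/*", "/CrewNetwork/media/*", "/wp-login.php",
        "/wp-content/*", "/wp-admin/*", "/files/*",
    ],
    "mj12bot": ["/"],
    "buck": ["/"],
}


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern to a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern behaves as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if no Disallow rule in the applicable group matches path.

    Simplification: a named group applies when its token is a substring of
    the lowercased user agent; otherwise the "*" group is used. The file has
    no Allow rules, so longest-match precedence is not needed here.
    """
    ua = user_agent.lower()
    rules = next(
        (r for token, r in GROUPS.items() if token != "*" and token in ua),
        GROUPS["*"],
    )
    return not any(pattern_to_regex(p).match(path) for p in rules)
```

Under these assumptions, is_allowed("Googlebot", "/api/users") is False because the path falls under /api/*, while is_allowed("MJ12bot/v1.4.8", "/events") is False because the mj12bot group disallows the whole site. A hand-rolled matcher is used because simple prefix-only parsers do not treat "*" inside a path as a wildcard.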

Other Records

Field Value Comment
sitemap connecticut.crewnetwork.org/sitemap_index.xml Index sitemap
sitemap connecticut.crewnetwork.org/sitemap.xml -
sitemap connecticut.crewnetwork.org/en/sitemap.xml -

Comments

  • Sitemaps