volunteerhq.org
robots.txt

Robots Exclusion Standard data for volunteerhq.org

Resource Scan

Scan Details

Site Domain volunteerhq.org
Base Domain volunteerhq.org
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-02-29T06:46:35+00:00
Next Scan 2024-05-29T06:46:35+00:00

Last Successful Scan

Scanned 2022-10-04T20:04:38+00:00
URL https://www.volunteerhq.org/robots.txt
Response IP 13.227.254.43, 13.227.254.47, 13.227.254.7, 13.227.254.117
Found Yes
Hash 41543170cb479727d016f4ef742e68d788a35df690e9c082e6e7a5ccb4d40c3c
SimHash 6b5f4c51c253

Groups

*

Rule Path
Disallow /competition-winners/photo/2013
Disallow /competition-winners/photo/2012
Disallow /competition-winners/photo/2011
Disallow /competition-winners/video/2013
Disallow /competition-winners/video/2012
Disallow /competition-winners/video/2011
Disallow /*.pdf$
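
The rules above mix plain path prefixes with one wildcard pattern (`/*.pdf$`). A minimal sketch of how a crawler might evaluate them follows. It assumes the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical and not part of any scan tooling.

```python
import re

# Disallow patterns copied from the rule table for the "*" group.
DISALLOW = [
    "/competition-winners/photo/2013",
    "/competition-winners/photo/2012",
    "/competition-winners/photo/2011",
    "/competition-winners/video/2013",
    "/competition-winners/video/2012",
    "/competition-winners/video/2011",
    "/*.pdf$",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' requires the path to end there; otherwise the pattern
    is treated as a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow rule matches the path from its start."""
    return any(rule.match(path) for rule in RULES)
```

Under these assumptions, `is_disallowed("/files/guide.pdf")` is true, while `is_disallowed("/files/guide.pdf?x=1")` is false because the `$` anchor requires the path to end in `.pdf`.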

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.volunteerhq.org/sitemap.xml
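
The crawl-delay and sitemap records can be read back with Python's standard `urllib.robotparser`. The sketch below assumes a reconstructed file: the `ROBOTS_TXT` text is rebuilt from the fields listed in this report, and its line order and the choice to show only one Disallow rule are assumptions, not the original file contents.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's fields; ordering is an assumption,
# and only one Disallow rule is included to keep the example short.
ROBOTS_TXT = """\
User-agent: *
Disallow: /competition-winners/photo/2013
Crawl-delay: 10
Sitemap: https://www.volunteerhq.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay in seconds that the "*" group requests between fetches.
crawl_delay = parser.crawl_delay("*")

# List of sitemap URLs declared in the file (requires Python 3.8+).
sitemaps = parser.site_maps()
```

Note that `urllib.robotparser` treats Disallow paths as plain prefixes, so it is not a good fit for evaluating the `/*.pdf$` wildcard rule.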

Comments

  • Custom
  • Don't index PDFs: