va.gov
robots.txt

Robots Exclusion Standard data for va.gov

Resource Scan

Scan Details

Site Domain va.gov
Base Domain va.gov
Scan Status Ok
Last Scan 2024-09-15T10:25:52+00:00
Next Scan 2024-10-15T10:25:52+00:00

Last Scan

Scanned 2024-09-15T10:25:52+00:00
URL https://va.gov/robots.txt
Redirect https://www.va.gov/robots.txt
Redirect Domain www.va.gov
Redirect Base va.gov
Domain IPs 152.130.96.221, 2600:8010:0:28::28:221
Redirect IPs 152.130.96.221, 2600:8010:0:28::28:221
Response IP 152.133.106.221
Found Yes
Hash 0e483635060e0e54f995d282c874260c9344050ac7394e5cf2ee26114d2d932a
SimHash 60de8991cfd3

Groups

usasearch

Rule Path
Allow /

synapse

Rule Path
Disallow /

*

Rule Path
Disallow /analytics-opt-out.html
Disallow /cgi-bin/
Disallow /drupal
Disallow /covid19screen

Other Records

Field Value
sitemap https://www.va.gov/sitemap.xml
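To see how the groups and the sitemap record above would be applied by a crawler, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from the groups listed in this report and is not the verbatim file; "ExampleBot" and the sample URLs are hypothetical. Note that this parser uses simple first-match prefix rules, so other crawlers may resolve edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report (assumption: not the verbatim file).
ROBOTS_TXT = """\
User-agent: usasearch
Allow: /

User-agent: synapse
Disallow: /

User-agent: *
Disallow: /analytics-opt-out.html
Disallow: /cgi-bin/
Disallow: /drupal
Disallow: /covid19screen

Sitemap: https://www.va.gov/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical agent/URL pairs chosen to exercise each group.
checks = [
    ("usasearch", "https://www.va.gov/drupal"),
    ("synapse", "https://www.va.gov/"),
    ("ExampleBot", "https://www.va.gov/cgi-bin/test"),
    ("ExampleBot", "https://www.va.gov/health-care/"),
]

for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:<12} {url:<40} {verdict}")

# Sitemap records are exposed separately from the per-agent groups.
print(parser.site_maps())
```

Under this reconstruction, usasearch may fetch everything, synapse nothing, and any other agent falls through to the * group.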

Comments

  • existing disallow on va.gov (may not be needed)
  • existing disallow from vets.gov
  • disallow WIP VAMCs
  • make sure to add a trailing slash at the end of the path to prevent sub-directories from being indexed
  • see https://developers.google.com/search/docs/advanced/robots/create-robots-txt#useful-robots.txt-rules
  • sitemap index
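The trailing-slash comment refers to the fact that Disallow paths are matched as prefixes. The sketch below is a simplified model, assuming plain prefix matching with no wildcards and no Allow rules; the sample paths are hypothetical. It shows that a rule with a trailing slash, such as /cgi-bin/, covers only that directory's contents, while a rule without one, such as /drupal, also covers any sibling path that merely starts with the same characters.

```python
# Disallow paths from the "*" group in this report.
RULES = ["/analytics-opt-out.html", "/cgi-bin/", "/drupal", "/covid19screen"]


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow rule is a prefix of the path.

    Simplified model: ignores wildcards, '$' anchors, and Allow precedence.
    """
    return any(path.startswith(rule) for rule in disallow_rules)


# Hypothetical paths illustrating the effect of the trailing slash.
samples = [
    "/cgi-bin/test",   # inside the /cgi-bin/ directory
    "/cgi-binary",     # shares a prefix with "/cgi-bin" but not "/cgi-bin/"
    "/drupal/admin",   # inside /drupal
    "/drupal-news",    # sibling path that still starts with "/drupal"
]

for path in samples:
    print(path, is_disallowed(path, RULES))
```

In this model, /drupal-news is blocked by the slashless /drupal rule, whereas /cgi-binary is not blocked by /cgi-bin/.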