data.census.gov
robots.txt

Robots Exclusion Standard data for data.census.gov

Resource Scan

Scan Details

Site Domain data.census.gov
Base Domain census.gov
Scan Status Ok
Last Scan 2025-12-20T16:24:15+00:00
Next Scan 2026-01-03T16:24:15+00:00

Last Scan

Scanned 2025-12-20T16:24:15+00:00
URL https://data.census.gov/robots.txt
Domain IPs 172.65.90.24, 172.65.90.25, 172.65.90.26, 172.65.90.27, 2606:4700:78::90:0:180, 2606:4700:78::90:0:181, 2606:4700:78::90:0:182, 2606:4700:78::90:0:183
Response IP 172.65.90.26
Found Yes
Hash 9b253c62737f686d4af089ebed28931b479d26dcd91f33a49bd3d98934096698
SimHash 2c119d034755
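
The Hash field is 64 hexadecimal digits, which matches the length of a SHA-256 digest. The scan record does not say which algorithm or which bytes were hashed, so the following is only a sketch of how such a fingerprint could be computed, assuming SHA-256 over the raw response body. The function name `fingerprint` and the sample body are illustrative and not taken from the scanner.

```python
import hashlib

def fingerprint(body: bytes) -> str:
    """Return a hex digest of the robots.txt body (assumption: SHA-256 over raw bytes)."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical body for illustration; the real file's exact bytes are not reproduced here.
sample = b"User-agent: *\nAllow: /\nDisallow: /mdat/\n"
digest = fingerprint(sample)
print(len(digest))  # a SHA-256 hex digest is always 64 characters long
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed, without storing or diffing the full text.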

Groups

*

Rule Path
Allow /
Disallow /mdat/
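
To illustrate how the two rules in this group interact, here is a minimal sketch of the longest-match precedence described in RFC 9309: the rule with the longest matching path wins, and Allow wins a tie. It assumes plain prefix matching only (no `*` or `$` wildcards, which these rules do not use). The names `RULES` and `is_allowed` are illustrative. Note that some parsers apply rules in file order instead, in which case the leading `Allow /` would match every path first.

```python
# Rules as listed above for the "*" group.
RULES = [("allow", "/"), ("disallow", "/mdat/")]

def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if path.startswith(prefix):
            longer = len(prefix) > best_len
            tie_for_allow = len(prefix) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_len = len(prefix)
                allowed = (kind == "allow")
    return allowed

print(is_allowed("/table"))       # matched only by "Allow /"
print(is_allowed("/mdat/files"))  # "Disallow /mdat/" is the longer match
```

Under this reading, everything on the site is crawlable except paths beginning with `/mdat/`.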

Other Records

Field Value
sitemap https://data.census.gov/sitemap.xml

Comments

  • robots.txt
  • This file serves to prevent the crawling and indexing of certain parts of the site by web crawlers. By telling web crawlers where to go and not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
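
The root-of-host rule in the comments above can be sketched as follows: given any page URL, a crawler derives the one robots.txt location it will consult by keeping the scheme and host and replacing the path. The helper name `robots_url` is illustrative, not part of any crawler's API.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would consult for `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/robots.txt"))
print(robots_url("https://data.census.gov/table?q=population"))
```

A file placed at `/site/robots.txt` is never requested under this scheme, which is why it is ignored.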