nara.gov
robots.txt

Robots Exclusion Standard data for nara.gov

Resource Scan

Scan Details

Site Domain nara.gov
Base Domain nara.gov
Scan Status Ok
Last Scan 2024-05-24T23:48:58+00:00
Next Scan 2024-06-23T23:48:58+00:00

Last Scan

Scanned 2024-05-24T23:48:58+00:00
URL https://nara.gov/robots.txt
Redirect https://www.archives.gov/robots.txt
Redirect Domain www.archives.gov
Redirect Base archives.gov
Domain IPs 2600:1f18:43e8:f301:9046:c05f:75e7:c481, 2600:1f18:43e8:f302:b470:d266:4d03:3ed8, 52.206.136.3, 52.44.89.206
Redirect IPs 13.33.30.128, 13.33.30.22, 13.33.30.25, 13.33.30.3, 2600:9000:229f:1a00:f:fd2b:b880:93a1, 2600:9000:229f:1e00:f:fd2b:b880:93a1, 2600:9000:229f:2600:f:fd2b:b880:93a1, 2600:9000:229f:5a00:f:fd2b:b880:93a1, 2600:9000:229f:800:f:fd2b:b880:93a1, 2600:9000:229f:8400:f:fd2b:b880:93a1, 2600:9000:229f:b600:f:fd2b:b880:93a1, 2600:9000:229f:f200:f:fd2b:b880:93a1
Response IP 13.33.30.128
Found Yes
Hash 637513f759216025daef58ad327663dcf289c8f453ac4b1d44651649239c8ccb
SimHash b8169d0b4744

Groups

*

Rule Path
Disallow /citizen-archivist/history-hub/hh-test
Disallow /developer/artificial-intelligence-and-machine-learning-datasets
Disallow /developer/1940-census
Disallow /developer/national-archives-catalog-dataset

Other Records

Field Value
crawl-delay 10

usasearch

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

Other Records

Field Value
sitemap https://www.archives.gov/sitemap.xml
sitemap https://www.archives.gov/files/sitemap.xml
sitemap https://www.archives.gov/research/native-americans/bia/photos/sitemap.xml
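
The groups and records above can be checked with Python's standard urllib.robotparser module. This is a minimal sketch, not the scanner's own method: the robots.txt text is reconstructed from the rules listed in this report, so its line order is an assumption, and "examplebot" is a hypothetical user agent standing in for any crawler matched by the * group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups and records in this report.
# The original file's exact ordering and comments are assumptions.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /citizen-archivist/history-hub/hh-test
Disallow: /developer/artificial-intelligence-and-machine-learning-datasets
Disallow: /developer/1940-census
Disallow: /developer/national-archives-catalog-dataset

User-agent: usasearch
Crawl-delay: 2

Sitemap: https://www.archives.gov/sitemap.xml
Sitemap: https://www.archives.gov/files/sitemap.xml
Sitemap: https://www.archives.gov/research/native-americans/bia/photos/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.archives.gov"  # redirect target named in the scan


def report(agent: str, path: str) -> str:
    """Summarise what the parsed rules say for one agent and path."""
    allowed = parser.can_fetch(agent, BASE + path)
    delay = parser.crawl_delay(agent)
    return f"{agent} {path}: allowed={allowed}, crawl-delay={delay}"


if __name__ == "__main__":
    # "examplebot" is hypothetical; it falls through to the * group.
    print(report("examplebot", "/developer/1940-census"))
    print(report("usasearch", "/developer/1940-census"))
    print(parser.site_maps())  # site_maps() requires Python 3.8+
```

Under these assumptions, a generic crawler is refused the four listed paths and asked to wait 10 seconds between requests, while usasearch has no disallowed paths and a crawl-delay of 2.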

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
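
The root-of-host rule in the comments above can be sketched as a small helper that maps any page URL to the only robots.txt location a crawler would consult. The function name robots_url is hypothetical, and the sketch uses only urllib.parse from the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL at the root of the page's host.

    Path, query, and fragment are discarded; only the scheme and
    host (with any port) are kept.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # A page under /site/ still resolves to the host-level file.
    print(robots_url("http://example.com/site/page.html"))
```

A file placed at http://example.com/site/robots.txt is never the result of this mapping, which is why such a file is ignored.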