adsabs.harvard.edu
robots.txt

Robots Exclusion Standard data for adsabs.harvard.edu

Resource Scan

Scan Details

Site Domain adsabs.harvard.edu
Base Domain harvard.edu
Scan Status Ok
Last Scan 2024-05-31T05:38:30+00:00
Next Scan 2024-06-30T05:38:30+00:00

Last Scan

Scanned 2024-05-31T05:38:30+00:00
URL https://adsabs.harvard.edu/robots.txt
Domain IPs 131.142.198.210
Response IP 131.142.198.210
Found Yes
Hash 800d0978f5c908110e3ac29ca12ae8be3d2d50da179b1fc436b2f146093bdbdb
SimHash 2c0dd3405df0
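
The report does not say which algorithm produced the Hash value. Its length (64 hexadecimal digits) is consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. The function name `content_hash` is hypothetical, and the result will only match the value above if the scanner hashed the same bytes in the same way.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# A changed digest between two scans would indicate that the file changed.
example = content_hash(b"User-agent: *\nDisallow: /cgi-bin/\n")
print(len(example))  # number of hex digits in the digest
```

Comparing digests from successive scans is a cheap way to detect an edited robots.txt without diffing the content.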

Groups

googlebot

Rule Path
Disallow /cgi-bin/
Allow /full/

msnbot

Rule Path
Disallow /cgi-bin/
Allow /full/

slurp

Rule Path
Disallow /cgi-bin/
Allow /full/

teoma

Rule Path
Disallow /cgi-bin/
Disallow /full/

*

Rule Path
Disallow /cgi-bin/
Disallow /abs/
Disallow /full/
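
To see how a crawler would interpret the groups above, the rules can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the `ROBOTS_TXT` text is reconstructed from the tables in this report, not copied from the live file, and the helper name `allowed` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups tables in this report (an assumption,
# not the verbatim file served by adsabs.harvard.edu).
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /cgi-bin/
Allow: /full/

User-agent: msnbot
Disallow: /cgi-bin/
Allow: /full/

User-agent: slurp
Disallow: /cgi-bin/
Allow: /full/

User-agent: teoma
Disallow: /cgi-bin/
Disallow: /full/

User-agent: *
Disallow: /cgi-bin/
Disallow: /abs/
Disallow: /full/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://adsabs.harvard.edu" + path)


# Hypothetical sample paths, one per rule prefix.
for agent in ("googlebot", "teoma", "somebot"):
    for path in ("/cgi-bin/example", "/abs/example", "/full/example"):
        print(agent, path, allowed(agent, path))
```

An agent with its own group is matched against that group only, so a path that no rule in the group mentions (such as `/abs/` for googlebot) is allowed by default. Agents without a group fall through to the `*` group.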

Other Records

Field Value
sitemap https://ui.adsabs.harvard.edu/sitemap/sitemap_index.xml

Comments

  • let search engines know where things are
  • Sitemap: http://adsabs.harvard.edu/sitemap_index.xml
  • Google
  • http://www.google.com/bot.html
  • Allow: /abs/
  • MS Live
  • http://search.msn.com/msnbot.htm
  • Allow: /abs/
  • Yahoo
  • http://help.yahoo.com/help/us/ysearch/slurp
  • Allow: /abs/
  • Ask.com
  • http://about.ask.com/en/docs/about/webmasters.shtml
  • Allow: /abs/
  • disallow harvesting from all other robots