
Robots Exclusion Standard data for med.unc.edu

Resource Scan

Scan Details

Site Domain med.unc.edu
Base Domain unc.edu
Scan Status Ok
Last Scan 2024-06-15T17:15:31+00:00
Next Scan 2024-07-15T17:15:31+00:00

Last Scan

Scanned 2024-06-15T17:15:31+00:00
URL https://med.unc.edu/robots.txt
Redirect https://www.med.unc.edu:443/robots.txt
Redirect Domain www.med.unc.edu
Redirect Base unc.edu
Domain IPs 152.19.9.36
Redirect IPs 152.19.9.36
Response IP 152.19.9.36
Found Yes
Hash 0577f81fe28a368367464c1a6966ffc6b765d5c9713fbf45f22d52fb580e14ce
SimHash 2af4605557e4
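
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a content digest of this kind lets a later scan detect whether robots.txt has changed. The function name `content_digest` and both sample bodies are hypothetical, not the real file contents.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of raw robots.txt bytes (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two successive scans; not the site's real file.
previous = b"User-agent: *\nDisallow: /cgi-bin/\n"
current = b"User-agent: *\nDisallow: /cgi-bin/\nDisallow: /search*\n"

# Any byte-level difference yields a different digest.
changed = content_digest(previous) != content_digest(current)
print(changed)
```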

Groups

slurp
exabot
yandex
msnbot
bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
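
To illustrate how a crawler might read this group, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt fragment is reconstructed from the parsed group above for illustration only; its exact layout is an assumption, not the site's actual file.

```python
from urllib.robotparser import RobotFileParser

# Fragment reconstructed from the parsed group above (assumed layout).
FRAGMENT = """\
User-agent: slurp
User-agent: exabot
User-agent: yandex
User-agent: msnbot
User-agent: bingbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(FRAGMENT.splitlines())

# Crawl-delay is conventionally read as seconds between requests.
delay = parser.crawl_delay("bingbot")
# The group has no Disallow rules, so every path is allowed.
allowed = parser.can_fetch("bingbot", "/any/path")
print(delay, allowed)
```

An agent not named in the fragment, such as `googlebot`, gets `None` from `crawl_delay` here because the fragment contains no `*` group to fall back on.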

*

Rule Path
Disallow /cgi-bin/
Disallow /wrkunits
Disallow /ent/ellen-applicant-files
Disallow /*%40%40search*
Disallow /search*

googlebot

Rule Path
Disallow /*sendto_form$
Disallow /*folder_factories$
Disallow /*?searchterm=*
Disallow /*/folder_contents*
Disallow /*/%40%40folder_contents*
Disallow /*/folder_summary_view*
Disallow /*/%40%40content-checkout*
Disallow /cgi-bin/Calcium37.pl*
Disallow /*%40%40search*
Disallow /search*
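
The googlebot rules above rely on two pattern extensions: `*` for any run of characters and a trailing `$` to anchor the end of the path. A minimal sketch of that matching follows. `pattern_to_regex` and `is_disallowed` are hypothetical helper names, and the sketch compares strings literally: it does not normalize percent-encoding (`%40` is an encoded `@`) and ignores Allow rules and rule precedence.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# A subset of the googlebot rules listed above.
GOOGLEBOT_RULES = [
    "/*sendto_form$",
    "/*?searchterm=*",
    "/cgi-bin/Calcium37.pl*",
    "/*%40%40search*",
    "/search*",
]

print(is_disallowed("/about/sendto_form", GOOGLEBOT_RULES))        # True
print(is_disallowed("/about/sendto_form/extra", GOOGLEBOT_RULES))  # False
print(is_disallowed("/research", GOOGLEBOT_RULES))                 # False
```

The second call shows the effect of `$`: the rule only blocks paths that end in `sendto_form`, so a longer path passes.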

msnbot

Rule Path
Disallow /*sendto_form$
Disallow /*folder_factories$
Disallow /*?searchterm=*
Disallow /*/folder_contents*
Disallow /*/%40%40folder_contents*
Disallow /*/folder_summary_view*
Disallow /*/%40%40content-checkout*
Disallow /cgi-bin/Calcium37.pl*
Disallow /*%40%40search*
Disallow /search*

siteimprovebot-crawler

Rule Path
Disallow */wp-login.php
Disallow */wp-json/*
Disallow */wp-admin/*
Disallow */?attachment_id=*
Disallow */?s=*
Disallow */?taxonomy=nav_menu*
Disallow */?eventDisplay=past*
Disallow */?eventDisplay=photo*
Disallow */?post_type=tribe_events&eventDisplay=day*
Disallow */?post_type=tribe_events&eventDisplay=week*
Disallow */?post_type=tribe_events&eventDisplay=month*
Disallow */?tribe-bar-date=*
Disallow *%26eventDisplay%3Dpast*
Disallow *%26eventDisplay%3Dphoto*
Disallow *%26tribe-bar-date%3D*
Disallow */2009/*
Disallow */2010/*
Disallow */2011/*
Disallow */2012/*
Disallow */2013/*
Disallow */2014/*
Disallow */2015/*
Disallow */2016/*
Disallow */2017/*
Disallow */2018/*
Disallow */2019/*
Disallow */2020/*
Disallow */2021/*
Disallow */2022/*
Disallow */2023/*
Disallow */author/*
Disallow */category/*
Disallow */events/*
Disallow */organizer/*
Disallow */scripts/webalert.js?__ver=*
Disallow */tag/*
Disallow */venue/*

Other Records

Field Value
crawl-delay 3

Comments

  • pstarbac throttle bots
  • treat all robots the same
  • don't index dynamic pages
  • aycock 20090122
  • aycock 20111025 per ianderso
  • pstarbac deny search
  • Add Googlebot-specific syntax extension to exclude forms
  • that are repeated for each piece of content in the site
  • the wildcard is only supported by Googlebot
  • http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
  • Site Improve blocking

Warnings

  • 11 invalid lines.