ccf.georgetown.edu
robots.txt

Robots Exclusion Standard data for ccf.georgetown.edu

Resource Scan

Scan Details

Site Domain ccf.georgetown.edu
Base Domain georgetown.edu
Scan Status Ok
Last Scan 2025-02-18T00:04:02+00:00
Next Scan 2025-03-20T00:04:02+00:00

Last Scan

Scanned 2025-02-18T00:04:02+00:00
URL https://ccf.georgetown.edu/robots.txt
Domain IPs 23.185.0.2, 2620:12a:8000::2, 2620:12a:8001::2
Response IP 23.185.0.2
Found Yes
Hash b404108f7e8208e0e06678a62f9bba163000c83dcc06fb0353ef672d191d90e0
SimHash 784550106b95

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /wp-includes/

*

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

yahoo pipes 2.0

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

baiduspider+

Rule Path
Allow /

mozilla/2.0

Rule Path
Allow /

charlotte

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

converacrawler/0.9e

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

vse/1.0

Rule Path
Allow /

gsa-crawler

Rule Path
Allow /
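
The groups above can be checked programmatically. The following is a minimal sketch, not the scanner's own method: it rebuilds a robots.txt text from the listed groups and queries it with Python's standard urllib.robotparser. The names GROUPS, build_robots_txt, can_fetch and the agent "examplebot" are hypothetical, and the rebuilt text is an assumption, since the scan shows user-agent names in lower case and the original file's casing, comments and spacing are not shown here.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the scan, in order: (user-agent, [(rule, path), ...]).
# Assumption: this mirrors the report, not the byte-exact original file.
GROUPS = [
    ("*", [("Disallow", "/wp-admin/"), ("Disallow", "/wp-includes/")]),
    ("*", [("Allow", "/")]),
    ("googlebot", [("Allow", "/")]),
    ("bingbot", [("Allow", "/")]),
    ("yahoo pipes 2.0", [("Allow", "/")]),
    ("googlebot-mobile", [("Allow", "/")]),
    ("baiduspider+", [("Allow", "/")]),
    ("mozilla/2.0", [("Allow", "/")]),
    ("charlotte", [("Disallow", "/")]),
    ("mj12bot", [("Disallow", "/")]),
    ("converacrawler/0.9e", [("Disallow", "/")]),
    ("gigabot", [("Disallow", "/")]),
    ("vse/1.0", [("Allow", "/")]),
    ("gsa-crawler", [("Allow", "/")]),
]


def build_robots_txt(groups):
    """Render the groups as robots.txt text, one blank line between groups."""
    blocks = []
    for agent, rules in groups:
        lines = [f"User-agent: {agent}"]
        lines += [f"{rule}: {path}" for rule, path in rules]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def can_fetch(agent, path, groups=GROUPS):
    """Return True if `agent` may fetch `path` under the rebuilt rules."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt(groups).splitlines())
    return parser.can_fetch(agent, "https://ccf.georgetown.edu" + path)


if __name__ == "__main__":
    # "examplebot" is a made-up agent that matches none of the named groups.
    for agent, path in [("mj12bot", "/"),
                        ("googlebot", "/wp-admin/"),
                        ("examplebot", "/wp-admin/"),
                        ("examplebot", "/")]:
        print(agent, path, can_fetch(agent, path))
```

Note that Python's parser keeps only the first * group and applies the first matching rule within a group, so results from other crawlers' parsers may differ where the two * groups or overlapping rules interact.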

Comments

  • STANDARD