Robots Exclusion Standard data for cccaastats.org

Resource Scan

Scan Details

Site Domain cccaastats.org
Base Domain cccaastats.org
Scan Status Ok
Last Scan 2024-11-15T02:02:22+00:00
Next Scan 2024-11-22T02:02:22+00:00

Last Scan

Scanned 2024-11-15T02:02:22+00:00
URL http://cccaastats.org/robots.txt
Domain IPs 18.236.40.215, 35.167.21.119, 52.34.123.126, 52.88.253.115
Response IP 35.167.21.119
Found Yes
Hash 5384c90a8f1903a1e18b6ec686452644462cf557df08cadced82f4127e1e6a72
SimHash 6875d002ceb9

Groups

american-univ-crawler (enterprise; s5-dwrrj5kwb2naa; nguyen@american.edu)

Rule Path
Disallow /

cstv search crawler

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /_private/
Disallow /_vti_bin/
Disallow /_vti_cnf/
Disallow /_vti_log/
Disallow /_vti_pvt/
Disallow /_vti_txt/
Disallow /reports/
Disallow /admin/
Disallow /action/

Other Records

Field Value
crawl-delay 5
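
As a rough illustration of how a crawler might evaluate the "*" group listed above, the sketch below feeds a reconstruction of that group to Python's standard urllib.robotparser. The directive layout is rebuilt from this scan rather than copied from the live file, and the ExampleBot agent name and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this scan; the live file's exact
# layout is an assumption.
WILDCARD_GROUP = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /_private/
Disallow: /_vti_bin/
Disallow: /_vti_cnf/
Disallow: /_vti_log/
Disallow: /_vti_pvt/
Disallow: /_vti_txt/
Disallow: /reports/
Disallow: /admin/
Disallow: /action/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(WILDCARD_GROUP.splitlines())

BASE = "http://cccaastats.org"
AGENT = "ExampleBot"  # hypothetical crawler name, matched only by "*"


def is_allowed(path: str) -> bool:
    """Return True if AGENT may fetch the given path under the rules above."""
    return parser.can_fetch(AGENT, BASE + path)


print(is_allowed("/reports/2024.html"))  # False: matches the /reports/ prefix
print(is_allowed("/index.html"))         # True: no Disallow prefix matches
print(parser.crawl_delay(AGENT))         # 5
```

Matching here is by path prefix, so /reports (without the trailing slash) is not covered by the /reports/ rule.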

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

msnbot-media

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

twitterbot

Rule Path
Disallow
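
The twitterbot group carries a Disallow rule with an empty path, which under the Robots Exclusion Protocol blocks nothing, in contrast to "Disallow /", which blocks everything. The sketch below shows that difference with Python's urllib.robotparser on a shortened, reconstructed subset of the groups in this scan; the subset, the version suffixes on the agent strings, and the ExampleBot name are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction of three groups from this scan (an assumption
# about the live file's layout, not a copy of it).
SAMPLE = """\
User-agent: *
Disallow: /admin/

User-agent: twitterbot
Disallow:

User-agent: ahrefsbot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(SAMPLE.splitlines())

URL = "http://cccaastats.org/admin/"

# A named group takes precedence over "*", so twitterbot is judged only by
# its own empty Disallow and may fetch a path that "*" blocks.
twitter_ok = rp.can_fetch("Twitterbot/1.0", URL)

# "Disallow: /" is a prefix of every path, so ahrefsbot is blocked site-wide.
ahrefs_ok = rp.can_fetch("AhrefsBot/7.0", "http://cccaastats.org/")

# An agent with no group of its own falls back to the "*" rules.
other_ok = rp.can_fetch("ExampleBot", URL)  # hypothetical agent name

print(twitter_ok, ahrefs_ok, other_ok)  # True False False
```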

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

baidu

Rule Path
Disallow /

Comments

  • Managed by PrestoSports sysadmin@prestosports.com