ceas.uc.edu
robots.txt

Robots Exclusion Standard data for ceas.uc.edu

Resource Scan

Scan Details

Site Domain ceas.uc.edu
Base Domain uc.edu
Scan Status Ok
Last Scan 2025-11-14T20:34:01+00:00
Next Scan 2025-12-14T20:34:01+00:00

Last Scan

Scanned 2025-11-14T20:34:01+00:00
URL https://ceas.uc.edu/robots.txt
Redirect https://www.ceas.uc.edu/robots.txt
Redirect Domain www.ceas.uc.edu
Redirect Base uc.edu
Domain IPs 151.101.131.10, 151.101.195.10, 151.101.3.10, 151.101.67.10
Redirect IPs 151.101.131.10, 151.101.195.10, 151.101.3.10, 151.101.67.10
Response IP 199.232.115.10
Found Yes
Hash 9ad6d64bd6436410742f9842bd14735c71f199694a75efc242a0e593ed79864b
SimHash 41401842e776

Groups

*

Rule Path
Disallow /content/dam/*
Disallow /system/
Disallow /libs/
Disallow /apps/
Disallow /bin/
Disallow /content/forms/
Disallow /content/experience-fragments/
Disallow /content/launches/
Disallow /content/campaigns/
Disallow /content/uc/news/articles/legacy/

googlebot

Rule Path
Allow /

siteimprove

Rule Path
Allow /

gptbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

tiktokspider

Rule Path
Disallow /

googlebot

Rule Path
Allow /
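
The groups listed above can be checked against sample paths with Python's standard-library robots.txt parser. The sketch below is illustrative only: the `ROBOTS_TXT` text is reconstructed from the group table in this report (the live file may differ in ordering, comments, and whitespace), and the helper names `build_parser` and `is_allowed` and the sample paths are hypothetical. Note that `urllib.robotparser` matches rules by plain path prefix, so the wildcard rule `/content/dam/*` is probably not evaluated the way a wildcard-aware crawler would evaluate it; the sample paths avoid that rule for this reason.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the group table in this report (an assumption, not a
# copy of the live file). The googlebot group appears twice in the report,
# so it is repeated here as well.
ROBOTS_TXT = """\
User-agent: *
Disallow: /content/dam/*
Disallow: /system/
Disallow: /libs/
Disallow: /apps/
Disallow: /bin/
Disallow: /content/forms/
Disallow: /content/experience-fragments/
Disallow: /content/launches/
Disallow: /content/campaigns/
Disallow: /content/uc/news/articles/legacy/

User-agent: googlebot
Allow: /

User-agent: siteimprove
Allow: /

User-agent: gptbot
Disallow: /

User-agent: bytespider
Disallow: /

User-agent: tiktokspider
Disallow: /

User-agent: googlebot
Allow: /
"""

# Redirect target recorded in the scan details.
BASE = "https://www.ceas.uc.edu"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # Hypothetical sample requests chosen to exercise each kind of group.
    samples = [
        ("Googlebot", "/system/"),
        ("Siteimprove", "/bin/"),
        ("GPTBot", "/"),
        ("Bytespider", "/about"),
        ("ExampleBot", "/libs/"),
        ("ExampleBot", "/about"),
    ]
    for agent, path in samples:
        verdict = "allowed" if is_allowed(agent, path) else "blocked"
        print(f"{agent:12} {path:10} {verdict}")
```

Under this reconstruction, agents with their own group (for example googlebot or gptbot) are judged only by that group, while any other agent falls back to the `*` group.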

Comments

  • robots.txt for production environments
  • Explicitly allow Googlebot
  • Allow Siteimprove bot
  • Block aggressive bots
  • Explicitly allow Googlebot