penn.museum
robots.txt

Robots Exclusion Standard data for penn.museum

Resource Scan

Scan Details

Site Domain penn.museum
Base Domain penn.museum
Scan Status Ok
Last Scan 2026-01-08T12:53:02+00:00
Next Scan 2026-02-07T12:53:02+00:00

Last Scan

Scanned 2026-01-08T12:53:02+00:00
URL https://penn.museum/robots.txt
Domain IPs 67.225.189.182
Response IP 67.225.189.182
Found Yes
Hash b26cf543abf9a32b30a441b78ad8822152186d49d8793b1f542e71a182055ce9
SimHash 4018df61c070
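
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. A minimal sketch, assuming the digest is SHA-256 over the raw response body (the helper name matches_recorded_hash is hypothetical), of how a later fetch could be compared against the recorded value:

```python
import hashlib

# Value copied from the Hash field of the scan above.
RECORDED_HASH = "b26cf543abf9a32b30a441b78ad8822152186d49d8793b1f542e71a182055ce9"


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Return True if the SHA-256 hex digest of `body` equals `recorded`.

    Assumption: the scanner hashes the raw, unmodified response body with
    SHA-256. The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == recorded.lower()
```

If the assumption holds, a mismatch on a fresh download of https://penn.museum/robots.txt would suggest the file has changed since the scan. If it does not hold, the comparison will simply never match.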

Groups

*

Rule Path
Disallow /administrator/
Disallow /api/
Disallow /bin/
Disallow /cache/
Disallow /cli/
Disallow /components/
Disallow /files/
Disallow /includes/
Disallow /language/
Disallow /layouts/
Disallow /libraries/
Disallow /logs/
Disallow /modules/
Disallow /plugins/
Disallow /styleguide/
Disallow /templates/
Disallow /tmp/
Disallow /index.php?option=com_ajax
Disallow /component/search

Other Records

Field Value
crawl-delay 15

googlebot

Rule Path
Allow /collections

Other Records

Field Value
crawl-delay 10

bingbot

Rule Path
Allow /collections

Other Records

Field Value
crawl-delay 10

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

upstreambot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

commoncrawler

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

tiktokspider

Rule Path
Disallow /

bytespider-image

Rule Path
Disallow /

bytedancespider

Rule Path
Disallow /

awemespider

Rule Path
Disallow /

googleother

Rule Path
Disallow /
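
The groups listed above can be exercised with Python's standard urllib.robotparser module. The sketch below is illustrative only: ROBOTS_EXCERPT is a partial reconstruction from the listing, not the file as served, and the agent name SomeOtherBot and the helper allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the groups listed above (assumption: the real
# file may order, space, or comment these records differently).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /administrator/
Disallow: /api/
Disallow: /component/search
Crawl-delay: 15

User-agent: googlebot
Allow: /collections
Crawl-delay: 10

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on penn.museum."""
    return parser.can_fetch(agent, "https://penn.museum" + path)


# gptbot has its own group with "Disallow: /", so everything is blocked.
print(allowed("GPTBot", "/collections"))       # False
# An agent with no named group falls back to the "*" group.
print(allowed("SomeOtherBot", "/api/items"))   # False
print(allowed("SomeOtherBot", "/collections")) # True
# googlebot matches its own group, so the "*" Disallow rules do not apply.
print(allowed("Googlebot", "/administrator/")) # True
print(parser.crawl_delay("Googlebot"))         # 10
print(parser.crawl_delay("SomeOtherBot"))      # 15
```

The Googlebot result illustrates a point that is easy to miss when reading the listing: a crawler that matches a named group uses only that group, so the Disallow rules under * are not inherited by googlebot or bingbot. Real crawlers may apply their own matching logic, and crawl-delay is a non-standard field that not every crawler honors.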

Comments

  • Default rules for all bots
  • Collections: allow crawling but slow down (Google currently ignores)
  • Block known AI and scraper bots