www.sas.upenn.edu
robots.txt
Robots Exclusion Standard data for www.sas.upenn.edu
Resource Scan
Scan Details
| Site Domain | www.sas.upenn.edu |
| Base Domain | upenn.edu |
| Scan Status | Ok |
| Last Scan | 2025-11-21T07:03:11+00:00 |
| Next Scan | 2025-12-21T07:03:11+00:00 |
Last Scan
| Scanned | 2025-11-21T07:03:11+00:00 |
| URL | https://www.sas.upenn.edu/robots.txt |
| Domain IPs | 52.42.92.117 |
| Response IP | 52.42.92.117 |
| Found | Yes |
| Hash | 3955f9eb1b39ead1c1dc7e7c8dc33fbb7fcf3523fa61544f42b0e937b236e8bb |
| SimHash | 3d9419134760 |
Groups
*
| Rule | Path |
|---|---|
| Allow | /core/*.css$ |
| Allow | /core/*.css? |
| Allow | /core/*.js$ |
| Allow | /core/*.js? |
| Allow | /core/*.gif |
| Allow | /core/*.jpg |
| Allow | /core/*.jpeg |
| Allow | /core/*.png |
| Allow | /core/*.svg |
| Allow | /profiles/*.css$ |
| Allow | /profiles/*.css? |
| Allow | /profiles/*.js$ |
| Allow | /profiles/*.js? |
| Allow | /profiles/*.gif |
| Allow | /profiles/*.jpg |
| Allow | /profiles/*.jpeg |
| Allow | /profiles/*.png |
| Allow | /profiles/*.svg |
| Disallow | /core/ |
| Disallow | /profiles/ |
| Disallow | /README.txt |
| Disallow | /web.config |
| Disallow | /admin/ |
| Disallow | /comment/reply/ |
| Disallow | /filter/tips |
| Disallow | /node/add/ |
| Disallow | /search/ |
| Disallow | /user/register |
| Disallow | /user/password |
| Disallow | /user/login |
| Disallow | /user/logout |
| Disallow | /media/oembed |
| Disallow | /*/media/oembed |
| Disallow | /index.php/admin/ |
| Disallow | /index.php/comment/reply/ |
| Disallow | /index.php/filter/tips |
| Disallow | /index.php/node/add/ |
| Disallow | /index.php/search/ |
| Disallow | /index.php/user/password |
| Disallow | /index.php/user/register |
| Disallow | /index.php/user/login |
| Disallow | /index.php/user/logout |
| Disallow | /index.php/media/oembed |
| Disallow | /index.php/*/media/oembed |
| Disallow | /search |
| Disallow | /system/ |
| Disallow | /administrator/ |
| Disallow | /wp-content/ |
| Disallow | /wp-admin/ |
| Disallow | /cgi-bin/ |
| Disallow | /core/ |
| Disallow | /wp-includes/ |
| Disallow | /wp/ |
| Disallow | /pantheon_healthcheck |
| Disallow | /pantheon_healthcheck/ |
| Disallow | /node/add/ |
| Disallow | /events/past-events |
| Disallow | /sites/www.math.upenn.edu/themes/bootstrap/ |
| Disallow | /?q=node%2Fadd |
| Disallow | /calendar/day/2023* |
| Disallow | /calendar/day/2024* |
| Disallow | /calendar/day/2022* |
| Disallow | /sites/default/files/*.pdf |
| Disallow | /application/core/ |
| Disallow | /*.pdf$ |
| Disallow | /*.xml$ |
| Disallow | /*.php |
| Disallow | /node?* |
| Disallow | /node/?* |
| Disallow | /ALF_DATA/ |
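
The `*` group above mixes `Allow` and `Disallow` rules that use the `*` wildcard and the `$` end anchor. The following is a minimal sketch of how a crawler might evaluate them, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and `Allow` wins a tie). The names `pattern_to_regex`, `is_allowed`, and `RULES` are hypothetical, and `RULES` holds only a small subset of the table.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Every other character (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumption: longest pattern wins, Allow wins ties, and a path that
    matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start, so patterns act as prefixes.
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "Allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed

# Illustrative subset of the rules listed for the '*' group.
RULES = [
    ("Allow", "/core/*.css$"),
    ("Allow", "/core/*.js?"),
    ("Disallow", "/core/"),
    ("Disallow", "/*.pdf$"),
    ("Disallow", "/search"),
]

print(is_allowed("/core/misc/drupal.css", RULES))  # True: the Allow pattern is longer than /core/
print(is_allowed("/core/install.php", RULES))      # False: only Disallow /core/ matches
print(is_allowed("/files/report.pdf", RULES))      # False: matches /*.pdf$
```

Under this reading, the `Allow` entries carve static assets out of the otherwise blocked `/core/` and `/profiles/` trees. Python's standard `urllib.robotparser` does not interpret `*` or `$` in paths, which is why the sketch does its own matching.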
openai-gpt
claudebot
gptbot
chatgpt-user
claude-web
semrushbot
brightbot
pingdombot
petalbot
barkrowler
go-http-client/1.1
pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
yandexbot
brightbot 1.0
ping*
bright*
chat*
pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
apache-httpclient/4.5.2 (java/1.8.0_161)
claude-user
claude-searchbot
ccbot
diffbot
perplexitybot
perplexity-user
omgili
omgilibot
webzio-extended
imagesiftbot
bytespider
tiktokspider
youbot
semrushbot-ocob
petalbot
velenpublicwebcrawler
turnitinbot
timpibot
oai-searchbot
icc-crawler
ai2bot
ai2bot-dolma
dataforseobot
awariobot
awariosmartbot
awariorssbot
pangubot
kangaroo bot
sentibot
img2dataset
meltwater
seekr
peer39_crawler
cohere-ai
cohere-training-data-crawler
duckassistbot
scrapy
cotoyogi
aihitbot
factset_spyderbot
firecrawlagent
velenpublicwebcrawler
| Rule | Path |
|---|---|
| Disallow | / |
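
The second group applies a single `Disallow: /` rule to every user agent listed above it. As a rough sketch of how a crawler might decide which group governs it, the code below compares its product token case-insensitively against the listed names and falls back to the `*` group otherwise. `NAMED_GROUP_AGENTS` is a hypothetical subset of the list, and the function names are illustrative. Entries such as `ping*` or `bright*` are not interpreted here, since wildcard handling inside user-agent names is an assumption that varies by crawler.

```python
# Hypothetical subset of the user agents listed for the second group.
NAMED_GROUP_AGENTS = frozenset({
    "gptbot",
    "claudebot",
    "ccbot",
    "perplexitybot",
    "bytespider",
    "semrushbot",
})

def select_group(product_token: str) -> str:
    """Return 'named' if the token is listed explicitly, else '*'.

    Assumption: exact, case-insensitive comparison of the product token.
    """
    token = product_token.strip().lower()
    return "named" if token in NAMED_GROUP_AGENTS else "*"

def is_fully_blocked(product_token: str) -> bool:
    """The named group carries only 'Disallow: /', so every path is blocked."""
    return select_group(product_token) == "named"

print(is_fully_blocked("GPTBot"))     # True
print(is_fully_blocked("Googlebot"))  # False: falls through to the '*' group
```

A crawler that falls through to `*` is still subject to the path rules in the first table, so "not fully blocked" does not mean unrestricted.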
Warnings
- 1 invalid line.
Comments