cs.ucla.edu
robots.txt

Robots Exclusion Standard data for cs.ucla.edu

Resource Scan

Scan Details

Site Domain cs.ucla.edu
Base Domain ucla.edu
Scan Status Ok
Last Scan 2024-09-25T04:06:12+00:00
Next Scan 2024-10-02T04:06:12+00:00

Last Scan

Scanned 2024-09-25T04:06:12+00:00
URL https://cs.ucla.edu/robots.txt
Redirect https://www.cs.ucla.edu/robots.txt
Redirect Domain www.cs.ucla.edu
Redirect Base ucla.edu
Domain IPs 164.67.100.182
Redirect IPs 164.67.100.182
Response IP 164.67.100.182
Found Yes
Hash b21ae4a58f3bc24891c87ae34549992098ce3eb6fb35cd59c7e424788f8a74ec
SimHash 281cde64a88b

Groups

*

Rule Path
Disallow /content-*
Disallow /wp-admin/*
Disallow /author/*
Disallow /category/uncategorized/*

Other Records

Field Value
crawl-delay 5
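
The crawl-delay record above is listed under "Other Records" because it is not part of the core Robots Exclusion Protocol, and crawlers differ in whether they honor it. A minimal sketch of how a polite client might apply it, assuming the value is meant as seconds between requests (the report does not state the unit); the names CRAWL_DELAY_SECONDS and seconds_to_wait are hypothetical:

```python
import time

# Value taken from the scan report; treating it as seconds is an assumption.
CRAWL_DELAY_SECONDS = 5.0


def seconds_to_wait(last_fetch, now, delay=CRAWL_DELAY_SECONDS):
    """Return how long to pause so that requests are at least `delay` apart.

    `last_fetch` and `now` are timestamps in seconds (e.g. from time.monotonic()).
    """
    return max(0.0, delay - (now - last_fetch))


def polite_pause(last_fetch):
    """Sleep for whatever remains of the delay, then return the new timestamp."""
    time.sleep(seconds_to_wait(last_fetch, time.monotonic()))
    return time.monotonic()
```

A crawler loop would call polite_pause(last_fetch) before each request and store the returned timestamp for the next one.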

yandexbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://samueli.ucla.edu/sitemap.xml
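
The groups listed above can be read as a lookup from crawler name to disallowed path patterns. The sketch below shows one way such rules might be evaluated. It is a simplified illustration, not the scanner's or any crawler's actual logic. Assumptions: the GROUPS dictionary is transcribed from this report rather than from the verbatim robots.txt file; a crawler is matched to a group by a case-insensitive substring test on its user-agent string; "*" in a path matches any run of characters; and, because the report lists no Allow rules, allow/disallow precedence is not handled. The names GROUPS, pattern_to_regex and is_allowed are hypothetical.

```python
import re

# Disallow patterns per group, transcribed from the scan report above.
GROUPS = {
    "*": [
        "/content-*",
        "/wp-admin/*",
        "/author/*",
        "/category/uncategorized/*",
    ],
    "yandexbot": ["/"],
    "semrushbot": ["/"],
    "mail.ru_bot": ["/"],
    "baiduspider": ["/"],
    "claudebot": ["/"],
}


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex anchored at the path start.

    '*' matches any sequence of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `path` is not disallowed for `user_agent`."""
    ua = user_agent.lower()
    # Use the first named group whose token appears in the user agent;
    # otherwise fall back to the catch-all '*' group.
    rules = next(
        (r for token, r in GROUPS.items() if token != "*" and token in ua),
        GROUPS["*"],
    )
    return not any(pattern_to_regex(p).match(path) for p in rules)
```

For example, is_allowed("ClaudeBot/1.0", "/") evaluates against the claudebot group and returns False, while a crawler with no named group, such as one identifying as "ExampleBot", falls back to the "*" group and is blocked only from the four listed path prefixes.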