Robots Exclusion Standard data for www.cs.ucla.edu

Resource Scan

Scan Details

Site Domain www.cs.ucla.edu
Base Domain ucla.edu
Scan Status Ok
Last Scan 2024-10-29T00:29:35+00:00
Next Scan 2024-11-28T00:29:35+00:00

Last Scan

Scanned 2024-10-29T00:29:35+00:00
URL https://www.cs.ucla.edu/robots.txt
Domain IPs 164.67.100.182
Response IP 164.67.100.182
Found Yes
Hash b21ae4a58f3bc24891c87ae34549992098ce3eb6fb35cd59c7e424788f8a74ec
SimHash 281cde64a88b
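
The Hash value is 64 hexadecimal digits long, which is consistent with a SHA-256 digest. The report does not name the algorithm or say which bytes are hashed, so the following is only a sketch under the assumption that it is a SHA-256 of the raw response body. The helper names sha256_hex and matches_report are hypothetical.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "b21ae4a58f3bc24891c87ae34549992098ce3eb6fb35cd59c7e424788f8a74ec"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check a fetched body against the reported hash.

    Assumption: the scanner hashes the unmodified body with SHA-256.
    If it normalizes the text first, this comparison will not match.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch on a fresh fetch could mean either that the file changed since the scan or that the assumption about the hashing scheme is wrong.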

Groups

*

Rule Path
Disallow /content-*
Disallow /wp-admin/*
Disallow /author/*
Disallow /category/uncategorized/*

Other Records

Field Value
crawl-delay 5
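
The four Disallow paths in the * group all end in a wildcard. A minimal sketch of how a crawler might test a URL path against them follows, assuming the common convention that * matches any run of characters and a trailing $ anchors the end of the path. The function names are hypothetical, and because this group has no Allow rules, no longest-match tie-breaking is needed.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW_ALL_AGENTS = [
    "/content-*",
    "/wp-admin/*",
    "/author/*",
    "/category/uncategorized/*",
]

# crawl-delay from "Other Records", in seconds. This field is a
# non-standard extension, so not every crawler honors it.
CRAWL_DELAY_SECONDS = 5


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW_ALL_AGENTS) -> bool:
    """Return True if the path matches any pattern from its start."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under these assumptions, is_disallowed("/wp-admin/options.php") is true, while is_disallowed("/wp-admin") is false because the rule requires the trailing slash.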

yandexbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /
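
The groups above can be read as a lookup from user-agent token to rule list. A crawler that matches a named group uses only that group and falls back to * when no named group matches. The sketch below illustrates that selection. It simplifies by treating the token as a case-insensitive substring of the User-Agent header, and select_group is a hypothetical name.

```python
# Groups as listed in the report: user-agent token -> Disallow paths.
GROUPS = {
    "*": [
        "/content-*",
        "/wp-admin/*",
        "/author/*",
        "/category/uncategorized/*",
    ],
    "yandexbot": ["/"],
    "semrushbot": ["/"],
    "mail.ru_bot": ["/"],
    "baiduspider": ["/"],
    "claudebot": ["/"],
}


def select_group(user_agent: str) -> str:
    """Pick the group token that applies to a User-Agent string.

    Simplification: a named group applies if its token appears anywhere
    in the lowercased User-Agent; otherwise the "*" group applies.
    """
    ua = user_agent.lower()
    for token in GROUPS:
        if token != "*" and token in ua:
            return token
    return "*"
```

For the five named crawlers the selected group contains only Disallow /, so the whole site is off limits to them and the path rules in the * group never come into play.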

Other Records

Field Value
sitemap https://samueli.ucla.edu/sitemap.xml