Robots Exclusion Standard data for www.cs.ucla.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | www.cs.ucla.edu |
Base Domain | ucla.edu |
Scan Status | Ok |
Last Scan | 2024-10-29T00:29:35+00:00 |
Next Scan | 2024-11-28T00:29:35+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-29T00:29:35+00:00 |
URL | https://www.cs.ucla.edu/robots.txt |
Domain IPs | 164.67.100.182 |
Response IP | 164.67.100.182 |
Found | Yes |
Hash | b21ae4a58f3bc24891c87ae34549992098ce3eb6fb35cd59c7e424788f8a74ec |
SimHash | 281cde64a88b |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /content-* |
Disallow | /wp-admin/* |
Disallow | /author/* |
Disallow | /category/uncategorized/* |
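
A minimal sketch of how a crawler might test a URL path against the Disallow rules listed above. It assumes the common interpretation in which `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `pattern_to_regex` and `is_allowed` are hypothetical and not part of the scan data.

```python
import re

# Disallow paths copied from the "*" group in the table above.
DISALLOW = [
    "/content-*",
    "/wp-admin/*",
    "/author/*",
    "/category/uncategorized/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where the pattern had "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, disallow=DISALLOW) -> bool:
    """Return True when no Disallow pattern matches the start of the path."""
    # re.match anchors at the beginning of the string, like robots.txt rules.
    return not any(pattern_to_regex(p).match(path) for p in disallow)


if __name__ == "__main__":
    for candidate in ["/about/", "/wp-admin/admin-ajax.php", "/content-archive"]:
        print(candidate, "allowed" if is_allowed(candidate) else "disallowed")
```

Because this group contains only Disallow rules, the sketch does not implement Allow/Disallow precedence; a file that mixes both would need a longest-match comparison.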
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
sitemap | https://samueli.ucla.edu/sitemap.xml |
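
The records above can be read back with Python's standard `urllib.robotparser`, as in the sketch below. The `ROBOTS_LINES` text is a reconstruction from the scan tables, not the original file: in particular, placing `Crawl-delay` inside the `*` group is an assumption, since the scan lists it separately and does not show its position. Note that the sitemap points at a different host (samueli.ucla.edu) than the scanned site.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables; layout and ordering are assumptions.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /content-*",
    "Disallow: /wp-admin/*",
    "Disallow: /author/*",
    "Disallow: /category/uncategorized/*",
    "Crawl-delay: 5",  # assumed to belong to the "*" group
    "Sitemap: https://samueli.ucla.edu/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# Seconds a crawler is asked to wait between requests, or None if unset.
delay = parser.crawl_delay("*")

# List of sitemap URLs declared in the file, or None if there are none.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("crawl delay:", delay)
    print("sitemaps:", sitemaps)
```

A crawler honoring these values would sleep for `delay` seconds between requests to www.cs.ucla.edu and fetch the sitemap from the listed URL.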