news.stanford.edu
robots.txt
Robots Exclusion Standard data for news.stanford.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | news.stanford.edu |
Base Domain | stanford.edu |
Scan Status | Ok |
Last Scan | 2025-07-01T22:27:23+00:00 |
Next Scan | 2025-07-31T22:27:23+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-07-01T22:27:23+00:00 |
URL | https://news.stanford.edu/robots.txt |
Domain IPs | 2.58.104.10, 2.58.104.11 |
Response IP | 2.58.104.11 |
Found | Yes |
Hash | 15ed03516b37b883797875dfd0ecd71247e243822ce64b6d28f6eaad9fde687e |
SimHash | 26047c946683 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /_designs/ |
Disallow | /*?sq_content_src= |
Disallow | /*_recache |
Disallow | /*_edit |
Disallow | /*_admin |
Disallow | /*_login |
Disallow | /*_performance |
Disallow | /*_design |
Disallow | /*_web_services |
Disallow | /__lib |
Disallow | /__fudge |
Disallow | /search |
Disallow | /*?*query= |
Disallow | /*?*f.*%7C*= |
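Several of the paths above use `*` as a wildcard, which Python's standard `urllib.robotparser` does not interpret. The following is a minimal sketch of how such patterns could be checked against a URL path, assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and the sketch skips percent-encoding normalization and `Allow` precedence, neither of which is needed for this group.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/_designs/",
    "/*?sq_content_src=",
    "/*_recache",
    "/*_edit",
    "/*_admin",
    "/*_login",
    "/*_performance",
    "/*_design",
    "/*_web_services",
    "/__lib",
    "/__fudge",
    "/search",
    "/*?*query=",
    "/*?*f.*%7C*=",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    # re.match anchors at the start of the string, so each rule is a prefix match.
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    for candidate in ["/search?query=x", "/report/_edit", "/2025/06/story"]:
        print(candidate, is_disallowed(candidate))
```

Because every rule in this group is a `Disallow`, a single match is enough to block a path; a file that mixes `Allow` and `Disallow` rules would also need longest-match precedence.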
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
sitemap | https://news.stanford.edu/sitemap.xml |
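As a rough illustration of how a crawler might read these two records, the sketch below feeds a small robots.txt excerpt to Python's standard `urllib.robotparser`. The excerpt is an assumption: it is reconstructed from the values reported here, not copied from the live file, and it places `Crawl-delay` inside the `*` group, which the scan does not confirm. Only one `Disallow` line is included to keep it short.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt rebuilt from the scan values above; the real file's
# layout and ordering are not shown in this report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Crawl-delay: 10
Sitemap: https://news.stanford.edu/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a crawler is asked to wait between requests (None if unspecified).
delay = parser.crawl_delay("*")

# Sitemap URLs declared in the file (requires Python 3.8+; None if absent).
sitemaps = parser.site_maps()

print("crawl delay:", delay)
print("sitemaps:", sitemaps)
```

A crawler honoring the delay would pause for `delay` seconds between successive requests to the host, for example with `time.sleep(delay)`.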