Robots Exclusion Standard data for students.ubc.ca

Resource Scan

Scan Details

Site Domain students.ubc.ca
Base Domain ubc.ca
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-05-24T22:48:57+00:00
Next Scan 2025-06-23T22:48:57+00:00

Last Successful Scan

Scanned 2025-04-02T21:03:20+00:00
URL https://students.ubc.ca/robots.txt
Domain IPs 141.193.213.10, 141.193.213.11
Response IP 141.193.213.10
Found Yes
Hash a83a6f21d1f2080b2fd08f52d626b0e34ae790ff6cd1618d336d97a6cc744268
SimHash ea029d08a114

Groups

*

Rule Path
Disallow /wp-login.php
Disallow /wp-register.php
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Allow /wp-content/uploads/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-includes/
Allow /wp-includes/js/
Allow /wp-includes/images/
Disallow /README.md
Disallow /taxonomy/
Disallow /trackback/
Disallow /?s=
Disallow /page/
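
The `*` group mixes Allow and Disallow rules on overlapping paths (for example `/wp-admin/` and `/wp-admin/admin-ajax.php`), so the outcome depends on how a crawler resolves conflicts. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific rule wins, and Allow wins a tie). The names `RULES` and `is_allowed` are illustrative, not part of the scan, and individual crawlers may behave differently.

```python
# Rules of the "*" group, copied from the scan in their listed order.
RULES = [
    ("disallow", "/wp-login.php"),
    ("disallow", "/wp-register.php"),
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/wp-content/uploads/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/themes/"),
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/js/"),
    ("allow", "/wp-includes/images/"),
    ("disallow", "/README.md"),
    ("disallow", "/taxonomy/"),
    ("disallow", "/trackback/"),
    ("disallow", "/?s="),
    ("disallow", "/page/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` (including any query string) may be crawled.

    Assumption: plain prefix matching with longest-match precedence.
    Wildcards ("*", "$") are not handled; none of the rules above use them.
    """
    best_len = -1
    best_allow = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if path.startswith(prefix):
            allow = kind == "allow"
            # A longer prefix is more specific; on equal length, allow wins.
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len = len(prefix)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for p in ["/wp-admin/", "/wp-admin/admin-ajax.php", "/?s=housing", "/courses/"]:
        print(p, is_allowed(p))
```

Under this reading, `/wp-admin/admin-ajax.php` stays crawlable even though `/wp-admin/` is blocked, because the Allow rule is the longer match.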

amazonbot
barkrowler
claudebot
drupal
go-http-client
imagesiftbot
seekportbot
semanticscholarbot
semrushbot
yandexbot

Rule Path
Disallow /
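
The second group applies a single `Disallow /` to ten named agents, while every other crawler falls back to the `*` group. A rough sketch of that group selection, assuming case-insensitive matching of the crawler's product token as in RFC 9309; `BLOCKED_AGENTS`, `select_group`, and `is_fully_blocked` are hypothetical names used only for illustration.

```python
# Agent names as listed in the scan (already lowercased there).
BLOCKED_AGENTS = {
    "amazonbot",
    "barkrowler",
    "claudebot",
    "drupal",
    "go-http-client",
    "imagesiftbot",
    "seekportbot",
    "semanticscholarbot",
    "semrushbot",
    "yandexbot",
}


def select_group(product_token):
    """Return the name of the group that applies to a crawler.

    Assumption: the token is compared case-insensitively against the
    listed agents; a crawler with no named group uses the "*" group.
    """
    token = product_token.lower()
    return token if token in BLOCKED_AGENTS else "*"


def is_fully_blocked(product_token):
    """True if the crawler falls under the group whose only rule is 'Disallow /'."""
    return select_group(product_token) != "*"


if __name__ == "__main__":
    for agent in ["ClaudeBot", "SemrushBot", "Googlebot"]:
        print(agent, select_group(agent), is_fully_blocked(agent))
```

A crawler that matches a named group uses only that group's rules, so the Allow entries in the `*` group do not apply to the ten agents listed here.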

Other Records

Field Value
sitemap https://students.ubc.ca/sitemap.xml

Comments

  • This virtual robots.txt file was created by the Virtual Robots.txt WordPress plugin: https://www.wordpress.org/plugins/pc-robotstxt/
  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "robots" where not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see: https://www.robotstxt.org/robotstxt.html
  • Don't crawl search results
  • Block unwanted bots
  • Sitemap
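
The comment about root placement can be illustrated with a short sketch that derives the robots.txt location a crawler would consult for any page URL. The helper name `robots_url` is an assumption for illustration; it uses only the standard library's `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the root-level robots.txt URL for the host serving `page_url`.

    The path, query, and fragment of the page are discarded, since only
    the file at the root of the host is consulted.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://example.com/site/page.html"))
    print(robots_url("https://students.ubc.ca/page/2/?x=1"))
```

Both a page under `/site/` and a page at the top level resolve to the same root file, which is why a copy placed at `/site/robots.txt` is never read.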