ujdigital.uj.ac.za
robots.txt

Robots Exclusion Standard data for ujdigital.uj.ac.za

Resource Scan

Scan Details

Site Domain ujdigital.uj.ac.za
Base Domain uj.ac.za
Scan Status Ok
Last Scan 2025-11-29T03:26:08+00:00
Next Scan 2025-12-29T03:26:08+00:00

Last Scan

Scanned 2025-11-29T03:26:08+00:00
URL https://ujdigital.uj.ac.za/robots.txt
Domain IPs 152.106.6.14
Response IP 152.106.6.14
Found Yes
Hash 5642b638cd9e2530a7725f5e6bd74410cb6063d474e11087674ee5441f2db576
SimHash 6d22ea118513
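
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. As a minimal sketch under that assumption, a freshly downloaded copy of the file could be compared against the recorded value like this (the helper names `sha256_hex` and `matches_report` are hypothetical):

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "5642b638cd9e2530a7725f5e6bd74410cb6063d474e11087674ee5441f2db576"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a body against the reported hash.

    Assumption: the report hashes the unmodified response body with SHA-256.
    If the scanner normalizes line endings or uses another algorithm, this
    comparison fails even for an unchanged file.
    """
    return sha256_hex(body) == reported.lower()
```

To use it, pass the bytes returned by a request to https://ujdigital.uj.ac.za/robots.txt; a `False` result means either that the file has changed since the scan or that the assumption about the algorithm is wrong.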

Groups

*

Rule Path
Allow /assets/
Allow /lib/
Allow /Home/
Allow /Course/
Allow /Search/
Allow /Identity/Account/Login
Allow /Identity/Account/Register
Allow /Identity/Account/ForgotPassword
Allow /Identity/Account/ResetPassword
Disallow /Identity/Account/Manage/
Disallow /Identity/Account/Logout
Disallow /Admin/
Disallow /Error/
Disallow /api/
Disallow /uploads/
Disallow /templates/
Allow /sitemap.xml
Allow /robots.txt
Allow /favicon.ico
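
To see how the `*` group above applies to individual URLs, the following sketch feeds a reconstruction of that group to Python's standard `urllib.robotparser`. The directive text is rebuilt from the rule table, not copied from the live file, and the agent name `ExampleBot` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction of the "*" group from the rule table above (an assumption:
# the live file may differ in comments, blank lines, or whitespace).
WILDCARD_GROUP = """\
User-agent: *
Allow: /assets/
Allow: /lib/
Allow: /Home/
Allow: /Course/
Allow: /Search/
Allow: /Identity/Account/Login
Allow: /Identity/Account/Register
Allow: /Identity/Account/ForgotPassword
Allow: /Identity/Account/ResetPassword
Disallow: /Identity/Account/Manage/
Disallow: /Identity/Account/Logout
Disallow: /Admin/
Disallow: /Error/
Disallow: /api/
Disallow: /uploads/
Disallow: /templates/
Allow: /sitemap.xml
Allow: /robots.txt
Allow: /favicon.ico
"""

BASE = "https://ujdigital.uj.ac.za"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` on the scanned site."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(WILDCARD_GROUP)

# Hypothetical sample paths, chosen so that at most one rule prefix matches each.
for path in ["/Course/intro", "/Admin/users", "/Identity/Account/Manage/Email", "/About"]:
    print(path, is_allowed(parser, path))
```

Rule paths are matched as case-sensitive prefixes, and a path that matches no rule, such as the hypothetical `/About`, is allowed by default. The Allow lines in this group therefore document intent more than they change the outcome.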

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 15

duckduckbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 15

badbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://ujdigital.uj.ac.za/sitemap.xml
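
The named groups above interact with the `*` group in a way that is easy to miss: a crawler follows only the most specific group matching its name, so slurp and duckduckbot, whose groups carry only a crawl-delay, are not bound by the `*` Disallow rules. The sketch below illustrates this with Python's standard `urllib.robotparser`. The file text is a reconstruction from the group listing with the `*` group abbreviated to two rules, and `ExampleBot` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the group listing above; an assumption, not the live file.
# The "*" group is abbreviated to two of its rules to keep the example short.
RECONSTRUCTED = """\
User-agent: *
Allow: /Course/
Disallow: /Admin/

User-agent: slurp
Crawl-delay: 15

User-agent: duckduckbot
Crawl-delay: 15

User-agent: badbot
Disallow: /

Sitemap: https://ujdigital.uj.ac.za/sitemap.xml
"""

BASE = "https://ujdigital.uj.ac.za"

robots = RobotFileParser()
robots.parse(RECONSTRUCTED.splitlines())  # parse in memory, no network access

# A crawler with no group of its own falls back to "*".
generic_admin = robots.can_fetch("ExampleBot", BASE + "/Admin/users")

# slurp has its own group with no path rules, so "*" does not apply to it.
slurp_admin = robots.can_fetch("slurp", BASE + "/Admin/users")

# badbot is denied everything, including paths that "*" allows.
badbot_course = robots.can_fetch("badbot", BASE + "/Course/intro")

# Crawl-delay is reported per group; crawlers under "*" get no delay here.
slurp_delay = robots.crawl_delay("slurp")
generic_delay = robots.crawl_delay("ExampleBot")

# Sitemap records are independent of any group (site_maps needs Python 3.8+).
sitemaps = robots.site_maps()

print(generic_admin, slurp_admin, badbot_course, slurp_delay, generic_delay, sitemaps)
```

If the intent is for slurp and duckduckbot to respect the same restricted areas as every other crawler, the Disallow lines would need to be repeated inside their groups. Note also that `crawl-delay` is a non-standard directive that not every crawler honors.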

Comments

  • Global rules
  • Allow general access to site assets needed for rendering
  • Allow important public pages
  • Disallow sensitive areas
  • Allow sitemap and important files
  • Specific rules for different bots
  • User-agent: Googlebot
  • Allow: / # If you need to override a general disallow for Googlebot specifically
  • User-agent: Bingbot
  • Allow: / # If you need to override a general disallow for Bingbot specifically
  • Block bad bots
  • Sitemap location