Robots Exclusion Standard data for unisq.edu.au

Resource Scan

Scan Details

Site Domain unisq.edu.au
Base Domain unisq.edu.au
Scan Status Ok
Last Scan 2024-08-29T15:52:39+00:00
Next Scan 2024-09-28T15:52:39+00:00

Last Scan

Scanned 2024-08-29T15:52:39+00:00
URL https://unisq.edu.au/robots.txt
Redirect https://www.unisq.edu.au/robots.txt
Redirect Domain www.unisq.edu.au
Redirect Base unisq.edu.au
Domain IPs 139.86.7.80
Redirect IPs 139.86.7.80
Response IP 139.86.7.80
Found Yes
Hash 398f13d084b886338beae189896248cdbcde6a819b42bf0fcf3b6793632d8a83
SimHash b944d516adf8
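
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes SHA-256 to illustrate how a content hash can flag a changed robots.txt between scans. The function names `content_fingerprint` and `has_changed` are hypothetical, and the sample bytes are not the real file.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw response body.

    SHA-256 is an assumption; the scan report does not say which
    algorithm produced its Hash field.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return content_fingerprint(body) != previous_hash


# Illustrative bytes only, not the actual robots.txt content.
first_scan = b"User-agent: *\nDisallow: /handbook/\n"
second_scan = b"User-agent: *\nDisallow: /handbook/\nDisallow: /course/\n"

stored = content_fingerprint(first_scan)
print(len(stored))                       # digest length in hex characters
print(has_changed(stored, first_scan))   # same body, so no change
print(has_changed(stored, second_scan))  # body differs, so change detected
```

An exact hash like this changes on any byte difference, which is presumably why the report also lists a separate SimHash for near-duplicate comparison.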

Groups

*

Rule Path
Disallow /handbook/
Disallow /course/
Disallow /repec/
Disallow /extrafiles/

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

trendkite-akashic-crawler

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.unisq.edu.au/sitemap.xml
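
The groups and the sitemap record listed above can be checked with Python's standard `urllib.robotparser`. The sketch below is a minimal illustration: the `ROBOTS_TXT` string is an assumed reconstruction from the listed groups, not a copy of the real file, and `ExampleBot` is a hypothetical user agent standing in for any crawler that is not named explicitly.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction from the groups listed in this report;
# the real file's layout and comments are not reproduced here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /handbook/
Disallow: /course/
Disallow: /repec/
Disallow: /extrafiles/

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

User-agent: trendkite-akashic-crawler
Disallow: /

Sitemap: https://www.unisq.edu.au/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://www.unisq.edu.au"

# "ExampleBot" is hypothetical; it falls through to the "*" group.
generic_home = parser.can_fetch("ExampleBot", BASE + "/")
generic_handbook = parser.can_fetch("ExampleBot", BASE + "/handbook/")
# mj12bot has its own group that disallows the whole site.
mj12_home = parser.can_fetch("mj12bot", BASE + "/")
# site_maps() requires Python 3.8 or later.
sitemaps = parser.site_maps()

print(generic_home, generic_handbook, mj12_home)
print(sitemaps)
```

Different parsers match user-agent tokens differently, particularly for a token with a version suffix such as `dotbot/1.0`, so results for that group may vary between libraries and real crawlers.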

Comments

  • Robots text file
  • No indexing of these areas.
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block trendkite-akashic-crawler