Robots Exclusion Standard data for cpbr.gov.au

Resource Scan

Scan Details

Site Domain cpbr.gov.au
Base Domain cpbr.gov.au
Scan Status Ok
Last Scan 2024-10-21T20:02:59+00:00
Next Scan 2024-11-20T20:02:59+00:00

Last Scan

Scanned 2024-10-21T20:02:59+00:00
URL https://cpbr.gov.au/robots.txt
Domain IPs 108.156.133.67, 108.156.133.75, 108.156.133.82, 108.156.133.83
Response IP 108.156.133.67
Found Yes
Hash 2c16037ddf9eb13cba5bc9937d0cc9db8518a6177f052c595b96f0df9d7d1e6b
SimHash faa6189d8554

Groups

*

Rule      Path                             Comment
Disallow  /cgi-bin                         don't want anyone in here
Disallow  /cool                            old Conservation on-line pages
Disallow  /delta                           old DELTA pages
Disallow  /hiscom                          old HISCOM information
Disallow  /chah/apc/interim/               interim pages only
Disallow  /chah/apc/families-treated.html  interim pages only
Disallow  /images/pp-pics                  staff photos for PowerPoint
Disallow  /internal                        staff web directory
Disallow  /jrc/kayak                       old NSWSKC pages
Disallow  /logs                            uninformative
Disallow  /lost%2Bfound                    system garbage
Disallow  /pink                            old Olive Pink pages
Disallow  /projects/.hidden                work in progress
Disallow  /restricted                      password-restricted pages
Disallow  /taf                             temporary Taxonomic workshop pages
Disallow  /temp                            temporary work area
Disallow  /test                            test area
Disallow  /tmp                             temporary work area
Disallow  /tour                            Murray's play area
Disallow  /project/fern                    old fern site
Disallow  /cpbr/cpbr-staff.html            cpbr staff address list
Disallow  /xxx                             temporary work area
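
To illustrate how a crawler would interpret the rules listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `RULES` text is a hand-written reconstruction of a few rows of the table under a `User-agent: *` group, not the actual file served by cpbr.gov.au, and the helper names `build_parser` and `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a partial reconstruction of the "*" group from the table,
# not the real robots.txt content.
RULES = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /chah/apc/interim/
Disallow: /internal
Disallow: /test
Disallow: /tmp
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "*") -> bool:
    """Return True if the given agent may fetch the given path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    parser = build_parser(RULES)
    for path in ("/index.html", "/cgi-bin/search", "/testing.html",
                 "/chah/apc/interim", "/chah/apc/interim/page.html"):
        print(path, is_allowed(parser, path))
```

Disallow rules are matched as path prefixes, so in this sketch `/test` also blocks `/testing.html`, while `/chah/apc/interim/` (with the trailing slash) leaves `/chah/apc/interim` itself fetchable.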

agls
bep

No rules defined. All paths allowed.

Comments

  • robots.txt
  • file to exclude robots from indexing parts of the directory
  • to reduce clutter and unwanted hits on global search engines
  • The following line tells all robots to obey ...
  • and this is what they must obey - all files below the path-root
  • Please add new entries in alphabetical order
  • Special instructions for AGLS metadata

Warnings

  • `meta` is not a known field.