newzealand.com
robots.txt

Robots Exclusion Standard data for newzealand.com

Resource Scan

Scan Details

Site Domain newzealand.com
Base Domain newzealand.com
Scan Status Ok
Last Scan 2024-09-29T11:47:12+00:00
Next Scan 2024-10-29T11:47:12+00:00

Last Scan

Scanned 2024-09-29T11:47:12+00:00
URL https://newzealand.com/robots.txt
Redirect https://www.newzealand.com/robots.txt
Redirect Domain www.newzealand.com
Redirect Base newzealand.com
Domain IPs 23.210.110.190
Redirect IPs 23.210.110.190, 2600:1413:b000:386::1ef0, 2600:1413:b000:38c::1ef0
Response IP 23.51.45.182
Found Yes
Hash 00089035d7bde04dc68858c970373780157a6be586c889c6829e7d73c038cdb4
SimHash 325e9fe8cef8

Groups

*

Rule Path
Disallow /api/
Disallow /admin/
Disallow /dev/
Disallow /health/check/
Disallow /Security/
Disallow /CMSSecurity/
Disallow /RemoveOrphanedPagesTask/
Disallow /SiteTreeMaintenanceTask/
Disallow /UserDefinedFormController/
Disallow /InstallerTest/
Disallow /SapphireInfo/
Disallow /SapphireREPL/
Disallow /farefinder/
Disallow /_proxy
Disallow /*/utilities/search/
Disallow /*/utilities/product-overview-transport/
Disallow /*/listing/*/
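
Several paths in the `*` group use the `*` wildcard, so a plain prefix comparison is not enough to evaluate them. The following is a minimal sketch, not the scanner's or the site's own logic, of how a crawler might test a URL path against these rules. It assumes the matching described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a case-sensitive prefix match). The names `DISALLOW`, `pattern_to_regex` and `is_disallowed` are hypothetical.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/api/", "/admin/", "/dev/", "/health/check/", "/Security/",
    "/CMSSecurity/", "/RemoveOrphanedPagesTask/", "/SiteTreeMaintenanceTask/",
    "/UserDefinedFormController/", "/InstallerTest/", "/SapphireInfo/",
    "/SapphireREPL/", "/farefinder/", "/_proxy",
    "/*/utilities/search/", "/*/utilities/product-overview-transport/",
    "/*/listing/*/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    # re.match anchors at the beginning only, which gives prefix semantics.
    return any(rule.match(path) for rule in RULES)


print(is_disallowed("/nz/utilities/search/"))  # True
print(is_disallowed("/about/"))                # False
```

Because this group contains no Allow rules, the sketch skips the longest-match precedence step that would otherwise be needed to resolve conflicts between Allow and Disallow.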

Other Records

Field Value
crawl-delay 5

http://www.almaden.ibm.com/cs/crawler
bordermanager*
webcollage*
java*
grub-client
lwp*
linkwalker
offline explorer
larbin
mj12bot
blexbot
dotbot
yeti

Rule Path
Disallow /
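
The report lists three groups: the default `*` group, the group of disabled crawlers with a single `Disallow /`, and the `swiftbot` group. As a rough illustration of how a crawler might decide which group applies to it, the sketch below does a case-insensitive substring comparison against the listed tokens. Treating a trailing `*` in a token such as `java*` as "starts with this name" is an assumption made here, since wildcards in user-agent lines are not part of RFC 9309. The names `BLOCKED_TOKENS` and `group_for` are hypothetical.

```python
# User-agent tokens copied from the disabled-crawlers group above.
BLOCKED_TOKENS = [
    "http://www.almaden.ibm.com/cs/crawler", "bordermanager*", "webcollage*",
    "java*", "grub-client", "lwp*", "linkwalker", "offline explorer",
    "larbin", "mj12bot", "blexbot", "dotbot", "yeti",
]


def group_for(user_agent: str) -> str:
    """Pick the group label that would apply to a User-Agent string."""
    ua = user_agent.lower()
    # The swiftbot group defines no rules, so every path is allowed for it.
    if "swiftbot" in ua:
        return "swiftbot"
    # Assumption: drop the trailing "*" and compare as a plain substring.
    if any(token.rstrip("*").lower() in ua for token in BLOCKED_TOKENS):
        return "blocked"
    # Everything else falls back to the default "*" group.
    return "*"


print(group_for("Mozilla/5.0 (compatible; MJ12bot/v1.4.8)"))  # blocked
print(group_for("Googlebot/2.1"))                             # *
```

A substring test is deliberately loose; a stricter crawler would compare only the product token at the start of its User-Agent string.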

swiftbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 0.25
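
The two `crawl-delay` records (5 for the `*` group, 0.25 for `swiftbot`) are usually read as a minimum number of seconds between successive requests, although the field is not part of RFC 9309 and not every crawler honours it. Under that interpretation, a minimal sketch of the arithmetic looks like this. The names `CRAWL_DELAY_SECONDS`, `max_requests_per_minute` and `seconds_to_wait` are hypothetical.

```python
# Delay values copied from the report; interpreting them as seconds
# between requests is an assumption, since crawl-delay is non-standard.
CRAWL_DELAY_SECONDS = {"*": 5.0, "swiftbot": 0.25}


def max_requests_per_minute(delay: float) -> float:
    """Upper bound on request rate implied by a per-request delay."""
    return 60.0 / delay


def seconds_to_wait(delay: float, last_request: float, now: float) -> float:
    """How long to sleep before the next request may be sent."""
    elapsed = now - last_request
    return max(0.0, delay - elapsed)


print(max_requests_per_minute(CRAWL_DELAY_SECONDS["*"]))         # 12.0
print(max_requests_per_minute(CRAWL_DELAY_SECONDS["swiftbot"]))  # 240.0
print(seconds_to_wait(5.0, last_request=100.0, now=102.0))       # 3.0
```

In a real crawler the timestamps would come from a monotonic clock such as `time.monotonic()`, and the returned value would be passed to `time.sleep()`.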

Other Records

Field Value
sitemap https://www.newzealand.com/sitemap.xml

Comments

  • robots-prod.txt
  • Production Robots File
  • 20190824 0749
  • ----- DEFAULT CRAWLER RULES -----
  • - RESOURCE PATHS SS -
  • - RESOURCE PATHS ALACRITY -
  • - CONTENT EDITION PATHS -
  • ----- DISABLED CRAWLERS -----
  • ----- SITEMAP
  • ----- Swiftype specific config

Warnings

  • 3 invalid lines.