Robots Exclusion Standard data for cnwl.nhs.uk

Resource Scan

Scan Details

Site Domain cnwl.nhs.uk
Base Domain cnwl.nhs.uk
Scan Status Ok
Last Scan 2024-09-12T00:11:11+00:00
Next Scan 2024-10-12T00:11:11+00:00

Last Scan

Scanned 2024-09-12T00:11:11+00:00
URL https://cnwl.nhs.uk/robots.txt
Redirect https://www.cnwl.nhs.uk/robots.txt
Redirect Domain www.cnwl.nhs.uk
Redirect Base cnwl.nhs.uk
Domain IPs 45.157.41.19
Redirect IPs 45.157.41.19
Response IP 45.157.41.19
Found Yes
Hash 31231fe1158ecee6f1cb1d2398b62fd1190c3a75b83c91b7a11628fde7b6284e
SimHash 2e4ed7facd60

Groups

*

Rule Path
Allow /application/files/cache/css
Allow /application/files/cache/js
Allow /concrete/css
Allow /concrete/js
Disallow /application/attributes
Disallow /application/authentication
Disallow /application/bootstrap
Disallow /application/config
Disallow /application/controllers
Disallow /application/elements
Disallow /application/helpers
Disallow /application/jobs
Disallow /application/languages
Disallow /application/mail
Disallow /application/models
Disallow /application/page_types
Disallow /application/single_pages
Disallow /application/tools
Disallow /application/views
Disallow /ccm/system/captcha/picture
Disallow /application/files
Disallow /application/files/*
Disallow /download_file
Disallow /download_file/*
Disallow /concrete
Disallow /concrete/*
Disallow /demo
Disallow /demo/*
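
The interplay between the Allow and Disallow rows above can be checked with a short script. This is a minimal sketch using Python's standard `urllib.robotparser`. The fragment is rebuilt from the rows in this report rather than copied from the live file, and the agent name `ExampleBot` and the sample paths are hypothetical. Python's parser applies the first matching rule in file order and does not expand `*` wildcards in paths, so the wildcard rows are left out; other crawlers may resolve overlaps differently (for example by longest match).

```python
from urllib.robotparser import RobotFileParser

# Illustrative fragment rebuilt from the "*" group in this report.
# The live robots.txt may order or spell these lines differently.
ROBOTS_FRAGMENT = """\
User-agent: *
Allow: /application/files/cache/css
Allow: /application/files/cache/js
Allow: /concrete/css
Allow: /concrete/js
Disallow: /application/files
Disallow: /download_file
Disallow: /concrete
Disallow: /demo
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

BASE = "https://www.cnwl.nhs.uk"


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, BASE + path)


# Sample paths are made up for illustration only.
for path in ("/concrete/css/app.css", "/concrete/images/logo.png",
             "/demo/page", "/contact-us"):
    print(path, is_allowed(path))
```

In this sketch the CSS path is permitted because its Allow row is listed before the broader `Disallow /concrete` row, while paths with no matching row are permitted by default.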

Other Records

Field Value
crawl-delay 20
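
A crawler that honours the `crawl-delay` record would pause 20 seconds between requests. The sketch below reads the value with `urllib.robotparser` and estimates a lower bound on crawl time. The fragment is a shortened reconstruction, and the agent name `ExampleBot` and the helper `estimated_minutes` are hypothetical. Not every crawler supports `crawl-delay`, so this shows how a compliant client might behave rather than describing any particular bot.

```python
from urllib.robotparser import RobotFileParser

# Shortened, illustrative fragment: one rule plus the crawl-delay record.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /demo
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# "ExampleBot" is a hypothetical agent; it falls back to the "*" group.
delay = parser.crawl_delay("ExampleBot")


def estimated_minutes(page_count, delay_seconds):
    """Lower bound on crawl time when pausing delay_seconds between requests."""
    if page_count <= 0:
        return 0.0
    return (page_count - 1) * delay_seconds / 60


print(delay, estimated_minutes(100, delay))
```

At this rate, fetching 100 pages takes at least 33 minutes of waiting alone, not counting download time.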

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

webbandit

Rule Path
Disallow /

webzip

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

web downloader

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

offline explorer pro

Rule Path
Disallow /

httrack website copier

Rule Path
Disallow /

offline commander

Rule Path
Disallow /

leech

Rule Path
Disallow /

websnake

Rule Path
Disallow /

blackwidow

Rule Path
Disallow /

http weazel

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

linkedinbot

Rule Path
Disallow /

mauibot (crawler.feedback+wc@gmail.com)

Rule Path
Disallow /

safednsbot (https://www.safedns.com/searchbot)

Rule Path
Disallow /

exabot

Rule Path
Disallow /

pingdom bot

Rule Path
Disallow /

adsbot

Rule Path
Disallow /

applewebkit

Rule Path
Disallow /

adsbot-google

Rule Path
Disallow
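
The Disallow row above has an empty path. Under the Robots Exclusion Standard, an empty Disallow value restricts nothing, so this group permits the agent to fetch everything. A minimal sketch with Python's `urllib.robotparser` follows. The two-group fragment is a reconstruction for illustration: the `Googlebot-News` group is included only as a contrast, and the sample URL is hypothetical. Python matches agent names by substring in file order, so the broader `adsbot` group from this report is deliberately left out of the fragment.

```python
from urllib.robotparser import RobotFileParser

# Illustrative fragment: an empty Disallow next to a full Disallow.
ROBOTS_FRAGMENT = """\
User-agent: AdsBot-Google
Disallow:

User-agent: Googlebot-News
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

URL = "https://www.cnwl.nhs.uk/contact-us"  # hypothetical page

# Empty Disallow: nothing is blocked for this agent.
adsbot_google_ok = parser.can_fetch("AdsBot-Google", URL)

# "Disallow: /" blocks every path for this agent.
googlebot_news_ok = parser.can_fetch("Googlebot-News", URL)

print(adsbot_google_ok, googlebot_news_ok)
```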

googlebot-news

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /

baidu spider 2.0

Rule Path
Disallow /

semrush crawler 2.0

Rule Path
Disallow /

twitterbot

Rule Path
Allow /application/files/thumbnails
Allow /application/files/thumbnails/*

Comments

  • Allow access to CSS and JS assets
  • 20-1-21 pingdom
  • 20-1-21 Ads
  • 20-1-21 Apple
  • 14-12-23 Google AdsBot
  • 20-1-21 Google BotNews
  • 20-1-21 Google media partners
  • 20-1-21 Baidu
  • extra SEMRush UA
  • Allow Twitter bot access to files