waterford-news.ie
robots.txt

Robots Exclusion Standard data for waterford-news.ie

Resource Scan

Scan Details

Site Domain waterford-news.ie
Base Domain waterford-news.ie
Scan Status Ok
Last Scan 2024-06-17T01:45:52+00:00
Next Scan 2024-06-24T01:45:52+00:00

Last Scan

Scanned 2024-06-17T01:45:52+00:00
URL https://waterford-news.ie/robots.txt
Domain IPs 213.182.15.181
Response IP 213.182.15.181
Found Yes
Hash b82165928340be6299e5b763cda6505a340135a7bb48051a93baeae492186e8a
SimHash 0224c1554074

Groups

*

Rule Path
Disallow /cms_addon
Disallow /cms_docs
Disallow /redFACT
Disallow /REST/frontend/itemstatistics

*

Rule Path
Disallow /pu_all
Allow /pu_all/img
Disallow /pu_carlow/
Allow /pu_carlow/img
Disallow /pu_kildare/
Allow /pu_kildare/img
Disallow /pu_laois/
Allow /pu_laois/img
Disallow /pu_roscommon/
Allow /pu_roscommon/img
Disallow /pu_waterford/
Allow /pu_waterford/img
Disallow /pu_western/
Allow /pu_western/img
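
The paired Disallow/Allow rules above only work as intended if a crawler applies longest-match precedence, as described in RFC 9309: the most specific (longest) matching path wins, and Allow wins a tie. The sketch below illustrates that evaluation for the two `*` groups, which RFC 9309 says should be combined into one. It is a minimal illustration, not the scanner's or any crawler's actual implementation. The function name `is_allowed` and the `RULES` list are hypothetical, and wildcard (`*`, `$`) handling is left out because none of the listed rules use it.

```python
# Rules copied from the two "*" groups in this report, merged into one list.
RULES = [
    ("disallow", "/cms_addon"),
    ("disallow", "/cms_docs"),
    ("disallow", "/redFACT"),
    ("disallow", "/REST/frontend/itemstatistics"),
    ("disallow", "/pu_all"),
    ("allow", "/pu_all/img"),
    ("disallow", "/pu_carlow/"),
    ("allow", "/pu_carlow/img"),
    ("disallow", "/pu_kildare/"),
    ("allow", "/pu_kildare/img"),
    ("disallow", "/pu_laois/"),
    ("allow", "/pu_laois/img"),
    ("disallow", "/pu_roscommon/"),
    ("allow", "/pu_roscommon/img"),
    ("disallow", "/pu_waterford/"),
    ("allow", "/pu_waterford/img"),
    ("disallow", "/pu_western/"),
    ("allow", "/pu_western/img"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence.

    Hypothetical helper: plain, case-sensitive prefix matching only.
    """
    best_len = -1      # length of the longest matching rule path so far
    best_allow = True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        allow = kind == "allow"
        # A longer match wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            best_allow = allow
    return best_allow


print(is_allowed("/pu_waterford/img/logo.png"))   # image path under an Allow rule
print(is_allowed("/pu_waterford/article.html"))   # covered only by the Disallow rule
```

Note that Python's standard `urllib.robotparser` applies rules in file order rather than by longest match, so it may give a different answer for the `/pu_*/img` paths.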

googlebot

Rule Path
Allow /

adsbot-google

Rule Path
Allow /

googlebot-news

Rule Path
Disallow /sponsored/
Disallow /sponsored-content/
Disallow /sponsoredshowcase/
Disallow /test

ia_archiver

Rule Path
Disallow /

backlink-check.de

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

bloodhound

Rule Path
Disallow /

cydralspider

Rule Path
Disallow /

downloadexpress

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

fasterfox

Rule Path
Disallow /

gammaspider

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

meltwater

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

node/simplecrawler

Rule Path
Disallow /

node/simplecrawler 0.7.0 (git+https://github.com/cgiffard/node-simplecrawler.git)

Rule Path
Disallow /

objectssearch

Rule Path
Disallow /

openbot

Rule Path
Disallow /

pimptrain

Rule Path
Disallow /

raven

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

searchpreview

Rule Path
Disallow /

simplecrawler

Rule Path
Disallow /

seodat

Rule Path
Disallow /

seoengbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

true_robot

Rule Path
Disallow /

url control

Rule Path
Disallow /

url_spider_pro

Rule Path
Disallow /

wapspider

Rule Path
Disallow /

webzinger

Rule Path
Disallow /

xovi

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
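
A crawler first has to decide which of the groups above applies to it. The sketch below shows one simple way to do that, assuming a case-insensitive exact match on the product token with a fallback to the `*` group. The `GROUPS` dictionary holds only a small subset of the groups in this report, and `select_group` is a hypothetical helper. Real crawlers may match tokens differently.

```python
# Subset of the groups listed in this report, keyed by lowercase product token.
# The "*" entry is abbreviated here for illustration.
GROUPS = {
    "*": [("disallow", "/cms_addon"), ("disallow", "/cms_docs")],
    "googlebot": [("allow", "/")],
    "googlebot-news": [
        ("disallow", "/sponsored/"),
        ("disallow", "/sponsored-content/"),
        ("disallow", "/sponsoredshowcase/"),
        ("disallow", "/test"),
    ],
    "gptbot": [("disallow", "/")],
}


def select_group(product_token, groups=GROUPS):
    """Return (matched_token, rules) for a crawler's product token.

    Assumption: exact, case-insensitive token match, else the "*" group.
    """
    token = product_token.strip().lower()
    if token in groups:
        return token, groups[token]
    return "*", groups["*"]


for agent in ("GPTBot", "Googlebot-News", "bingbot"):
    matched, rules = select_group(agent)
    print(agent, "->", matched, rules)
```

Under this assumption, a crawler with no dedicated group (the example uses `bingbot`, which does not appear in this report) falls back to the `*` rules, while `gptbot` is excluded from the whole site.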

Other Records

Field Value
sitemap https://waterford-news.ie/sitemap-index/1274-google_channel_sitemap_wns.xml
sitemap https://waterford-news.ie/sitemap-index/1279-google_sitemap_wns.xml
sitemap https://waterford-news.ie/sitemap-index/1284-google_news_wns.xml
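
Sitemap records are listed separately because they are not tied to any user-agent group. The sketch below shows how such records might be pulled out of raw robots.txt text. The `SAMPLE` string is a reconstruction from the values in this report, not the file as served, and `extract_sitemaps` is a hypothetical helper. The one detail worth noting is splitting on the first colon only, so the `https:` in the URL stays intact.

```python
# Reconstructed fragment for illustration only; not the file as served.
SAMPLE = """\
User-agent: gptbot
Disallow: /

Sitemap: https://waterford-news.ie/sitemap-index/1274-google_channel_sitemap_wns.xml
Sitemap: https://waterford-news.ie/sitemap-index/1279-google_sitemap_wns.xml
Sitemap: https://waterford-news.ie/sitemap-index/1284-google_news_wns.xml
"""


def extract_sitemaps(text):
    """Collect the values of all sitemap records, in file order."""
    urls = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)    # split on the first colon only
        if field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


for url in extract_sitemaps(SAMPLE):
    print(url)
```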

Comments

  • global live settings :
  • customised settings :