irishtechnews.ie
robots.txt

Robots Exclusion Standard data for irishtechnews.ie

Resource Scan

Scan Details

Site Domain irishtechnews.ie
Base Domain irishtechnews.ie
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-11-10T04:47:15+00:00
Next Scan 2025-01-09T04:47:15+00:00

Last Successful Scan

Scanned 2024-09-12T03:26:37+00:00
URL https://irishtechnews.ie/robots.txt
Domain IPs 104.26.8.58, 104.26.9.58, 172.67.74.181, 2606:4700:20::681a:83a, 2606:4700:20::681a:93a, 2606:4700:20::ac43:4ab5
Response IP 172.67.74.181
Found Yes
Hash 462bba374c5987dd4313a4b1833c005db0ed03ad30bdecf39d9630928ff7ce9f
SimHash 5f08bb216d92

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

*

Rule Path
Disallow /*blackhole
Disallow /?blackhole
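
The two `*` groups above can be read as a single rule set. The following is a minimal sketch of how a crawler might evaluate them, assuming the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and do not come from the scan.

```python
import re

# Rules transcribed from the two "*" groups in the report, merged into one list.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*blackhole"),
    ("disallow", "/?blackhole"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    "*" matches any run of characters and a trailing "$" anchors the end;
    every other character (including "?") is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt rules do.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ["/wp-admin/", "/wp-admin/admin-ajax.php", "/tag/blackhole", "/news/"]:
        print(p, is_allowed(p))
```

Under this assumption `/wp-admin/admin-ajax.php` is fetchable because the Allow pattern is longer than the Disallow pattern. A parser that applies the first matching rule in file order instead may reach the opposite answer for that path.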

piplbot
femtosearchbot
mj12bot
buck
yeti
mozilla/5.0 (compatible; yeti/1.1; +http://naver.me/spd)
boardreader
blexbot
scrapy
ccbot
trendictionbot
anderspinkbot
lcc
ahrefsbot
jersey
woorankreview
primalbot
go-http-client
manzama

Rule Path
Disallow /

bl.uk_lddc_bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

semrushbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

seznambot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

ias_crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
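
The named groups above can be summarised as a lookup from crawler token to policy. This is a rough sketch only, assuming that a crawler matches its product token case-insensitively and falls back to the `*` group when no named group applies (the behaviour RFC 9309 describes). `crawl-delay` is not part of RFC 9309, and whether a given crawler honours it is not something the scan shows. The names `BLOCKED`, `CRAWL_DELAY`, and `policy_for` are hypothetical.

```python
# User-agent tokens transcribed from the report's "Disallow /" group.
BLOCKED = {
    "piplbot", "femtosearchbot", "mj12bot", "buck", "yeti",
    "mozilla/5.0 (compatible; yeti/1.1; +http://naver.me/spd)",
    "boardreader", "blexbot", "scrapy", "ccbot", "trendictionbot",
    "anderspinkbot", "lcc", "ahrefsbot", "jersey", "woorankreview",
    "primalbot", "go-http-client", "manzama",
}

# Groups that define no path rules but carry a crawl-delay record (seconds).
CRAWL_DELAY = {
    "bl.uk_lddc_bot": 20,
    "semrushbot": 20,
    "seznambot": 20,
    "ias_crawler": 20,
}


def policy_for(product_token):
    """Return the group a crawler would fall under, per the assumptions above."""
    token = product_token.lower()
    if token in BLOCKED:
        return {"group": token, "blocked": True, "crawl_delay": None}
    if token in CRAWL_DELAY:
        # A named group with no rules: all paths allowed, delay requested.
        return {"group": token, "blocked": False, "crawl_delay": CRAWL_DELAY[token]}
    # No named group matches, so the "*" groups apply.
    return {"group": "*", "blocked": False, "crawl_delay": None}


if __name__ == "__main__":
    for name in ["AhrefsBot", "SemrushBot", "Googlebot"]:
        print(name, policy_for(name))
```

One consequence of this reading is that a crawler with its own group, such as semrushbot, is governed by that group alone, so the `/wp-admin/` rules in the `*` groups would not apply to it.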

Other Records

Field Value
sitemap https://irishtechnews.ie/sitemap.xml
sitemap https://irishtechnews.ie/sitemap-news.xml

Comments

  • XML Sitemap & Google News version 5.2.3 - https://status301.net/wordpress-plugins/xml-sitemap-feed/
  • DISABLE
  • User-agent: DotBot
  • SLOW DOWN
  • British Library
  • May be AdSense related
  • Czech search engine
  • Integralads.com