irishnews.com
robots.txt

Robots Exclusion Standard data for irishnews.com

Resource Scan

Scan Details

Site Domain irishnews.com
Base Domain irishnews.com
Scan Status Ok
Last Scan 2025-04-03T08:49:57+00:00
Next Scan 2025-04-10T08:49:57+00:00

Last Scan

Scanned 2025-04-03T08:49:57+00:00
URL https://irishnews.com/robots.txt
Redirect https://www.irishnews.com:443/robots.txt
Redirect Domain www.irishnews.com
Redirect Base irishnews.com
Domain IPs 15.197.251.193, 3.33.197.46
Redirect IPs 2600:1413:b000:13::b857:c186, 2600:1413:b000:13::b857:c197, 96.17.72.18, 96.17.72.75
Response IP 42.99.140.195
Found Yes
Hash c88384fba5126f40b8a7f397e5d5be21cc9a9f699203912ac98a790243e2251c
SimHash 318c529284c2

Groups

omgili

Rule Path
Disallow /

webvac

Rule Path
Disallow /

webzip

Rule Path
Disallow /

psbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

meltwater

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-aibytespider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

news-please

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

buck

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.irishnews.com/arc/outboundfeeds/sitemap-index/?outputType=xml
sitemap https://www.irishnews.com/arc/outboundfeeds/sitemap-news-index/?outputType=xml
sitemap https://www.irishnews.com/arc/outboundfeeds/sitemap-section-index/?outputType=xml
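
The groups and sitemap records above map back onto robots.txt directives. As a rough illustration, the sketch below rewrites three of the listed groups in robots.txt syntax and checks them with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a partial reconstruction from this report, not the file as served, and the helper name `is_allowed` and the example URL path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction (assumption): three of the groups listed in this
# report, plus one of the sitemap records, written in robots.txt syntax.
ROBOTS_TXT = """\
User-agent: ccbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: perplexitybot
Disallow: /

Sitemap: https://www.irishnews.com/arc/outboundfeeds/sitemap-index/?outputType=xml
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the reconstructed rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL used only to exercise the rules.
    url = "https://www.irishnews.com/news/"
    for agent in ("ccbot", "claudebot", "Googlebot"):
        print(agent, is_allowed(agent, url))
```

Because every listed group carries the single rule `Disallow /`, any crawler named in a group is blocked from the whole site. An agent with no matching group, such as `Googlebot` in this sketch, is allowed by default, since the report shows no wildcard (`*`) group.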