Robots Exclusion Standard data for cdn-01.independent.ie

Resource Scan

Scan Details

Site Domain cdn-01.independent.ie
Base Domain independent.ie
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-10-23T17:22:33+00:00
Next Scan 2025-01-21T17:22:33+00:00

Last Successful Scan

Scanned 2023-07-02T17:08:49+00:00
URL https://cdn-01.independent.ie/robots.txt
Redirect https://www.independent.ie/robots.txt
Redirect Domain www.independent.ie
Redirect Base independent.ie
Domain IPs 13.33.33.10, 13.33.33.5, 13.33.33.50, 13.33.33.9, 2600:9000:229f:0:12:80c4:7500:93a1, 2600:9000:229f:2200:12:80c4:7500:93a1, 2600:9000:229f:2600:12:80c4:7500:93a1, 2600:9000:229f:3a00:12:80c4:7500:93a1, 2600:9000:229f:3c00:12:80c4:7500:93a1, 2600:9000:229f:400:12:80c4:7500:93a1, 2600:9000:229f:6000:12:80c4:7500:93a1, 2600:9000:229f:e600:12:80c4:7500:93a1
Redirect IPs 104.18.30.137, 104.18.31.137, 2606:4700::6812:1e89, 2606:4700::6812:1f89
Response IP 104.18.30.137
Found Yes
Hash cac435d7fcfb99fa4a02d777cd422b58b9b69c4ddb79d3bb07f46f17c1709143
SimHash 403cc6e88cf1

Groups

*

Rule Path
Disallow /search/
Disallow /qwerty/
Disallow /*.ece$
Disallow /utils/
Disallow /account/
Disallow /LoadTest/
Disallow /api/
Disallow /qa/
Disallow /ad-test
Disallow /service-archive
Disallow /subscribe-archive
Disallow /messagent/
Disallow /extra/messagent/
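
The paths in this group mix plain prefixes with one wildcard pattern (`/*.ece$`). As a minimal sketch of how such patterns are commonly interpreted, the Python below assumes the RFC 9309 conventions: a rule matches from the start of the URL path, `*` stands for any run of characters, and a trailing `$` anchors the end of the path. The names `rule_to_regex`, `is_disallowed`, and `ALL_ROBOTS_DISALLOW` are hypothetical and not part of the scan data. Because this group contains only Disallow rules, the sketch skips Allow/Disallow precedence.

```python
import re

# Disallow paths copied from the "*" group above.
ALL_ROBOTS_DISALLOW = [
    "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
    "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
    "/subscribe-archive", "/messagent/", "/extra/messagent/",
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a regex (assumed RFC 9309 semantics)."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))

def is_disallowed(url_path: str, patterns=ALL_ROBOTS_DISALLOW) -> bool:
    """Return True if any Disallow pattern matches the given URL path."""
    return any(rule_to_regex(p).match(url_path) for p in patterns)

if __name__ == "__main__":
    for path in ["/search/?q=budget", "/news/article-123.ece", "/news/"]:
        print(path, is_disallowed(path))
```

Under these assumptions, `/ad-test` behaves as a prefix and would also cover a path such as `/ad-test-page`, while `/api/` would not cover the bare path `/api`.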

googlebot-news

Rule Path
Disallow /storyplus/*
Disallow /sponsored-features/*

mediapartners-google

Rule Path
Disallow
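
The three groups above apply to different crawlers. As a rough sketch, assuming the RFC 9309 convention that a crawler obeys only the group whose user-agent token matches its own (compared case-insensitively) and falls back to `*` otherwise, group selection could look like the Python below. The names `GROUPS`, `select_rules`, and `restricted_paths` are hypothetical, and exact token equality is a simplification of real crawler matching. An empty Disallow value, as in the mediapartners-google group, is treated as "nothing is disallowed".

```python
# Disallow paths per user-agent group, copied from the tables above.
GROUPS = {
    "*": [
        "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
        "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
        "/subscribe-archive", "/messagent/", "/extra/messagent/",
    ],
    "googlebot-news": ["/storyplus/*", "/sponsored-features/*"],
    "mediapartners-google": [""],  # empty Disallow value
}

def select_rules(product_token: str) -> list[str]:
    """Pick the group for a crawler token, falling back to the "*" group."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])

def restricted_paths(product_token: str) -> list[str]:
    """Drop empty Disallow values, which restrict nothing."""
    return [path for path in select_rules(product_token) if path]

if __name__ == "__main__":
    for token in ["Googlebot-News", "Mediapartners-Google", "ExampleBot"]:
        print(token, restricted_paths(token))
```

Under this reading, a crawler identifying as googlebot-news would follow only its own two rules and not the `*` list, and mediapartners-google would be unrestricted, which is consistent with the "Allow Adsense" comment in the file.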

Other Records

Field Value
sitemap https://www.independent.ie/sitemap/sitemap_googlenews.xml
sitemap https://www.independent.ie/sitemap/sitemap_channels.xml
sitemap https://www.independent.ie/sitemap/sitemap.xml
sitemap https://www.independent.ie/sitemap/sitemap_video.xml

Comments

  • All Robots
  • Disallow unwanted URL patterns to be crawled and indexed
  • Disallow Sponsored Articles for Google News
  • Sitemap Files
  • Allow Adsense