cdn3.independent.ie
robots.txt

Robots Exclusion Standard data for cdn3.independent.ie

Resource Scan

Scan Details

Site Domain cdn3.independent.ie
Base Domain independent.ie
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-10-23T17:22:03+00:00
Next Scan 2025-01-21T17:22:03+00:00
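
The gap between the two timestamps above can be checked with a little date arithmetic. The sketch below only parses the two values shown in this record. Any idea that the scanner retries on a fixed interval is an assumption drawn from this single pair of dates, not something the record states.

```python
from datetime import datetime

# Timestamps copied from the Scan Details above (ISO 8601, UTC offset included).
last_scan = datetime.fromisoformat("2024-10-23T17:22:03+00:00")
next_scan = datetime.fromisoformat("2025-01-21T17:22:03+00:00")

# Difference between the scheduled next scan and the last attempted scan.
interval = next_scan - last_scan
print(interval.days)  # 90
```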

Last Successful Scan

Scanned 2023-07-02T17:09:02+00:00
URL https://cdn3.independent.ie/robots.txt
Redirect https://www.independent.ie/robots.txt
Redirect Domain www.independent.ie
Redirect Base independent.ie
Domain IPs 13.33.33.10, 13.33.33.5, 13.33.33.50, 13.33.33.9, 2600:9000:229f:1400:12:80c4:7500:93a1, 2600:9000:229f:1600:12:80c4:7500:93a1, 2600:9000:229f:7000:12:80c4:7500:93a1, 2600:9000:229f:800:12:80c4:7500:93a1, 2600:9000:229f:a000:12:80c4:7500:93a1, 2600:9000:229f:d000:12:80c4:7500:93a1, 2600:9000:229f:f200:12:80c4:7500:93a1, 2600:9000:229f:fc00:12:80c4:7500:93a1
Redirect IPs 104.18.30.137, 104.18.31.137, 2606:4700::6812:1e89, 2606:4700::6812:1f89
Response IP 104.18.30.137
Found Yes
Hash cac435d7fcfb99fa4a02d777cd422b58b9b69c4ddb79d3bb07f46f17c1709143
SimHash 403cc6e88cf1
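
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file. The record does not name the algorithm or say which bytes were hashed, so the following is a minimal sketch under that assumption. The function name `robots_fingerprint` is hypothetical. The SimHash value uses an unspecified algorithm and is not reproduced here.

```python
import hashlib

def robots_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the report's "Hash" field is SHA-256 over the raw
    response bytes; the source does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()

# Example with a small hypothetical body (not the real file contents).
sample = b"User-agent: *\nDisallow: /search/\n"
digest = robots_fingerprint(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing a freshly computed digest with the stored one would show whether the file changed between scans, provided the assumption about the algorithm holds.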

Groups

*

Rule Path
Disallow /search/
Disallow /qwerty/
Disallow /*.ece$
Disallow /utils/
Disallow /account/
Disallow /LoadTest/
Disallow /api/
Disallow /qa/
Disallow /ad-test
Disallow /service-archive
Disallow /subscribe-archive
Disallow /messagent/
Disallow /extra/messagent/

googlebot-news

Rule Path
Disallow /storyplus/*
Disallow /sponsored-features/*

mediapartners-google

Rule Path
Disallow (empty)
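
The rules above mix plain path prefixes with the `*` wildcard and the `$` end anchor, as in `/*.ece$`. The sketch below shows one way to test a URL path against the `*` group, following the wildcard conventions described in RFC 9309. The helper names `rule_to_regex` and `is_disallowed` are hypothetical. The sketch handles Disallow rules only and skips Allow rules and longest-match precedence, which is enough here because the groups listed contain no Allow rules. A rule with no path, as in the mediapartners-google group, is treated as disallowing nothing.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = path_pattern.endswith("$")
    if anchored:
        path_pattern = path_pattern[:-1]
    body = ".*".join(re.escape(part) for part in path_pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# Disallow rules copied from the "*" group above.
DISALLOW_ALL_AGENTS = [
    "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/", "/LoadTest/",
    "/api/", "/qa/", "/ad-test", "/service-archive", "/subscribe-archive",
    "/messagent/", "/extra/messagent/",
]

def is_disallowed(url_path: str, patterns=DISALLOW_ALL_AGENTS) -> bool:
    """True if any non-empty Disallow pattern matches from the start of the path."""
    return any(p and rule_to_regex(p).match(url_path) for p in patterns)

print(is_disallowed("/search/results"))      # True: prefix match on /search/
print(is_disallowed("/news/story.ece"))      # True: matches /*.ece$
print(is_disallowed("/news/story.ece?x=1"))  # False: '$' requires .ece at the end
print(is_disallowed("/irish-news/"))         # False: no rule matches
```

Passing `[""]` as the pattern list models the empty Disallow rule: `is_disallowed("/anything", [""])` returns `False`.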

Other Records

Field Value
sitemap https://www.independent.ie/sitemap/sitemap_googlenews.xml
sitemap https://www.independent.ie/sitemap/sitemap_channels.xml
sitemap https://www.independent.ie/sitemap/sitemap.xml
sitemap https://www.independent.ie/sitemap/sitemap_video.xml

Comments

  • All Robots
  • Disallow unwanted URL patterns to be crawled and indexed
  • Disallow Sponsored Articles for Google News
  • Sitemap Files
  • Allow Adsense