cdn-01.belfasttelegraph.co.uk
robots.txt

Robots Exclusion Standard data for cdn-01.belfasttelegraph.co.uk

Resource Scan

Scan Details

Site Domain cdn-01.belfasttelegraph.co.uk
Base Domain belfasttelegraph.co.uk
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-05-03T19:04:49+00:00
Next Scan 2024-08-01T19:04:49+00:00

Last Successful Scan

Scanned 2023-07-09T18:58:32+00:00
URL https://cdn-01.belfasttelegraph.co.uk/robots.txt
Redirect https://www.belfasttelegraph.co.uk/robots.txt
Redirect Domain www.belfasttelegraph.co.uk
Redirect Base belfasttelegraph.co.uk
Domain IPs 2600:9000:2003:2600:17:c440:2b80:93a1, 2600:9000:2003:8400:17:c440:2b80:93a1, 2600:9000:2003:a600:17:c440:2b80:93a1, 2600:9000:2003:b200:17:c440:2b80:93a1, 2600:9000:2003:b600:17:c440:2b80:93a1, 2600:9000:2003:d200:17:c440:2b80:93a1, 2600:9000:2003:da00:17:c440:2b80:93a1, 2600:9000:2003:f000:17:c440:2b80:93a1, 54.192.150.113, 54.192.150.115, 54.192.150.19, 54.192.150.90
Redirect IPs 104.18.4.239, 104.18.5.239, 2606:4700::6812:4ef, 2606:4700::6812:5ef
Response IP 104.18.5.239
Found Yes
Hash 05d930b28a8d798dc12fdc82fa727f81117294841b9aac0f0e7b221ae4b9cc2b
SimHash 012cc2d40cf0

Groups

*

Rule Path
Disallow /search/
Disallow /qwerty/
Disallow /*.ece$
Disallow /utils/
Disallow /account/
Disallow /LoadTest/
Disallow /api/
Disallow /qa/
Disallow /ad-test
Disallow /service-archive
Disallow /subscribe-archive
Disallow /messagent/
Disallow /extra/messagent/
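
The rules in the `*` group mix plain path prefixes with one wildcard pattern (`/*.ece$`). The following is a minimal sketch of how a crawler might test a URL path against them, assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and the sketch ignores Allow rules and longest-match precedence because this group contains only Disallow lines.

```python
import re

# Disallow paths copied from the "*" group in this report.
DISALLOW_ALL_ROBOTS = [
    "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
    "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
    "/subscribe-archive", "/messagent/", "/extra/messagent/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW_ALL_ROBOTS) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/search/?q=news", "/news/story.ece", "/news/story.ece?page=2", "/news/"]:
        print(path, is_disallowed(path))
```

Matching is case-sensitive in this sketch, so `/loadtest/` would not be caught by the `/LoadTest/` rule.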

googlebot-news

Rule Path
Disallow /service/ad-features/*

mediapartners-google

Rule Path
Disallow
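
The three groups above apply to different crawlers, and the empty Disallow in the `mediapartners-google` group blocks nothing. The sketch below illustrates one way group selection could work, assuming a simplified case-insensitive substring match on the user-agent token (real crawlers may match product tokens more strictly). The names `GROUPS`, `select_group`, and `blocks_anything` are hypothetical.

```python
# Groups as listed in this report: user-agent token -> Disallow paths.
GROUPS = {
    "*": [
        "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
        "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
        "/subscribe-archive", "/messagent/", "/extra/messagent/",
    ],
    "googlebot-news": ["/service/ad-features/*"],
    "mediapartners-google": [""],  # empty Disallow value
}


def select_group(user_agent: str, groups=GROUPS):
    """Pick the most specific named group for a crawler, else fall back to "*"."""
    ua = user_agent.lower()
    named = [name for name in groups if name != "*" and name in ua]
    if named:
        # Prefer the longest matching token as the most specific one.
        return groups[max(named, key=len)]
    return groups.get("*", [])


def blocks_anything(disallow_paths) -> bool:
    """An empty Disallow value matches no path, so it blocks nothing."""
    return any(path != "" for path in disallow_paths)


if __name__ == "__main__":
    for ua in ["Mediapartners-Google", "Googlebot-News", "ExampleBot/1.0"]:
        rules = select_group(ua)
        print(ua, len(rules), blocks_anything(rules))
```

Under this reading, a crawler that matches a named group follows only that group's rules, so `googlebot-news` would be bound by the `/service/ad-features/*` rule and not by the `*` group's list.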

Other Records

Field Value
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_googlenews.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_channels.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_video.xml
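
The sitemap records above can be read back out of a robots.txt body with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). This is a minimal sketch: `ROBOTS_LINES` is an assumed, abbreviated reconstruction of the file from the values in this report, not the file's actual text.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction: one rule for context plus the four sitemap records.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /search/",
    "Sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap_googlenews.xml",
    "Sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap_channels.xml",
    "Sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap.xml",
    "Sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap_video.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```

Sitemap records sit outside any user-agent group, which is why the report lists them separately under Other Records.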

Comments

  • All Robots
  • Disallow Internal Search
  • Disallow Qwerty and Rogue Qwerty Articles
  • Disallow Test Subfolders and Draft Articles
  • Disallow Sponsored Articles for Google News
  • Sitemap Files
  • Allow Adsense