belfasttelegraph.co.uk
robots.txt

Robots Exclusion Standard data for belfasttelegraph.co.uk

Resource Scan

Scan Details

Site Domain belfasttelegraph.co.uk
Base Domain belfasttelegraph.co.uk
Scan Status Ok
Last Scan 2024-10-04T05:48:29+00:00
Next Scan 2024-10-11T05:48:29+00:00

Last Scan

Scanned 2024-10-04T05:48:29+00:00
URL https://belfasttelegraph.co.uk/robots.txt
Redirect https://www.belfasttelegraph.co.uk/robots.txt
Redirect Domain www.belfasttelegraph.co.uk
Redirect Base belfasttelegraph.co.uk
Domain IPs 104.18.35.240, 172.64.152.16, 2606:4700:4400::6812:23f0, 2606:4700:4400::ac40:9810
Redirect IPs 104.18.35.240, 172.64.152.16, 2606:4700:4400::6812:23f0, 2606:4700:4400::ac40:9810
Response IP 104.18.35.240
Found Yes
Hash 7fb499388b589f7aeaf3e7d8f3b914617a85b2d742a53f0d61d2a4f68e3ea1b3
SimHash 683897518c75

Groups

*

Rule Path
Disallow /search/
Disallow /qwerty/
Disallow /*.ece$
Disallow /utils/
Disallow /account/
Disallow /LoadTest/
Disallow /api/
Disallow /qa/
Disallow /ad-test
Disallow /service-archive
Disallow /subscribe-archive
Disallow /messagent/
Disallow /extra/messagent/
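
The `*` group above mixes plain prefixes (`/search/`, `/ad-test`) with one wildcard rule (`/*.ece$`). As a rough sketch of how such patterns are usually interpreted (prefix match, `*` for any run of characters, a trailing `$` anchoring the end of the path), the hypothetical helper `rule_matches` below translates a rule path into a regular expression. The helper name and the test paths are illustrative assumptions, not part of the scan, and real crawlers may differ in details such as percent-encoding.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt rule path matches a URL path.

    Hypothetical helper: prefix match, '*' as a wildcard, and a
    trailing '$' as an end-of-path anchor.
    """
    if not pattern:
        # An empty rule path matches nothing.
        return False
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives prefix semantics.
    return re.match(regex, path) is not None


# Illustrative paths (not taken from the site):
print(rule_matches("/*.ece$", "/news/article123.ece"))      # True
print(rule_matches("/*.ece$", "/news/article123.ece?x=1"))  # False
print(rule_matches("/ad-test", "/ad-test-page"))            # True
```

Because unanchored rules are prefixes, `/ad-test` also covers paths such as `/ad-test-page`, while `/search/` with its trailing slash only covers paths under that directory.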

googlebot-news

Rule Path
Disallow /service/ad-features/*

mediapartners-google

Rule Path
Disallow (empty value, nothing is disallowed)

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /
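
To see how the groups above combine, the following minimal sketch feeds a small subset of the listed rules to Python's standard `urllib.robotparser` and asks whether a given user agent may fetch a path. The robots.txt text is reconstructed from the listing (the original file's exact casing and ordering are assumptions), `is_allowed` and `ExampleBot` are hypothetical names, and the `/*.ece$` rule is left out because the standard-library parser does not interpret wildcards.

```python
from urllib.robotparser import RobotFileParser

# Subset of the rules listed above, reconstructed for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /account/
Disallow: /api/

User-agent: Mediapartners-Google
Disallow:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

BASE = "https://www.belfasttelegraph.co.uk"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical wrapper: may `agent` fetch `path` under these rules?"""
    return parser.can_fetch(agent, BASE + path)


for agent, path in [
    ("GPTBot", "/news/"),
    ("Mediapartners-Google", "/search/"),
    ("ExampleBot", "/search/results"),
    ("ExampleBot", "/news/"),
]:
    print(agent, path, is_allowed(agent, path))
```

An agent with its own group (such as `GPTBot`) is governed only by that group, while agents without one fall back to the `*` group. The empty `Disallow` for `mediapartners-google` therefore leaves that crawler unrestricted, even for paths the `*` group blocks.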

Other Records

Field Value
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_googlenews.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_channels.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap.xml
sitemap https://www.belfasttelegraph.co.uk/sitemap/sitemap_video.xml

Comments

  • All copyrights, neighbouring rights and database rights in the content and layout of this website/app are explicitly reserved and are for personal, non-commercial use only.
  • In accordance with Article 4 of the Directive on Copyright in the Digital Single Market (CDSM) and its transposition into the law of the applicable Member State, all content of this website on which it is made available is not to be used for the purposes of text and data mining, extraction, scraping and/or the use of programs or robots for automatic data collection and/or extraction of digital data, whether for machine learning or artificial intelligence purposes or otherwise.
  • See also the Terms and Conditions of this website.
  • All Robots
  • Disallow Internal Search
  • Disallow Qwerty and Rogue Qwerty Articles
  • Disallow Test Subfolders and Draft Articles
  • Disallow Sponsored Articles for Google News
  • Sitemap Files
  • Allow Adsense
  • Rules for robots