belfasttelegraph.co.uk
robots.txt

Robots Exclusion Standard data for belfasttelegraph.co.uk

Resource Scan

Scan Details

Site Domain	belfasttelegraph.co.uk
Base Domain	belfasttelegraph.co.uk
Scan Status	Ok
Last Scan	2024-10-04T05:48:29+00:00
Next Scan	2024-10-11T05:48:29+00:00
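The gap between the Last Scan and Next Scan timestamps can be checked directly. This is a small sketch using only the two values shown above. That the scanner always rescans on this interval is an assumption, not something the report states.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-10-04T05:48:29+00:00")
next_scan = datetime.fromisoformat("2024-10-11T05:48:29+00:00")

# Subtracting two timezone-aware datetimes yields a timedelta.
interval = next_scan - last_scan
print(interval.days)
```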

Last Scan

Scanned	2024-10-04T05:48:29+00:00
URL	https://belfasttelegraph.co.uk/robots.txt
Redirect	https://www.belfasttelegraph.co.uk/robots.txt
Redirect Domain	www.belfasttelegraph.co.uk
Redirect Base	belfasttelegraph.co.uk
Domain IPs	104.18.35.240, 172.64.152.16, 2606:4700:4400::6812:23f0, 2606:4700:4400::ac40:9810
Redirect IPs	104.18.35.240, 172.64.152.16, 2606:4700:4400::6812:23f0, 2606:4700:4400::ac40:9810
Response IP	104.18.35.240
Found	Yes
Hash	7fb499388b589f7aeaf3e7d8f3b914617a85b2d742a53f0d61d2a4f68e3ea1b3
SimHash	683897518c75

Groups

*

Rule	Path
Disallow	/search/
Disallow	/qwerty/
Disallow	/*.ece$
Disallow	/utils/
Disallow	/account/
Disallow	/LoadTest/
Disallow	/api/
Disallow	/qa/
Disallow	/ad-test
Disallow	/service-archive
Disallow	/subscribe-archive
Disallow	/messagent/
Disallow	/extra/messagent/
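
The * group mixes plain prefix rules such as /search/ with one wildcard rule, /*.ece$. As a minimal sketch of how such paths are commonly matched, following the * (any sequence of characters) and $ (end of URL) conventions described in RFC 9309, the Python below turns each rule into a regular expression. The names DISALLOW_RULES, rule_to_regex and is_disallowed are illustrative and are not part of the scan. Individual crawlers may implement matching differently.

```python
import re

# Disallow paths copied from the "*" group above.
DISALLOW_RULES = [
    "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
    "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
    "/subscribe-archive", "/messagent/", "/extra/messagent/",
]

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the
    rule to the end of the path. Everything else is literal.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_disallowed(path, rules=DISALLOW_RULES):
    """Return True if any rule matches the path from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

print(is_disallowed("/news/story.ece"))         # ends in .ece
print(is_disallowed("/news/story.ece?page=2"))  # does not end in .ece
print(is_disallowed("/ad-test-page"))           # prefix match on /ad-test
```

Because rules without a trailing $ are prefix matches, /ad-test also covers paths such as /ad-test-page, while /search/ does not cover /search without the trailing slash.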

googlebot-news

Rule	Path
Disallow	/service/ad-features/*

mediapartners-google

Rule	Path
Disallow
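
A Disallow rule with an empty path blocks nothing, so this group lets mediapartners-google crawl everything, including paths that the * group disallows. This is consistent with the "Allow Adsense" comment in the file. The sketch below checks that behaviour with Python's standard urllib.robotparser. The SAMPLE text is a shortened reconstruction for illustration, not the full file, and "SomeOtherBot" is a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# Shortened, reconstructed excerpt (assumption: not the complete file).
SAMPLE = """\
User-agent: *
Disallow: /search/

User-agent: Mediapartners-Google
Disallow:

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# The empty Disallow overrides the "*" group for this agent.
adsense_ok = parser.can_fetch("Mediapartners-Google", "/search/results")
# GPTBot has its own group that blocks the whole site.
gptbot_ok = parser.can_fetch("GPTBot", "/news/")
# An unlisted agent falls back to the "*" group.
other_ok = parser.can_fetch("SomeOtherBot", "/search/results")

print(adsense_ok, gptbot_ok, other_ok)
```

Note that urllib.robotparser does plain prefix matching and does not interpret the * and $ wildcards, so it is not suitable for checking the /*.ece$ rule.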

amazonbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://www.belfasttelegraph.co.uk/sitemap/sitemap_googlenews.xml
sitemap	https://www.belfasttelegraph.co.uk/sitemap/sitemap_channels.xml
sitemap	https://www.belfasttelegraph.co.uk/sitemap/sitemap.xml
sitemap	https://www.belfasttelegraph.co.uk/sitemap/sitemap_video.xml
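
Sitemap records are independent of the user-agent groups and can appear anywhere in the file. As a rough sketch of how a scanner might collect them, the function below pulls every sitemap line out of raw robots.txt text. The function name extract_sitemaps and the SAMPLE input are illustrative, and the way the scanning service itself parses the file is not known.

```python
def extract_sitemaps(robots_txt):
    """Return the URLs of all "Sitemap:" records in robots.txt text."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "field: value" on the first colon
        # so the colon inside "https://" stays part of the value.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

# Illustrative input built from two of the records listed above.
SAMPLE = """\
# Sitemap Files
Sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap.xml
sitemap: https://www.belfasttelegraph.co.uk/sitemap/sitemap_video.xml
User-agent: *
Disallow: /search/
"""

for url in extract_sitemaps(SAMPLE):
    print(url)
```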

Comments

All copyrights, neighbouring rights and database rights in the content and layout of this website/app are explicitly reserved and are for personal, non-commercial use only.
In accordance with Article 4 of the Directive on Copyright in the Digital Single Market (CDSM) and its transposition into the law of the applicable Member State,
all content of this website on which it is made available is not to be used for the purposes of text and data mining, extraction, scraping and/or the use of programs or robots
for automatic data collection and/or extraction of digital data, whether for machine learning or artificial intelligence purposes or otherwise.
See also the Terms and Conditions of this website.
All Robots
Disallow Internal Search
Disallow Qwerty and Rogue Qwerty Articles
Disallow Test Subfolders and Draft Articles
Disallow Sponsored Articles for Google News
Sitemap Files
Allow Adsense
Rules for robots
