independent.ie
robots.txt

Robots Exclusion Standard data for independent.ie

Resource Scan

Scan Details

Site Domain	independent.ie
Base Domain	independent.ie
Scan Status	Ok
Last Scan	2024-11-15T23:38:04+00:00
Next Scan	2024-11-22T23:38:04+00:00

Last Scan

Scanned	2024-11-15T23:38:04+00:00
URL	https://independent.ie/robots.txt
Redirect	https://www.independent.ie/robots.txt
Redirect Domain	www.independent.ie
Redirect Base	independent.ie
Domain IPs	104.18.30.138, 104.18.31.138, 2606:4700::6812:1e8a, 2606:4700::6812:1f8a
Redirect IPs	104.18.30.138, 104.18.31.138, 2606:4700::6812:1e8a, 2606:4700::6812:1f8a
Response IP	104.18.31.138
Found	Yes
Hash	5f3b30ad52f2aa7808a05dd83dbf0d60e25f210e48f2e92426edfc2cb96014f2
SimHash	683897718c75
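
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest of the fetched file, although the report does not state the algorithm. As a rough sketch under that assumption, change detection between two scans could look like the following. The function names `content_digest` and `has_changed` are hypothetical and are not part of the scanner.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the report's Hash field is a SHA-256 digest. The report
    itself does not name the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a newly fetched body against the hash stored from the last scan."""
    return content_digest(body) != previous_hash.lower()
```

A digest like this only detects that the bytes differ. A similarity fingerprint such as the SimHash field is typically used to judge how much two versions differ.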

Groups

*

Rule	Path
Disallow	/search/
Disallow	/qwerty/
Disallow	/*.ece$
Disallow	/utils/
Disallow	/account/
Disallow	/LoadTest/
Disallow	/api/
Disallow	/qa/
Disallow	/ad-test
Disallow	/service-archive
Disallow	/subscribe-archive
Disallow	/messagent/
Disallow	/extra/messagent/
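
Most of the paths in this group are plain prefixes, but `/*.ece$` uses the `*` wildcard and the `$` end anchor. A minimal sketch of how such patterns are commonly interpreted (as in RFC 9309) is shown below. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, the example request paths are invented for illustration, and the sketch ignores Allow rules and longest-match precedence because this group contains only Disallow rules.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/search/", "/qwerty/", "/*.ece$", "/utils/", "/account/",
    "/LoadTest/", "/api/", "/qa/", "/ad-test", "/service-archive",
    "/subscribe-archive", "/messagent/", "/extra/messagent/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters, a trailing '$' anchors the end of
    the path, and everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern in the group matches the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


# Hypothetical request paths, for illustration only.
for path in ["/search/?q=budget", "/news/article.ece", "/news/article.ece?page=2",
             "/ad-test-banner", "/loadtest/", "/news/"]:
    print(path, "blocked" if is_disallowed(path) else "allowed")
```

Matching is case-sensitive, so `/LoadTest/` does not cover `/loadtest/`, and `/ad-test` without a trailing slash also covers longer paths that start with the same characters.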

googlebot-news

Rule	Path
Disallow	/storyplus/*
Disallow	/sponsored-features/*

mediapartners-google

Rule	Path
Disallow

amazonbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/
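
The groups above fall into three cases: a default `*` group with path rules, a `mediapartners-google` group whose empty Disallow value blocks nothing, and a set of named crawlers that are blocked from the whole site with `Disallow: /`. The sketch below checks those cases with Python's standard `urllib.robotparser`. The `ROBOTS_EXCERPT` text is a shortened reconstruction from the tables, not the verbatim file, and `examplebot` is a made-up user agent standing in for any crawler without its own group.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction of three groups from the tables above
# (assumption: not the verbatim robots.txt served by the site).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /search/
Disallow: /api/

User-agent: mediapartners-google
Disallow:

User-agent: gptbot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)
BASE = "https://www.independent.ie"

# A crawler with a "Disallow: /" group is blocked everywhere.
print(parser.can_fetch("gptbot", BASE + "/news/"))
# An empty Disallow value allows everything for that crawler.
print(parser.can_fetch("mediapartners-google", BASE + "/search/"))
# "examplebot" (hypothetical) has no group of its own, so "*" applies.
print(parser.can_fetch("examplebot", BASE + "/search/"))
print(parser.can_fetch("examplebot", BASE + "/news/"))
```

A crawler that matches a named group uses only that group's rules and does not inherit the `*` rules. Note that `urllib.robotparser` does plain prefix matching, so it is not a suitable check for wildcard patterns such as `/*.ece$` or `/storyplus/*`.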

Other Records

Field	Value
sitemap	https://www.independent.ie/sitemap/sitemap_googlenews.xml
sitemap	https://www.independent.ie/sitemap/sitemap_channels.xml
sitemap	https://www.independent.ie/sitemap/sitemap.xml
sitemap	https://www.independent.ie/sitemap/sitemap_video.xml
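
Sitemap records apply to the whole file and are independent of any user-agent group. As a small sketch, the four records above can be read back with `RobotFileParser.site_maps()`, which is available in Python 3.8 and later. The `SITEMAP_LINES` list is written out by hand from the table, on the assumption that the original file uses the usual `Sitemap: <url>` form.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records rewritten from the table above
# (assumption: the file uses the standard "Sitemap: <url>" syntax).
SITEMAP_LINES = [
    "Sitemap: https://www.independent.ie/sitemap/sitemap_googlenews.xml",
    "Sitemap: https://www.independent.ie/sitemap/sitemap_channels.xml",
    "Sitemap: https://www.independent.ie/sitemap/sitemap.xml",
    "Sitemap: https://www.independent.ie/sitemap/sitemap_video.xml",
]

parser = RobotFileParser()
parser.parse(SITEMAP_LINES)

# site_maps() returns None when no Sitemap records were found.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```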

Comments

All copyrights, neighbouring rights and database rights in the content and layout of this website/app are explicitly reserved and are for personal, non-commercial use only.
In accordance with Article 4 of the Directive on Copyright in the Digital Single Market (CDSM) and its transposition into the law of the applicable Member State,
all content of this website on which it is made available is not to be used for the purposes of text and data mining, extraction, scraping and/or the use of programs or robots
for automatic data collection and/or extraction of digital data, whether for machine learning or artificial intelligence purposes or otherwise.
See also the Terms and Conditions of this website.
All Robots
Disallow unwanted URL patterns to be crawled and indexed
Disallow Sponsored Articles for Google News
Sitemap Files
Allow Adsense
Rules for robots
