irish-times.com
robots.txt

Robots Exclusion Standard data for irish-times.com

Resource Scan

Scan Details

Site Domain irish-times.com
Base Domain irish-times.com
Scan Status Ok
Last Scan 2024-05-24T04:00:58+00:00
Next Scan 2024-05-31T04:00:58+00:00

Last Scan

Scanned 2024-05-24T04:00:58+00:00
URL http://www.irish-times.com/robots.txt
Redirect https://www.irishtimes.com/robots.txt
Redirect Domain www.irishtimes.com
Redirect Base irishtimes.com
Domain IPs 151.101.130.174, 151.101.194.174, 151.101.2.174, 151.101.66.174
Redirect IPs 125.56.219.24, 2600:1417:3f::173b:5088, 2600:1417:3f::173b:50a1, 96.17.72.17
Response IP 23.202.33.154
Found Yes
Hash d4dee903212563ae1f8f8d3ddcb44cd711c3aaaf9d03f770513198a68dff4cbf
SimHash 7b60cb48a4f1
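
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. The following is a minimal sketch, assuming SHA-256 over the raw response body, of how a scanner might fingerprint a robots.txt file and detect changes between scans. The function names `fingerprint` and `has_changed` are hypothetical and not taken from the scanning tool.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt body.

    Assumption: SHA-256 is used; the report does not state the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Report whether the body differs from the previously stored digest."""
    return fingerprint(body) != previous_digest


# Illustrative bodies only, not the actual file contents.
old_body = b"User-agent: *\nDisallow: /search/\n"
new_body = b"User-agent: *\nDisallow: /search/\nDisallow: /status/\n"

stored = fingerprint(old_body)
print(len(stored))                    # digest length in hex characters
print(has_changed(stored, old_body))  # same content
print(has_changed(stored, new_body))  # content with one extra rule
```

An exact digest such as this changes on any byte-level difference, which is presumably why the report also lists a SimHash for approximate comparison.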

Groups

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

googlebot-news

Rule Path
Disallow /sponsored/

*

Rule Path
Disallow /blogimageupload/
Disallow /captcha/
Disallow /content/
Disallow /cmlink/
Disallow /cm/
Disallow /error/
Disallow /errorpages/
Disallow /logger/
Disallow /mailapi/
Disallow /membership/
Disallow /mobile/
Disallow /newspaper/archive/
Disallow /photosales/index.cfm?fuseaction=
Disallow /poll/
Disallow /polopolydevelopment/
Disallow /redirect/
Disallow /rta-logging/reader-history.php
Disallow /search/
Disallow /search-results/
Disallow /search/archive.html
Disallow /search/search-7.4195619
Disallow /search/search-7.1213540
Disallow /search/search-7.2285082
Disallow /status/
Disallow /zephr/feature-decisions
Disallow /zephr/features
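
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`, assuming a robots.txt reconstructed from a subset of the rules listed in this report (the real file may differ in order, casing, and comments). The helper names `build_parser` and `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Subset of the groups listed in the report, rewritten in robots.txt syntax.
# This is a reconstruction for illustration, not the file as served.
ROBOTS_SUBSET = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Googlebot-News
Disallow: /sponsored/

User-agent: *
Disallow: /search/
Disallow: /membership/
Disallow: /newspaper/archive/
"""

BASE_URL = "https://www.irishtimes.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Check whether the given user agent may fetch the given path."""
    return parser.can_fetch(agent, BASE_URL + path)


parser = build_parser(ROBOTS_SUBSET)

for agent, path in [
    ("GPTBot", "/news/"),
    ("ChatGPT-User", "/news/"),
    ("Googlebot-News", "/sponsored/example"),
    ("Googlebot-News", "/search/"),
    ("ExampleBot", "/search/"),
    ("ExampleBot", "/news/"),
]:
    print(agent, path, is_allowed(parser, agent, path))
```

In this sketch the two OpenAI agents are refused everywhere, while a crawler with no group of its own falls back to the `*` group. A crawler that matches a specific group, such as Googlebot-News, is evaluated against that group only, so the `*` rules are not applied to it.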

Other Records

Field Value
sitemap https://www.irishtimes.com/arc/outboundfeeds/sitemap/
sitemap https://www.irishtimes.com/arc/outboundfeeds/sitemap-days/
sitemap https://www.irishtimes.com/arc/outboundfeeds/sitemap-news-index/
sitemap https://www.irishtimes.com/arc/outboundfeeds/sitemap-section-index/
sitemap https://www.irishtimes.com/arc/outboundfeeds/sitemap-section/
sitemap https://www.irishtimes.com/arc/outboundfeeds/video-sitemap/
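
The sitemap records can be extracted with the same standard-library parser. The following is a minimal sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available) and a robots.txt fragment reconstructed from the records listed above. The variable names are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the report, rewritten in robots.txt syntax.
# This is a reconstruction for illustration, not the file as served.
SITEMAP_LINES = """\
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/sitemap/
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/sitemap-days/
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/sitemap-news-index/
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/sitemap-section-index/
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/sitemap-section/
Sitemap: https://www.irishtimes.com/arc/outboundfeeds/video-sitemap/
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES.splitlines())

# site_maps() returns a list of URLs, or None when no sitemap record exists.
sitemaps = sitemap_parser.site_maps() or []

for url in sitemaps:
    print(url)
```

Sitemap records are independent of the user-agent groups, so they apply to every crawler regardless of which group it matches.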

Comments

  • Block ChatGPT crawler
  • Block ChatGPT crawler