the-sun.co.uk
robots.txt

Robots Exclusion Standard data for the-sun.co.uk

Resource Scan

Scan Details

Site Domain the-sun.co.uk
Base Domain the-sun.co.uk
Scan Status Ok
Last Scan 2024-05-07T17:38:14+00:00
Next Scan 2024-06-06T17:38:14+00:00

Last Scan

Scanned 2024-05-07T17:38:14+00:00
URL http://the-sun.co.uk/robots.txt
Redirect https://www.thesun.co.uk/robots.txt
Redirect Domain www.thesun.co.uk
Redirect Base thesun.co.uk
Domain IPs 34.240.28.43, 52.208.17.106, 54.76.240.177
Redirect IPs 13.35.18.40, 13.35.18.62, 13.35.18.73, 13.35.18.87
Response IP 13.35.18.40
Found Yes
Hash 52fff314bc109dd235c6c0c5856b4d31885617ae2e914794fe26033f51a296c3
SimHash 3decd5cac192
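
The report does not say which algorithm produced the Hash value. Its 64 hexadecimal digits are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. The function names body_digest and has_changed are hypothetical and only illustrate how a digest can be used to detect a changed robots.txt between scans.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (algorithm is an assumption)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_digest: str) -> bool:
    """Compare a freshly fetched body against the digest stored by an earlier scan."""
    return body_digest(body) != previous_digest.lower()
```

A digest like this only answers "identical or not". A similarity hash such as the SimHash field is what would be needed to judge how much two versions differ.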

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /*horseracing/racecards/2019*
Disallow /*horseracing/racecards/2020*
Disallow /*horseracing/racecards/2021*
Disallow /*horseracing/racecards/2022*
Disallow /*horseracing/results/2019*
Disallow /*horseracing/results/2020*
Disallow /*horseracing/results/2021*
Disallow /*horseracing/results/2022*
Disallow /search/
Disallow /simwidgets/
Disallow /*?s=*
Disallow *%26s%3D*
Disallow /?p=*
Disallow /app/
Disallow /sso/login/
Disallow /wp-login.php
Disallow /amp-tealium/
Disallow /archives/
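
The rules above mix Allow and Disallow lines and use * wildcards, so the outcome for a given path depends on which pattern matches most specifically. The following is a minimal sketch of the longest-match evaluation described in RFC 9309, applied to a subset of the rules in this group. The names RULES, pattern_to_regex and is_allowed are hypothetical, and real crawlers may differ in details such as percent-encoding normalisation.

```python
import re

# Subset of the rules listed for the "*" group above.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*horseracing/racecards/2019*"),
    ("disallow", "/search/"),
    ("disallow", "/*?s=*"),
    ("disallow", "/?p=*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Apply the longest matching rule; on a tie, prefer Allow.

    A path that matches no rule is allowed by default.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, /wp-admin/admin-ajax.php is allowed because the Allow pattern is longer than the Disallow /wp-admin/ prefix, while any other path under /wp-admin/ stays blocked. Note that Python's built-in urllib.robotparser applies rules in file order and does not expand wildcards, so it can give a different answer for this group.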

ccbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

meltwater

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

perplexity-ai

Rule Path
Disallow /

seekr

Rule Path
Disallow /
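
Each of the named groups above carries a single Disallow / rule, so a crawler identifying itself with one of those tokens is excluded from the whole site, and every other crawler falls back to the * group. The sketch below illustrates that group selection with a case-insensitive match on the product token. The names BLOCKED_AGENTS, group_for and is_fully_blocked are hypothetical.

```python
# User-agent tokens that have their own "Disallow /" group in this scan.
BLOCKED_AGENTS = {
    "ccbot", "anthropic-ai", "cohere-ai", "ia_archiver", "omgili",
    "omgilibot", "mj12bot", "piplbot", "google-extended", "meltwater",
    "claude-web", "claudebot", "perplexity-ai", "seekr",
}


def group_for(product_token: str) -> str:
    """Return the group that applies to a crawler's product token.

    Tokens are compared case-insensitively; anything without its own
    group is governed by the "*" group.
    """
    token = product_token.strip().lower()
    return token if token in BLOCKED_AGENTS else "*"


def is_fully_blocked(product_token: str) -> bool:
    """True when the matching group is one of the blanket "Disallow /" groups."""
    return group_for(product_token) != "*"
```

For example, group_for("CCBot") selects the ccbot group, while a token with no dedicated group, such as Googlebot, is governed by the path rules in the * group.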

Other Records

Field Value
sitemap https://www.thesun.co.uk/sitemap.xml
sitemap https://www.thesun.co.uk/news-sitemap.xml
sitemap https://www.thesun.co.uk/nav-sitemap.xml
sitemap https://www.thesun.co.uk/author-sitemap.xml

Comments

  • Sitemap archive
  • News Sitemap
  • Nav Sitemap
  • Author Sitemap