cyclist.co.uk
robots.txt

Robots Exclusion Standard data for cyclist.co.uk

Resource Scan

Scan Details

Site Domain cyclist.co.uk
Base Domain cyclist.co.uk
Scan Status Ok
Last Scan 2024-10-05T20:01:39+00:00
Next Scan 2024-10-12T20:01:39+00:00

Last Scan

Scanned 2024-10-05T20:01:39+00:00
URL https://cyclist.co.uk/robots.txt
Redirect https://www.cyclist.co.uk/robots.txt
Redirect Domain www.cyclist.co.uk
Redirect Base cyclist.co.uk
Domain IPs 34.252.192.49, 52.211.216.48
Redirect IPs 34.252.192.49, 52.211.216.48
Response IP 52.211.216.48
Found Yes
Hash f68854051347a803cbcc514c1a1a708b0edfa2e24bac2baf46dc1635f7de5f0a
SimHash 531cc1708e91

Groups

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

*

Rule Path
Disallow /account/
Disallow /news/email-template/*

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

seekr

Rule Path
Disallow /

neticlebot

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

livelapbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

meltwater

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.cyclist.co.uk/google_news_sitemap.xml
sitemap https://www.cyclist.co.uk/sitemap_index.xml
sitemap https://www.cyclist.co.uk/sitemap
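
Example

The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser module. It assumes a reconstructed excerpt of the file: the scan only lists parsed groups, so the exact text, casing, and ordering of the original robots.txt are not known. The names ROBOTS_TXT, build_parser, is_allowed, and the "ExampleBrowser" user agent are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (an assumption): the report shows parsed groups,
# not the original file text, so formatting and ordering may differ.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: *
Disallow: /account/
Disallow: /news/email-template/*

Sitemap: https://www.cyclist.co.uk/sitemap_index.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch the given path on the redirect domain."""
    return parser.can_fetch(user_agent, "https://www.cyclist.co.uk" + path)


PARSER = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for agent, path in [
        ("GPTBot", "/"),
        ("ExampleBrowser", "/account/login"),
        ("ExampleBrowser", "/reviews/"),
    ]:
        print(agent, path, is_allowed(PARSER, agent, path))
```

Agents with their own group (such as gptbot) are blocked from every path, while any other agent falls through to the * group and is blocked only from the listed prefixes. One caveat: the standard-library parser matches rules by plain path prefix and, as far as I know, does not expand the * wildcard inside a path, so the /news/email-template/* rule may be evaluated differently here than by crawlers that follow RFC 9309 wildcard matching.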