cycleworld.com
robots.txt

Robots Exclusion Standard data for cycleworld.com

Resource Scan

Scan Details

Site Domain cycleworld.com
Base Domain cycleworld.com
Scan Status Ok
Last Scan 2024-11-09T18:46:39+00:00
Next Scan 2024-11-16T18:46:39+00:00
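
The two timestamps above imply a rescan interval, which can be checked with a short sketch. The variable names are illustrative, and the report itself does not state that the schedule is fixed; the seven-day gap is only what these two values show.

```python
from datetime import datetime

# Timestamps copied from the Scan Details rows above.
last_scan = datetime.fromisoformat("2024-11-09T18:46:39+00:00")
next_scan = datetime.fromisoformat("2024-11-16T18:46:39+00:00")

# Difference between the scheduled next scan and the last completed scan.
interval = next_scan - last_scan
print(interval)  # 7 days, 0:00:00
```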

Last Scan

Scanned 2024-11-09T18:46:39+00:00
URL https://cycleworld.com/robots.txt
Redirect https://www.cycleworld.com:443/robots.txt
Redirect Domain www.cycleworld.com
Redirect Base cycleworld.com
Domain IPs 15.197.174.213, 3.33.166.34
Redirect IPs 23.46.230.151, 23.46.230.153, 2600:1413:b000:13::b857:c18e, 2600:1413:b000:13::b857:c196
Response IP 23.45.207.171
Found Yes
Hash 1472ef2824cd44f18437ca4ed578f6b379a55583ca19623a22eaf47db5ab777b
SimHash ac44daf2a503
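
The Hash value above is 64 hexadecimal digits, the length of a SHA-256 digest. The report does not name the algorithm or say which bytes were hashed, so the following is only a sketch under the assumption that the digest is SHA-256 over the raw response body. `body_digest` is a hypothetical helper, not part of the scanner.

```python
import hashlib

# Hash value copied from the report.
REPORTED_HASH = "1472ef2824cd44f18437ca4ed578f6b379a55583ca19623a22eaf47db5ab777b"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Compare a freshly fetched body against the reported hash."""
    return body_digest(body) == REPORTED_HASH


# The reported value has the length of a SHA-256 hex digest.
print(len(REPORTED_HASH) == len(body_digest(b"")))  # True
```

If the assumption holds, `matches_report` returns `True` only while the served file is byte-for-byte unchanged since the scan. Any edit, even to whitespace, changes the digest.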

Groups

gigabot

Rule Path
Disallow /

scrubby

Rule Path
Disallow /

nutch

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

asterias

Rule Path
Disallow /

*

Rule Path
Disallow /au/
Disallow /ca/
Disallow /fr/
Disallow /ca/
Disallow /fr/
Disallow /de/
Disallow /in/
Disallow /it/
Disallow /jp/
Disallow /mx/
Disallow /es/
Disallow /uk/

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.cycleworld.com/arcio/sitemap-index/index/
sitemap https://www.cycleworld.com/arcio/fronts-sitemap/
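
To see how a crawler might interpret the groups and records listed above, the sketch below rebuilds an approximate robots.txt from the report and evaluates it with Python's standard `urllib.robotparser`. Several things here are assumptions: the rebuilt text is not the file as served, the crawl-delay is assumed to belong to the `*` group, repeated Disallow paths are collapsed, and `ExampleBot` is a made-up user agent. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Values copied from the report; the layout of the rebuilt file is an assumption.
BLOCKED_AGENTS = ["gigabot", "scrubby", "nutch", "baiduspider",
                  "naverbot", "yeti", "asterias"]
REGION_PATHS = ["/au/", "/ca/", "/fr/", "/de/", "/in/",
                "/it/", "/jp/", "/mx/", "/es/", "/uk/"]
SITEMAPS = [
    "https://www.cycleworld.com/arcio/sitemap-index/index/",
    "https://www.cycleworld.com/arcio/fronts-sitemap/",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the scan report."""
    lines = ["# Disallow the following spiders"]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in REGION_PATHS]
    lines += ["Crawl-delay: 10", ""]  # assumed to sit in the "*" group
    lines += [f"Sitemap: {url}" for url in SITEMAPS]
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

base = "https://www.cycleworld.com"
print(parser.can_fetch("Baiduspider", base + "/"))          # False: blocked entirely
print(parser.can_fetch("ExampleBot", base + "/uk/page"))    # False: regional path
print(parser.can_fetch("ExampleBot", base + "/news/"))      # True: no matching rule
print(parser.crawl_delay("ExampleBot"))                     # 10
print(parser.site_maps())
```

Under these assumptions, the seven named crawlers are shut out of the whole site. Every other agent is excluded only from the regional path prefixes and is asked to wait 10 seconds between requests.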

Comments

  • Disallow the following spiders