xxl.fi
robots.txt

Robots Exclusion Standard data for xxl.fi

Resource Scan

Scan Details

Site Domain xxl.fi
Base Domain xxl.fi
Scan Status Ok
Last Scan 2024-09-22T02:20:53+00:00
Next Scan 2024-10-22T02:20:53+00:00

Last Scan

Scanned 2024-09-22T02:20:53+00:00
URL https://xxl.fi/robots.txt
Redirect https://www.xxl.fi/robots.txt
Redirect Domain www.xxl.fi
Redirect Base xxl.fi
Domain IPs 52.222.144.16, 52.222.144.48, 52.222.144.52, 52.222.144.76
Redirect IPs 18.155.202.23, 18.155.202.58, 18.155.202.63, 18.155.202.73, 2600:9000:23d1:1200:1b:87b2:5540:93a1, 2600:9000:23d1:200:1b:87b2:5540:93a1, 2600:9000:23d1:5400:1b:87b2:5540:93a1, 2600:9000:23d1:5c00:1b:87b2:5540:93a1, 2600:9000:23d1:6a00:1b:87b2:5540:93a1, 2600:9000:23d1:9400:1b:87b2:5540:93a1, 2600:9000:23d1:ac00:1b:87b2:5540:93a1, 2600:9000:23d1:ee00:1b:87b2:5540:93a1
Response IP 65.9.112.3
Found Yes
Hash c4ec49fbc8caad49989aa907299f84baf1df6a3651179868979bb6c94a8b8f50
SimHash 64547f1697fa

Groups

*

Rule Path
Disallow /account
Disallow /cart
Disallow /checkout
Disallow /login
Disallow /search
Disallow *?*Renkaan%2Bkoko=*
Disallow *?*Pronaatio=*
Disallow *?*Ik%C3%A4=*
Disallow *?*Moottorin%2Bsensori=*
Disallow *?*Rungon%2Bmateriaali=*
Disallow *?*Osasarja%2B%2F%2BVaihtaja=*
Disallow *?*Jarrutyyppi=*
Disallow *?*Vaihteiden%2Blukum%C3%A4%C3%A4r%C3%A4=*
Disallow *?*Moottorin%2Bsijainti=*
Disallow *?*Akun%2Bkapasiteetti=*
Disallow *?*Moottorin%2Bvalmistaja=*
Disallow *?*Py%C3%B6r%C3%A4ilykeng%C3%A4t=*
Disallow *?*GPS=*
Disallow *?*Tavaratalot=*
Disallow *?*Kampanja=*
Disallow *?*Suksityyppi=*
Disallow *?*Taso=*
Disallow *?*Maasto=*
Disallow *?*Base=*
Disallow *?*Sidetyyppi=*
Disallow *?*Suksen%2Bleveys=*
Disallow *?*Siteet=*
Disallow *?*K%C3%A4velytoiminto=*
Disallow *?*Materiaali=*
Disallow *?*Teleskooppisauva=*
Disallow *?*Olosuhteet=*
Disallow *?*Voideltava=*
Disallow *?*Fluoripitoisuus=*
Disallow *?*Luistovoiteen%2Btyyppi=*
Disallow *?*Kampanja=*
Disallow *?*Etuhaarukka=*
Disallow *?*Py%C3%B6r%C3%A4n%2Blukot=*
Disallow *?*Harjoitusvastus=*
Disallow *?*Makuupussin%2Bl%C3%A4mp%C3%B6arvo=*
Disallow *?*Polttoaine=*
Disallow *?*Paino=*
Disallow *?*Vesityyppi=*
Disallow *?*Haulikon%2Bkaliiberi=*
Disallow *?*Lyijy%2B%2F%2BLyijyt%C3%B6n=*
Disallow *?*Haulikon%2Bmekanismi=*
Disallow *?*Runkoputken%2Bhalkaisija=*
Disallow *?*Kiv%C3%A4%C3%A4rin%2Bkaliiberi=*
Disallow *?*Sarja=*
Disallow *?*NHL-joukkue%2B=*
Disallow *?*NHL-joukkue=*
Disallow *?*Golfmailan%2Btyyppi=*
Disallow *?*M%C3%A4rk%C3%A4puvun%2Bpaksuus=*
Disallow *?*Puoli=*
Disallow *?*M%C3%A4rk%C3%A4puvun%2Bpituus=*
Disallow *?*Golfsetin%2Btyyppi=*
Disallow *?*Golfpallot=*
Disallow *?*Sukellus=*
Disallow *?*M%C3%A4rk%C3%A4puku=*
Disallow *?*Pelastusliivit=*
Disallow *?*Hinta=*
Disallow *?*Tuoteryhm%C3%A4=*
Disallow *?*Tuoteryhm%C3%A4=*
Disallow *?*Style%2BSwatch=*
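The facet rules above rely on the `*` wildcard, which under the Robots Exclusion Protocol (RFC 9309) matches any sequence of characters, with rules matched from the start of the path. A minimal sketch of that matching follows. It assumes the common `$` end-anchor extension and does no percent-encoding normalization. The helper names `rule_to_regex` and `is_disallowed` and the sample paths under `/c/` are hypothetical and not taken from xxl.fi.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A small subset of the rules listed above.
RULES = ["/account", "/search", "*?*Hinta=*", "*?*Renkaan%2Bkoko=*"]

# Hypothetical paths, for illustration only.
print(is_disallowed("/account/orders", RULES))                 # True
print(is_disallowed("/c/100000/pyoraily?Hinta=0-100", RULES))  # True
print(is_disallowed("/c/100000/pyoraily", RULES))              # False
```

Because each facet pattern starts with `*?*`, it applies to any URL whose query string contains the named parameter, regardless of the path or the parameter's position in the query.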

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

moget
ichiro

Rule Path
Disallow /

naverbot
yeti

Rule Path
Disallow /

baiduspider
baiduspider-video
baiduspider-image

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

mauibot (crawler.feedback+wc@gmail.com)

Rule Path
Disallow /

*

Rule Path
Disallow /awesomeproduct
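The `*` user agent appears in two separate groups in this listing. RFC 9309 says that when more than one group matches a crawler, their rules are combined into one group. The sketch below illustrates that merging together with the fallback to `*`. It assumes exact, case-insensitive matching on the user-agent token, which is simpler than what real crawlers do. `GROUPS` holds only a subset of the groups above, and `merge_groups` and `rules_for` are hypothetical helper names.

```python
from collections import defaultdict

# (user agents, Disallow paths) pairs, in file order; a subset of the listing.
GROUPS = [
    (["*"], ["/account", "/cart"]),
    (["mj12bot"], ["/"]),
    (["moget", "ichiro"], ["/"]),
    (["*"], ["/awesomeproduct"]),
]


def merge_groups(groups):
    """Combine the rules of all groups that name the same user agent."""
    merged = defaultdict(list)
    for agents, rules in groups:
        for agent in agents:
            merged[agent.lower()].extend(rules)
    return dict(merged)


def rules_for(product_token, groups):
    """Return the rules for a crawler, falling back to the '*' group."""
    merged = merge_groups(groups)
    return merged.get(product_token.lower(), merged.get("*", []))


print(rules_for("MJ12bot", GROUPS))    # ['/']
print(rules_for("Googlebot", GROUPS))  # ['/account', '/cart', '/awesomeproduct']
```

Under this reading, a crawler without its own group is subject to `/awesomeproduct` as well as the first `*` group's rules, while a crawler with its own group (such as `mj12bot`) sees only that group.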

Comments

  • For all robots
  • Block access to specific groups of pages
  • Block access to search results
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base URLs properly
  • Block Gigabot
  • Block Chinese, Korean and Russian bots
  • Block legacy facets
  • Disallow: *?*_mv*=*