foodholland.nl
robots.txt

Robots Exclusion Standard data for foodholland.nl

Resource Scan

Scan Details

Site Domain foodholland.nl
Base Domain foodholland.nl
Scan Status Ok
Last Scan 2024-10-25T10:04:09+00:00
Next Scan 2024-11-24T10:04:09+00:00

Last Scan

Scanned 2024-10-25T10:04:09+00:00
URL https://foodholland.nl/robots.txt
Redirect https://www.foodholland.nl/robots.txt
Redirect Domain www.foodholland.nl
Redirect Base foodholland.nl
Domain IPs 2001:7b8:62b:2:0:d4ff:fe72:7961, 212.114.121.97
Redirect IPs 2001:7b8:62b:2:0:d4ff:fe72:7961, 212.114.121.97
Response IP 212.114.121.97
Found Yes
Hash c1a7cd2ed24b489527c10437bfe737ce1315d60a54283195b3cb3e317e01c71c
SimHash 3a4d77f2fc9a

Groups

sphider-agriholland

Rule Path
Disallow /images/

googlebot

Rule Path
Disallow /images/

gulliver

Rule Path
Disallow /images/

htdig

Rule Path
Disallow /images/

infoseek

Rule Path
Disallow /images/

jeeves

Rule Path
Disallow /images/

jooblebot

Rule Path
Disallow /images/

looksmart

Rule Path
Disallow /images/

ingrid

Rule Path
Disallow /images/

lycosnl

Rule Path
Disallow /images/

vagabondo

Rule Path
Disallow /images/

mozilla/5.0 (slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)

Rule Path
Disallow /images/

nederland.zoek

Rule Path
Disallow /images/

scooter

Rule Path
Disallow /images/

scooter/1.0

Rule Path
Disallow /images/

slurp

Rule Path
Disallow /images/

webferret

Rule Path
Disallow /images/

snooper

Rule Path
Disallow /images/

msnbot

Rule Path
Disallow /images/

surfbot

Rule Path
Disallow /images/

woelmuis.nl

Rule Path
Disallow /images/

yahoo! slurp

Rule Path
Disallow /images/

mediapartners-google

Rule Path
Disallow /images/

wizenozespider

Rule Path
Disallow /images/

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

mediatoolkitbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

trendkite-akashic-crawler

Rule Path
Disallow /

*

Rule Path
Disallow /
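
The groups above fall into three patterns: named crawlers barred only from /images/, named crawlers barred from the whole site, and a catch-all (*) group that bars every other crawler. The sketch below shows one way to check how such rules resolve, using Python's standard urllib.robotparser. The robots.txt text is a reduced excerpt reconstructed from the listing (not the raw file), and the image path and the agent name "SomeOtherBot" are hypothetical values chosen for illustration. Python's parser applies its own matching logic, so a real crawler may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

# Reduced excerpt reconstructed from the groups in the scan listing.
# Assumption: the real file contains many more groups than these three.
ROBOTS_EXCERPT = """\
User-agent: googlebot
Disallow: /images/

User-agent: ahrefsbot
Disallow: /

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text directly, without any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)
BASE = "https://www.foodholland.nl"

# Hypothetical (agent, path) pairs; neither the image path nor
# "SomeOtherBot" comes from the scan.
checks = [
    ("Googlebot", "/images/logo.png"),  # named group, path under /images/
    ("Googlebot", "/"),                 # named group, path outside /images/
    ("AhrefsBot", "/"),                 # named group with Disallow /
    ("SomeOtherBot", "/"),              # no named group, falls back to *
]

for agent, path in checks:
    allowed = parser.can_fetch(agent, BASE + path)
    print(f"{agent:<14} {path:<18} allowed={allowed}")
```

In this sketch, a crawler that matches a named group is judged only by that group's rules, so the /images/ groups leave the rest of the site open to those crawlers, while an unlisted crawler falls through to the * group and is barred from every path.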

Other Records

Field Value
sitemap https://www.agriholland.nl/sitemap.xml

Comments

  • Robots.txt file https://www.agriholland.nl
  • this one is included by special request:
  • Block completely
  • Block MJ12bot as it is just noise
  • Block CazoodleBot as it does not present correct accept content headers
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block trendkite-akashic-crawler