ennatuurlijk.nl
robots.txt

Robots Exclusion Standard data for ennatuurlijk.nl

Resource Scan

Scan Details

Site Domain ennatuurlijk.nl
Base Domain ennatuurlijk.nl
Scan Status Ok
Last Scan 2025-03-15T07:40:34+00:00
Next Scan 2025-04-14T07:40:34+00:00

Last Scan

Scanned 2025-03-15T07:40:34+00:00
URL https://ennatuurlijk.nl/robots.txt
Domain IPs 87.233.157.153
Response IP 87.233.157.153
Found Yes
Hash a6ee0475197416002b33b4b30885b19d76a6a9790cc3a8d5910363fdee628c15
SimHash 219f2d914258
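
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The sketch below shows how such a fingerprint could be computed. It assumes the digest is taken over the raw response body with no normalization; the scanner's actual procedure is not stated in this report, and `robots_fingerprint` is a hypothetical helper name. The SimHash value is not reproduced because its algorithm and parameters are not given.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the digest covers the raw bytes exactly as received.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative input only, not the real file content.
sample = b"User-agent: *\nDisallow: /admin\n"
digest = robots_fingerprint(sample)
print(len(digest), digest)
```

Comparing the digest from two scans is a cheap way to tell whether the file changed at all between them.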

Groups

googlebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 5
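
A Disallow rule with an empty path disallows nothing, so the googlebot group above permits the whole site and only asks for a delay between requests. The following minimal sketch checks that reading with Python's standard `urllib.robotparser`. The three input lines are reconstructed from the table above and are an assumption about how the original file spells them.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the googlebot group in this report (assumed spelling).
lines = [
    "User-agent: googlebot",
    "Disallow:",
    "Crawl-delay: 5",
]

parser = RobotFileParser()
parser.parse(lines)

# An empty Disallow value leaves every path open to this agent.
allowed = parser.can_fetch("googlebot", "https://ennatuurlijk.nl/any/page")
# The delay is reported in seconds, as an integer.
delay = parser.crawl_delay("googlebot")

print(allowed, delay)
```

Whether a given crawler actually honors crawl-delay is up to that crawler; the field is not part of RFC 9309.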

*

Rule Path
Disallow /amCommon
Disallow /admin
Disallow /htmleditor
Disallow /worldpay
Disallow /xml
Disallow /xsl
Disallow /originals
Disallow /images
Disallow /db
Disallow /logs

Other Records

Field Value
crawl-delay 30

baiduspider

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

busiverse

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mirago-test-robot (http://www.miragorobot.com)

Rule Path
Disallow /

msnbot

Rule Path
Disallow /*.gif$
Disallow /*.jpeg$
Disallow /
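
The msnbot rules use the two special characters defined in RFC 9309: `*` matches any sequence of characters and a trailing `$` anchors the pattern to the end of the path. The sketch below translates such a pattern into a regular expression to show how the three rules behave. The helper names `pattern_to_regex` and `matches` are hypothetical, and real crawlers may differ in edge cases such as percent-encoding. Note that Python's `urllib.robotparser` does plain prefix matching and does not interpret these wildcards.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    # Escape literal parts and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Return True if the path matches the pattern from its first character."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.gif$", "/images/logo.gif"))      # True
print(matches("/*.gif$", "/images/logo.gif?v=2"))  # False
print(matches("/", "/images/logo.gif?v=2"))        # True
```

Because `Disallow /` matches every path, the two image rules in this group do not exclude anything that the last rule does not already exclude.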

psbot

Rule Path
Disallow /

sirketce

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

semanticdiscovery

Rule Path
Disallow /

sogou

Rule Path
Disallow /

soso

Rule Path
Disallow /

sosoimagespider

Rule Path
Disallow /

tineye

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

webalta

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

zermelo

Rule Path
Disallow /

*

Rule Path
Disallow /flash
Disallow /404.asp

*

Rule Path
Allow /core/*.css$
Allow /core/*.css?
Allow /core/*.js$
Allow /core/*.js?
Allow /core/*.gif
Allow /core/*.jpg
Allow /core/*.jpeg
Allow /core/*.png
Allow /core/*.svg
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /profiles/*.svg
Disallow /core/
Disallow /profiles/
Disallow /README.txt
Disallow /web.config
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips
Disallow /node/add/
Disallow /search/
Disallow /user/register
Disallow /user/password
Disallow /user/login
Disallow /user/logout
Disallow /index.php/admin/
Disallow /index.php/comment/reply/
Disallow /index.php/filter/tips
Disallow /index.php/node/add/
Disallow /index.php/search/
Disallow /index.php/user/password
Disallow /index.php/user/register
Disallow /index.php/user/login
Disallow /index.php/user/logout

Other Records

Field Value
sitemap https://ennatuurlijk.nl/sitemap.xml
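
The Allow and Disallow rules in this group overlap: `/core/misc/drupal.css` matches both `Allow /core/*.css$` and `Disallow /core/`. Under RFC 9309 the longest matching pattern wins, and Allow wins a tie, so stylesheets, scripts and images stay crawlable while the rest of `/core/` and `/profiles/` does not. RFC 9309 also says that several groups for the same user agent, such as the three `*` groups in this file, are combined into one. The sketch below applies the longest-match rule to a few rules copied from the group above; `is_allowed` and the example paths are hypothetical, and individual crawlers may not follow the RFC exactly.

```python
import re


def _compile(pattern: str) -> re.Pattern:
    """Translate a robots.txt pattern ('*' wildcard, trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Apply longest-match precedence to a list of (kind, pattern) rules.

    kind is "allow" or "disallow". A path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if not pattern or not _compile(pattern).match(path):
            continue
        allow = kind == "allow"
        # Longer pattern wins; on equal length, Allow wins.
        if len(pattern) > best_len or (len(pattern) == best_len and allow):
            best_len, best_allow = len(pattern), allow
    return best_allow


# A subset of the rules listed above.
rules = [
    ("allow", "/core/*.css$"),
    ("allow", "/core/*.js?"),
    ("disallow", "/core/"),
    ("disallow", "/user/login"),
]

for path in ["/core/misc/drupal.css", "/core/install.php",
             "/core/misc/drupal.js?v=9", "/user/login", "/node/1"]:
    print(path, is_allowed(rules, path))
```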

Comments

  • A L L O W E D
  • Allow Google Bot
  • D I S A L L O W E D
  • Disallow all spidering of images and hub
  • Disallow Baidu Bot (Japanese)
  • Disallow Boitho dc Bot (Norway)
  • Disallow Busiverse Bot (Turkey Sirketce/Busiverse )
  • Disallow CazoodleBot - from University of Illinois
  • Disallow Exabot Bot - Exalead
  • Disallow Google Image Bot
  • Disallow heritrix Bot - from Yell.Com
  • Disallow IRLbot - IRL Texas AM research bot
  • Disallow Jyxobot - Czech Webcrawler for Jyxo
  • Disallow Majestic12.co.uk
  • Disallow Mirago.com
  • Disallow MSN from seeing gifs and jpgs
  • Disallow NimbleCrawler (http://www.webmasterworld.com/forum93/858.htm)
  • Disallow psbot spidering of images and hub
  • Disallow Sirketce Bot (Turkey Sirketce/Busiverse )
  • Disallow Seekbot - http://www.seekport.co.uk/seekbot/
  • Disallow semanticdiscovery - from Southern Utah University (Computer Science Dept.)
  • Disallow Sogou - Chinese Search Engine
  • Disallow SoSo - Chinese Search Engine
  • Disallow SoSoImageSpider - Chinese picture Search Engine
  • Disallow TinEye - Image trawler Search Engine
  • Disallow TurnITin - "This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism"
  • Disallow Twiceler - Cuill (also Barred IPs on firewall)
  • Disallow Voilabot Bot - France Telecom
  • Disallow WebAlta Bot - Russian
  • Disallow Yahoo Image Bot
  • Disallow YodaoBot - Chinese Search Engine
  • Disallow zermelo - Bot du Jour from Amazon - may need to block IP range
  • Disallow All Bots to see in '/flash' folder
  • CSS, JS, Images
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)

Warnings

  • `useragent` is not a known field.
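
A warning like the one above typically means a line used the field name `useragent` where `user-agent` was intended, so the parser could not attach the following rules to a group. The sketch below shows one way such a check could work. The set of known fields and the function name `unknown_fields` are assumptions for illustration, not the scanner's actual rules, and the sample text is hypothetical rather than taken from the scanned file.

```python
# Assumed set of recognized field names (lowercase).
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def unknown_fields(text: str) -> list[str]:
    """Return unrecognized field names in order of first appearance."""
    found = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical input with a misspelled field name.
sample = "Useragent: ExampleBot\nDisallow: /\n\nUser-agent: *\nDisallow: /admin\n"
print(unknown_fields(sample))  # ['useragent']
```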