infomil.nl
robots.txt

Robots Exclusion Standard data for infomil.nl

Resource Scan

Scan Details

Site Domain infomil.nl
Base Domain infomil.nl
Scan Status Ok
Last Scan 2024-09-21T07:51:56+00:00
Next Scan 2024-10-05T07:51:56+00:00

Last Scan

Scanned 2024-09-21T07:51:56+00:00
URL https://infomil.nl/robots.txt
Redirect https://www.infomil.nl/robots.txt
Redirect Domain www.infomil.nl
Redirect Base infomil.nl
Domain IPs 185.38.232.210, 2a02:720:9:8::210
Redirect IPs 185.38.232.220, 2a02:720:9:8::220
Response IP 185.38.232.220
Found Yes
Hash 46d3a3df004578a572daec6b1f551ff00e8b415c44d1dcbbf3d8189d2026f4c7
SimHash 631eea604c5c

Groups

simplepie

Rule Path
Disallow /

curl

Rule Path
Disallow /

python urllib

Rule Path
Disallow /

osce

Rule Path
Disallow /

wget

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

genieo

Rule Path
Disallow /

jobdiggerspider

Rule Path
Disallow /

exabot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

abonti

Rule Path
Disallow /

linkchecker

Rule Path
Disallow /

jetslide

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

eknip

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

kingspider

Rule Path
Disallow /

twitterbot

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

*

Rule Path
Disallow /aspx/
Allow /aspx/read.aspx
Disallow /*?*pdf=true*
Disallow /*?*PDF=true*
Disallow PDF%3Dtrue
Disallow *PDF%3Dtrue*
Disallow *export%3Dpdf*
Disallow export%3Dpdf
Disallow /*?*rss=true*
Disallow /*?*CalDtm=*
Disallow /*?*zoeken_term=*
Disallow /*?*Zoe=*
Disallow /*?*zoeken_metwildcard=true*
Disallow /*?*pager_page*
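
The wildcard rules in the `*` group can be hard to read at a glance. The following is a minimal sketch of how a crawler might evaluate a few of them, assuming the longest-match semantics described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` are hypothetical, only a subset of the rules above is included, and the rules that do not start with `/` or `*` are left out because parsers may treat them differently.

```python
import re

# A subset of the "*" group rules listed above, in file order.
RULES = [
    ("disallow", "/aspx/"),
    ("allow", "/aspx/read.aspx"),
    ("disallow", "/*?*pdf=true*"),
    ("disallow", "/*?*rss=true*"),
    ("disallow", "/*?*pager_page*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    (Hypothetical helper, not part of any library.)
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under `rules`.

    Assumes longest-pattern-wins, with Allow preferred on equal length.
    A path that matches no rule is allowed.
    """
    best_len = -1
    verdict = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                verdict = kind == "allow"
    return verdict


if __name__ == "__main__":
    for p in ["/aspx/read.aspx", "/aspx/other.aspx", "/nieuws?pdf=true", "/nieuws"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

Under these assumptions, `/aspx/read.aspx` stays reachable even though `/aspx/` is disallowed, because the Allow pattern is the longer match. The example paths such as `/nieuws` are illustrative and not taken from the scan.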

Other Records

Field Value
crawl-delay 30
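
As a rough illustration of how a client could read a crawl delay and a per-agent block, here is a sketch using Python's standard `urllib.robotparser`. The inline `SAMPLE` text is a hand-written reconstruction, not the actual file. It assumes the `crawl-delay` line sits inside the `*` group, which the scan does not confirm since it lists the value under Other Records. The agent name `examplebot` is hypothetical. Note that this standard-library parser applies rules in file order and does not interpret `*` wildcards in paths, so only the plain-prefix rules are shown.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt for illustration; placement of Crawl-delay is an assumption.
SAMPLE = """\
User-agent: wget
Disallow: /

User-agent: *
Crawl-delay: 30
Disallow: /aspx/
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# A named group takes precedence over the "*" group for that agent.
wget_ok = parser.can_fetch("wget", "https://www.infomil.nl/")

# Agents without their own group fall back to the "*" group.
other_blocked = parser.can_fetch("examplebot", "https://www.infomil.nl/aspx/page.aspx")
other_ok = parser.can_fetch("examplebot", "https://www.infomil.nl/nieuws")

# Seconds to wait between requests, or None if no delay is declared.
delay = parser.crawl_delay("examplebot")

print(wget_ok, other_blocked, other_ok, delay)
```

A polite client would sleep for `delay` seconds between requests when the value is not `None`.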

Comments

  • deny 80legs.com crawler

Warnings

  • 5 invalid lines.