kolzchut.org.il
robots.txt

Robots Exclusion Standard data for kolzchut.org.il

Resource Scan

Scan Details

Site Domain kolzchut.org.il
Base Domain kolzchut.org.il
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-08-02T03:26:50+00:00
Next Scan 2024-10-31T03:26:50+00:00

Last Successful Scan

Scanned 2023-12-14T02:41:52+00:00
URL https://kolzchut.org.il/robots.txt
Redirect https://www.kolzchut.org.il/robots.txt
Redirect Domain www.kolzchut.org.il
Redirect Base kolzchut.org.il
Domain IPs 176.58.121.250
Redirect IPs 13.107.213.59, 13.107.246.59, 2620:1ec:46::59, 2620:1ec:bdf::59
Response IP 13.107.213.59
Found Yes
Hash d53c8ad1d94e771beed3ab6dc97e299c318ad6aa8e25735b96625ce42a3b586b
SimHash a254af55cd76
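
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not name the algorithm or say exactly which bytes were hashed, so the following is only a minimal sketch under the assumption that it is SHA-256 over the raw response body. The function name `body_matches_report` is hypothetical.

```python
import hashlib

# Hash value copied from the "Last Successful Scan" fields of this report.
REPORTED_HASH = "d53c8ad1d94e771beed3ab6dc97e299c318ad6aa8e25735b96625ce42a3b586b"


def body_matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a robots.txt body against a reported hex digest.

    Assumption: the digest is SHA-256 of the raw body bytes. The report
    itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == reported.lower()
```

If the assumption holds, passing the bytes of a freshly fetched robots.txt to this function would show whether the file has changed since the last successful scan.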

Groups

*

Rule Path
Allow /

mediapartners-google

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

*

Rule Path
Disallow /w/*/index.php
Disallow /w/*/api.php

*

Rule Path
Allow /w/he/index.php?title=%D7%9E%D7%99%D7%95%D7%97%D7%93%3A%D7%A9%D7%99%D7%A0%D7%95%D7%99%D7%99%D7%9D_%D7%90%D7%97%D7%A8%D7%95%D7%A0%D7%99%D7%9D&feed=atom

*

Rule Path
Disallow /w/*/%28includes%7Cextensions%7Cskins%29

*

Rule Path
Disallow /ChangeRequest/
Disallow /forms/

*

Rule Path
Disallow /he/%D7%9E%D7%99%D7%95%D7%97%D7%93%3ADrilldown
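
Several rule paths above are percent-encoded UTF-8, which makes them hard to read. As a small illustration, the standard library function `urllib.parse.unquote` decodes them back to their Hebrew page titles and punctuation. The list name `ENCODED_PATHS` is just for this sketch.

```python
from urllib.parse import unquote

# Percent-encoded rule paths copied from the groups above.
ENCODED_PATHS = [
    "/w/he/index.php?title=%D7%9E%D7%99%D7%95%D7%97%D7%93%3A"
    "%D7%A9%D7%99%D7%A0%D7%95%D7%99%D7%99%D7%9D_"
    "%D7%90%D7%97%D7%A8%D7%95%D7%A0%D7%99%D7%9D&feed=atom",
    "/w/*/%28includes%7Cextensions%7Cskins%29",
    "/he/%D7%9E%D7%99%D7%95%D7%97%D7%93%3ADrilldown",
]

# unquote() decodes %XX sequences as UTF-8 by default.
DECODED_PATHS = [unquote(path) for path in ENCODED_PATHS]

for path in DECODED_PATHS:
    print(path)
```

The second path decodes to `/w/*/(includes|extensions|skins)`, and the prefix shared by the first and third decodes to the Hebrew namespace name followed by a colon.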

Comments

  • robots.txt for http://www.kolzchut.org.il
  • Start by allowing all bots
  • Then prevent *some* advertising-related bots:
  • Google AdSense bot
  • Google image crawler (for image search engine)
  • no access to editing links and the like
  • The following are disallowed by a robots meta tag,
  • but going by the robots meta tag is not good enough
  • because the bot traffic still hits the server
  • but allow access to recentchanges
  • Leave our internal directories alone
  • no point in a bot accessing the forms
  • dynamic pages are disallowed using the robots meta tag inside the wiki.
  • But we still disallow specific pages, because some bastards hit them anyway...
  • I'm looking at you, bing
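
The comments describe a layered policy: allow everything, block two named crawlers entirely, then block specific paths while allowing the recent-changes feed. The sketch below shows one way such rules could be evaluated, assuming RFC 9309-style matching (groups for the same user agent are merged, the longest matching pattern wins, Allow wins ties, and only `*` and `$` are special). It is not the scanner's or any crawler's actual implementation. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and user-agent selection is simplified to an exact name match with a fallback to `*`.

```python
import re
from urllib.parse import unquote

# (user-agent, rule, path) entries copied from the Groups listing of this report.
RULES = [
    ("*", "allow", "/"),
    ("mediapartners-google", "disallow", "/"),
    ("googlebot-image", "disallow", "/"),
    ("*", "disallow", "/w/*/index.php"),
    ("*", "disallow", "/w/*/api.php"),
    ("*", "allow",
     "/w/he/index.php?title=%D7%9E%D7%99%D7%95%D7%97%D7%93%3A"
     "%D7%A9%D7%99%D7%A0%D7%95%D7%99%D7%99%D7%9D_"
     "%D7%90%D7%97%D7%A8%D7%95%D7%A0%D7%99%D7%9D&feed=atom"),
    ("*", "disallow", "/w/*/%28includes%7Cextensions%7Cskins%29"),
    ("*", "disallow", "/ChangeRequest/"),
    ("*", "disallow", "/forms/"),
    ("*", "disallow", "/he/%D7%9E%D7%99%D7%95%D7%97%D7%93%3ADrilldown"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Simplification: both sides are percent-decoded before comparison, so an
    encoded '*' or '$' in a pattern would be misread as a special character.
    """
    decoded = unquote(pattern)
    anchored = decoded.endswith("$")
    if anchored:
        decoded = decoded[:-1]
    # '*' matches any run of characters; everything else is literal,
    # including parentheses and '|'.
    body = ".*".join(re.escape(part) for part in decoded.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent: str, url_path: str) -> bool:
    """Return True if url_path may be fetched by user_agent under RULES."""
    name = user_agent.lower()
    known_agents = {agent for agent, _, _ in RULES}
    group = name if name in known_agents else "*"
    target = unquote(url_path)
    best = None  # (pattern length, is_allow); larger tuple wins
    for agent, rule, pattern in RULES:
        if agent != group:
            continue
        if pattern_to_regex(pattern).match(target):
            candidate = (len(pattern), rule == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is allowed by default.
    return True if best is None else best[1]
```

For example, under these assumptions `is_allowed("bingbot", "/w/he/index.php?title=Foo&action=edit")` is `False`, while the recent-changes feed path stays allowed because its Allow pattern is longer than `/w/*/index.php`. Because parentheses and `|` are literal in this matching model, the `(includes|extensions|skins)` rule does not match a path such as `/w/he/extensions/Foo.js` in this sketch.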