infotuba.pl
robots.txt

Robots Exclusion Standard data for infotuba.pl

Resource Scan

Scan Details

Site Domain infotuba.pl
Base Domain infotuba.pl
Scan Status Ok
Last Scan 2025-04-16T03:43:32+00:00
Next Scan 2025-04-23T03:43:32+00:00

Last Scan

Scanned 2025-04-16T03:43:32+00:00
URL https://infotuba.pl/robots.txt
Domain IPs 104.21.61.11, 172.67.204.174, 2606:4700:3033::6815:3d0b, 2606:4700:3035::ac43:ccae
Response IP 104.21.61.11
Found Yes
Hash 41cb14869a3e260d028736f5c85887db9f9a61ff604fc7548e0364a79daa02e5
SimHash e410515967f7

Groups

*

Rule Path
Disallow /wyszukaj/

mediapartners-google

Rule Path
Disallow

googlebot

Rule Path
Disallow

googlebot-image

Rule Path
Disallow

googlebot-mobile

Rule Path
Disallow

googlebot-news

Rule Path
Disallow

googlebot-video

Rule Path
Disallow

adsbot-google

Rule Path
Disallow

googlebot_nauxeo

Rule Path
Disallow

twitterbot

Rule Path
Disallow

applebot

Rule Path
Disallow

ouestfrancebot

Rule Path
Disallow

taboolabot

Rule Path
Disallow

proximic

Rule Path
Disallow

upday

Rule Path
Disallow

bingbot

Rule Path
Disallow

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

fast

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /
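
The groups listed so far fall into three patterns: the wildcard group blocks only /wyszukaj/, the named search and social crawlers carry an empty Disallow (which allows every path), and the site-copying tools carry Disallow / (which blocks every path). The sketch below illustrates how a client might evaluate those three patterns with Python's standard urllib.robotparser. The excerpt is a reconstruction from this report, not the original file, and the agent name "examplebot" and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a three-group excerpt rebuilt from the report above.
# The real robots.txt has many more groups and may be laid out differently.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /wyszukaj/

User-agent: googlebot
Disallow:

User-agent: wget
Disallow: /
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on infotuba.pl."""
    return parser.can_fetch(agent, "https://infotuba.pl" + path)


group_parser = build_parser(ROBOTS_EXCERPT)

# "examplebot" is a hypothetical crawler that matches only the wildcard group.
for agent, path in [
    ("examplebot", "/"),
    ("examplebot", "/wyszukaj/"),
    ("googlebot", "/wyszukaj/"),
    ("wget", "/"),
]:
    print(agent, path, is_allowed(group_parser, agent, path))
```

Under these assumptions, a crawler with its own named group is judged by that group alone, so googlebot may fetch /wyszukaj/ even though the wildcard group blocks it.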

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

*

Rule Path
Disallow

Other Records

Field Value
sitemap https://infotuba.pl/sitemap
sitemap https://infotuba.pl/sitemap/news
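
The crawl-delay and sitemap records above can also be read programmatically. The following is a minimal sketch using urllib.robotparser (crawl_delay needs Python 3.6 or later, site_maps needs 3.8 or later). As an assumption, the excerpt places the crawl-delay in a single wildcard group; the report shows several wildcard groups, and parsers differ in how they combine them. The agent name "examplebot" is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a simplified excerpt rebuilt from the "Other Records" rows.
RECORDS_EXCERPT = """\
User-agent: *
Crawl-delay: 2

Sitemap: https://infotuba.pl/sitemap
Sitemap: https://infotuba.pl/sitemap/news
"""

records_parser = RobotFileParser()
records_parser.parse(RECORDS_EXCERPT.splitlines())

# Seconds a wildcard-matched crawler is asked to wait between requests.
delay_seconds = records_parser.crawl_delay("examplebot")

# Sitemap URLs declared in the file, or None if there are none.
sitemap_urls = records_parser.site_maps()

print("crawl-delay:", delay_seconds)
print("sitemaps:", sitemap_urls)
```

A polite client would sleep for at least delay_seconds between successive requests to the site.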

Comments

  • disable at search level
  • for quizik.pl
  • Allowed search engines directives
  • Crawlers that are kind enough to obey, but which we'd rather not have unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.
  • Misbehaving: requests much too fast:
  • Sorry, wget in its recursive mode is a frequent problem. Please read the man page and use it properly; there is a --wait option you can use to set the delay between hits, for instance.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/