osterhaus-stammbaum.de
robots.txt

Robots Exclusion Standard data for osterhaus-stammbaum.de

Resource Scan

Scan Details

Site Domain osterhaus-stammbaum.de
Base Domain osterhaus-stammbaum.de
Scan Status Ok
Last Scan 2024-11-08T16:21:57+00:00
Next Scan 2024-12-08T16:21:57+00:00

Last Scan

Scanned 2024-11-08T16:21:57+00:00
URL http://osterhaus-stammbaum.de/robots.txt
Domain IPs 109.237.140.44
Response IP 109.237.140.44
Found Yes
Hash 637bf8abb68d108289c07b92d5babc7a55dfafba36bd5f807bb0b83248701590
SimHash b71b53590cf7
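
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so SHA-256 is an assumption here, and the function names are illustrative. A minimal sketch of how such a fingerprint could be used to tell whether the file changed between two scans:

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored hex digest against the digest of a new body."""
    return body_fingerprint(body) != previous_hash.lower()
```

Comparing digests only detects byte-level changes; a similarity hash such as the SimHash field would be needed to judge how different two versions are.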

Groups

aisearchbot

Rule Path
Disallow /

datacha0s

Rule Path
Disallow /

dblbot

Rule Path
Disallow /

doc

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

fetch

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

httrack

Rule Path
Disallow /

indy library

Rule Path
Disallow /

internet explorer

Rule Path
Disallow /

kaloogabot

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

linko

Rule Path
Disallow /

lwp::simple

Rule Path
Disallow /

lwp-trivial

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

npbot

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

orthogaffe

Rule Path
Disallow

panscient.com

Rule Path
Disallow

plonebot

Rule Path
Disallow

sitecheck.internetseer.com

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

super_ale

Rule Path
Disallow /

teleport

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wget

Rule Path
Disallow /

xenu

Rule Path
Disallow /

xxx

Rule Path
Disallow /

zao

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

zyborg

Rule Path
Disallow /
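
Each group above pairs one user agent with a blanket `Disallow /`. As a rough illustration of how a crawler might evaluate such groups, the sketch below feeds an abbreviated sample to Python's standard `urllib.robotparser`. The sample text is reconstructed from this report rather than copied from the live file, and `ExampleBot/1.0` is a hypothetical agent name.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated sample in robots.txt syntax; only two of the blocked
# groups and two rules of the "*" group are reproduced here.
SAMPLE = """\
User-agent: wget
Disallow: /

User-agent: httrack
Disallow: /

User-agent: *
Disallow: /admin.php
Allow: /index.php
"""

BASE = "http://osterhaus-stammbaum.de"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(SAMPLE)

# An agent with its own group is refused everywhere.
wget_allowed = parser.can_fetch("Wget/1.21", BASE + "/index.php")

# Any other agent falls through to the "*" group.
other_allowed = parser.can_fetch("ExampleBot/1.0", BASE + "/index.php")
other_admin = parser.can_fetch("ExampleBot/1.0", BASE + "/admin.php")
```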

*

Rule Path
Disallow /addmedia.php
Disallow /addremotelink.php
Disallow /addsearchlink.php
Disallow /admin.php
Disallow /ancestry.php
Disallow /autocomplete.php
Disallow /branches.php
Disallow /calendar.php
Disallow /client.php
Disallow /compact.php
Disallow /data/
Disallow /descendancy.php
Disallow /dir_editor.php
Disallow /downloadbackup.php
Disallow /downloadgedcom.php
Disallow /editconfig_gedcom.php
Disallow /editgedcoms.php
Disallow /editnews.php
Disallow /edituser.php
Disallow /edit_changes.php
Disallow /edit_interface.php
Disallow /edit_merge.php
Disallow /expand_view.php
Disallow /export_gedcom.php
Allow /family.php
Disallow /familybook.php
Allow /famlist.php
Disallow /fanchart.php
Disallow /favicon.ico
Disallow /find.php
Disallow /gedcheck.php
Disallow /gedrecord.php
Disallow /genservice.php
Disallow /help_text.php
Disallow /hourglass.php
Disallow /hourglass_ajax.php
Disallow /imageflush.php
Disallow /images/
Disallow /imageview.php
Disallow /import.php
Disallow /includes/
Allow /index.php
Disallow /index_edit.php
Disallow /indilist.php
Disallow /individual.php
Disallow /inverselink.php
Disallow /js/
Disallow /language/
Disallow /library/
Disallow /lifespan.php
Disallow /login.php
Disallow /login_register.php
Disallow /logs.php
Disallow /manageservers.php
Disallow /media/
Allow /media.php
Disallow /mediafirewall.php
Allow /medialist.php
Disallow /mediaviewer.php
Disallow /message.php
Allow /module.php
Disallow /modules/
Disallow /module_admin.php
Allow /note.php
Allow /notelist.php
Disallow /opensearch.php
Disallow /PEAR.php
Disallow /pedigree.php
Allow /placelist.php
Disallow /places/
Disallow /relationship.php
Allow /repo.php
Allow /repolist.php
Disallow /reportengine.php
Disallow /search.php
Disallow /search_advanced.php
Allow /search_engine.php
Disallow /serviceClientTest.php
Disallow /setup.php
Disallow /sidebar.php
Allow /site-unavailable.php
Disallow /siteconfig.php
Disallow /SOAP/
Allow /source.php
Allow /sourcelist.php
Disallow /statistics.php
Disallow /statisticsplot.php
Disallow /themechange.php
Disallow /themes/
Disallow /timeline.php
Disallow /treenav.php
Disallow /uploadmedia.php
Disallow /useradmin.php
Disallow /webservice/
Disallow /wtinfo.php
Disallow /bot-trap.php
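
Several of the rules above share a prefix (`/media/` and `/media.php`, `/module.php` and `/modules/`). RFC 9309 resolves such cases by taking the longest matching path, lets Allow win a tie, and treats a path with no matching rule as allowed. A minimal sketch of that evaluation over a handful of the rules; it handles plain prefixes only (no `*` or `$` wildcards), and `RULES` and `is_allowed` are illustrative names rather than anything defined by the site.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path prefix)

# A small subset of the "*" group, in the order listed in the report.
RULES: List[Rule] = [
    ("disallow", "/admin.php"),
    ("allow", "/index.php"),
    ("disallow", "/index_edit.php"),
    ("disallow", "/media/"),
    ("allow", "/media.php"),
    ("allow", "/module.php"),
    ("disallow", "/modules/"),
    ("disallow", "/bot-trap.php"),
]


def is_allowed(path: str, rules: List[Rule] = RULES) -> bool:
    """Longest-prefix match; Allow wins ties; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed
```

With these rules, `/media.php?mid=M1` is allowed while `/media/photo.jpg` is not, because the two paths match different prefixes.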

Other Records

Field Value
crawl-delay 60
sitemap http://www.osterhaus-ahnen.de/module.php?mod=sitemap&mod_action=generate&file=sitemap.xml
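
The crawl-delay and sitemap records can also be read programmatically. The sketch below uses `urllib.robotparser` from the Python standard library (`site_maps()` needs Python 3.8 or later). It assumes the `Crawl-delay` line belongs to the `*` group, which the report does not state, and `ExampleBot` is a hypothetical agent name.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed sample; the placement of Crawl-delay inside the "*"
# group is an assumption, not something the scan report confirms.
SAMPLE = """\
User-agent: *
Disallow: /bot-trap.php
Crawl-delay: 60

Sitemap: http://www.osterhaus-ahnen.de/module.php?mod=sitemap&mod_action=generate&file=sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Seconds a crawler is asked to wait between requests, or None if unset.
delay = parser.crawl_delay("ExampleBot")

# List of sitemap URLs declared in the file, or None if there are none.
sitemaps = parser.site_maps()
```

A crawler honoring these records would wait `delay` seconds between requests and could start discovery from the URLs in `sitemaps`.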

Comments

  • robots.txt file for webtrees
  • (c) Greg Roach, 2010
  • This file needs to be placed in the domain root directory, such as "www.example.com/robots.txt". It will not work in a subdirectory, such as "www.example.com/webtrees/robots.txt". If you need to move it, then remember to adjust the paths as well, e.g. "Allow: /index.php" becomes "Allow: /webtrees/index.php".
  • See http://www.botsvsbrowsers.com/category/1/ for a useful list of robots.
  • This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
  • This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
  • You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  • $Id: robots.txt 8924 2010-06-21 17:10:07Z greg $
  • Instructions for unfriendly robots ...
  • Internet explorer uses "MSIE"
  • Instructions for friendly robots.
  • Note that not all robots understand the "Allow:" directive. These are included simply to document the allowed scripts. These will be allowed by default.
  • This is a trap for bad robots.
  • Visits to this URL will have their IP address and UA string blacklisted.
  • Some of the list pages can be slow to generate.
  • Restrict requests to one a minute.
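
The comments above note that when webtrees is served from a subdirectory, the rule paths must be adjusted (for example "Allow: /index.php" becomes "Allow: /webtrees/index.php"). A minimal sketch of that rewrite follows; `prefix_rule_paths` is an illustrative helper, not part of webtrees. It also rewrites blanket `Disallow: /` lines, which may or may not be what a site owner wants.

```python
import re

# Matches "Allow: /path" or "Disallow: /path" at the start of a line.
# Rule lines without a path, comments, and other records are left alone.
RULE_RE = re.compile(r"^(\s*(?:allow|disallow)\s*:\s*)(/\S*)", re.IGNORECASE)


def prefix_rule_paths(robots_txt: str, prefix: str) -> str:
    """Prepend a subdirectory prefix to every Allow/Disallow path."""
    cleaned = prefix.strip("/")
    if not cleaned:
        return robots_txt  # nothing to prepend
    full_prefix = "/" + cleaned
    lines = [
        RULE_RE.sub(lambda m: m.group(1) + full_prefix + m.group(2), line)
        for line in robots_txt.splitlines()
    ]
    return "\n".join(lines)
```

The robots.txt file itself still has to stay in the domain root; only the paths inside it change.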