thieme.de
robots.txt

Robots Exclusion Standard data for thieme.de

Resource Scan

Scan Details

Site Domain thieme.de
Base Domain thieme.de
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-30T16:55:26+00:00
Next Scan 2024-10-07T16:55:26+00:00

Last Successful Scan

Scanned 2024-08-30T16:54:49+00:00
URL https://thieme.de/robots.txt
Redirect https://www.thieme.de/robots.txt
Redirect Domain www.thieme.de
Redirect Base thieme.de
Domain IPs 91.208.107.242
Redirect IPs 91.208.107.240
Response IP 91.208.107.240
Found Yes
Hash 75294d93caab57d0cb4295734974cf00920daeafada15ff01832aa6a5b3d641f
SimHash ec4d0700bbf1

Groups

*

Rule Path
Disallow /META-INF/
Disallow /WEB-INF/
Disallow /cps/
Disallow /de/addToCart
Disallow /classic/
Disallow /de/detailseiten/
Disallow /specials/
Disallow /fm/
Disallow /de/karriere/Job
Disallow /viamedici/foren/
Disallow /wp-login.php
Disallow /medizinjobs/
Disallow /de/ebooklibrary/
Disallow /medias/sys_master/
Disallow /webslides/
Disallow /cmscockpit/
Disallow /de/SID-
Disallow /en/SID-
Disallow /sonderseiten/
Disallow /de/test/
Disallow /en/test/

searchmetricsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

bleriot
qwantify

Rule Path
Disallow /

jobkicks

Rule Path
Disallow /

trendkite-akashic-crawler

Rule Path
Disallow /
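
The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. It rebuilds a robots.txt from the groups in this scan, so the text is a reconstruction and not the verbatim file behind the recorded hash. The names `GROUPS`, `build_robots_txt`, `make_parser`, and the `ExampleBot` user agent are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the scan above: (user agents, disallowed paths).
# This is a reconstruction for illustration, not the verbatim file.
GROUPS = [
    (["*"], [
        "/META-INF/", "/WEB-INF/", "/cps/", "/de/addToCart", "/classic/",
        "/de/detailseiten/", "/specials/", "/fm/", "/de/karriere/Job",
        "/viamedici/foren/", "/wp-login.php", "/medizinjobs/",
        "/de/ebooklibrary/", "/medias/sys_master/", "/webslides/",
        "/cmscockpit/", "/de/SID-", "/en/SID-", "/sonderseiten/",
        "/de/test/", "/en/test/",
    ]),
    (["searchmetricsbot"], ["/"]),
    (["mj12bot"], ["/"]),
    (["bleriot", "qwantify"], ["/"]),
    (["jobkicks"], ["/"]),
    (["trendkite-akashic-crawler"], ["/"]),
]


def build_robots_txt(groups):
    """Render the groups as robots.txt text, one blank line between groups."""
    lines = []
    for agents, paths in groups:
        lines += [f"User-agent: {agent}" for agent in agents]
        lines += [f"Disallow: {path}" for path in paths]
        lines.append("")
    return "\n".join(lines)


def make_parser(groups=GROUPS):
    """Return a RobotFileParser loaded with the reconstructed rules."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt(groups).splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    checks = [
        ("ExampleBot", "https://www.thieme.de/de/"),           # hypothetical agent
        ("ExampleBot", "https://www.thieme.de/de/test/page"),  # under /de/test/
        ("MJ12bot", "https://www.thieme.de/"),                 # fully blocked group
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
```

Note that `urllib.robotparser` applies plain prefix matching to paths and matches user agents case-insensitively by substring, so a real crawler's interpretation of the same rules may differ in edge cases.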

Comments

  • robots.txt
  • These directories should not be crawled