ep.de
robots.txt

Robots Exclusion Standard data for ep.de

Resource Scan

Scan Details

Site Domain ep.de
Base Domain ep.de
Scan Status Ok
Last Scan 2024-09-20T23:43:36+00:00
Next Scan 2024-10-20T23:43:36+00:00

Last Scan

Scanned 2024-09-20T23:43:36+00:00
URL https://ep.de/robots.txt
Redirect https://www.ep.de/robots.txt
Redirect Domain www.ep.de
Redirect Base ep.de
Domain IPs 194.55.3.222
Redirect IPs 20.103.224.1
Response IP 20.103.224.1
Found Yes
Hash 8930e42b781ad9ed10a7098c4cbd7f18bfd9424975ad13a087cd0222b96ce130
SimHash 6c44571ecd68
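
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the scan does not name the algorithm. A minimal sketch of such a content fingerprint, assuming SHA-256 over the raw response bytes (the `content_fingerprint` helper and the sample body are hypothetical, not taken from ep.de):

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256. The scan report does not state which
    algorithm produced its Hash field.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real ep.de file.
sample = b"User-agent: *\nDisallow: /cart\n"
digest = content_fingerprint(sample)

# Any change to the file, however small, yields a different digest.
changed = content_fingerprint(sample + b"Disallow: /checkout\n") != digest
```

Comparing the digest from one scan with the next is a cheap way to tell whether the file changed at all between scans.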

Groups

*

Rule Path
Disallow /cart
Disallow /*/cart
Disallow /store-pickup
Disallow /*/store-pickup
Disallow /preselect
Disallow /*/preselect
Disallow /checkout
Disallow /my-account
Disallow /view/ProductCarouselComponentController/
Disallow /*/view/ProductCarouselComponentController/
Disallow /expressreservation/
Disallow /*/expressreservation/
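
Several of the paths above use `*` as a wildcard, so `/*/cart` also covers paths such as `/de/cart`. The sketch below shows one way to evaluate such patterns, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a literal prefix. The helper names and the test paths are illustrative only. It ignores Allow rules and rule precedence, which is enough here because this group contains only Disallow rules.

```python
import re
from typing import Iterable


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: Iterable[str]) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# A few of the patterns from the '*' group above.
DISALLOW = ["/cart", "/*/cart", "/checkout", "/*/expressreservation/"]

print(is_disallowed("/cart", DISALLOW))         # True
print(is_disallowed("/de/cart", DISALLOW))      # True
print(is_disallowed("/products/tv", DISALLOW))  # False
```

Python's standard `urllib.robotparser` compares paths by plain prefix and does not expand `*`, which is why the sketch uses its own matcher.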

Other Records

Field Value Comment
crawl-delay 10 10 seconds between page requests
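
A crawler that honors this record has to space its requests at least 10 seconds apart. One possible way to do that is a small throttle object called before every request. This is a sketch, not how any particular crawler implements it; the `Throttle` class and its injectable `clock` and `sleep` parameters are hypothetical.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep until the delay has elapsed; return the seconds slept."""
        now = self._clock()
        pause = 0.0
        if self._last is not None:
            pause = max(0.0, self.delay - (now - self._last))
        if pause:
            self._sleep(pause)
        self._last = now + pause
        return pause


# crawl-delay 10 from the '*' group above
throttle = Throttle(10)
# Call throttle.wait() immediately before each page request.
```

Passing the clock and sleep functions in makes the behavior easy to check without actually waiting.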

cazoodlebot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou blog

Rule Path
Disallow /

sogou inst spider

Rule Path
Disallow /

sogou news spider

Rule Path
Disallow /

sogou orion spider

Rule Path
Disallow /

sogou spider2

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

baiduspider
baiduspider-video
baiduspider-image

Rule Path
Disallow /
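
The group above lists three user agents over a single rule set, which is what a parser produces when consecutive User-agent lines precede the same rules. A simplified sketch of that grouping follows. The `parse_groups` function is hypothetical, the sample text is reconstructed from the scan rather than copied from the real file, and only Allow and Disallow lines are collected.

```python
def parse_groups(text: str) -> list:
    """Split robots.txt text into (user_agents, rules) groups.

    Consecutive User-agent lines accumulate into one group; the first
    User-agent line after a rule starts a new group. Simplification:
    only Allow/Disallow lines count as rules here.
    """
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a rule was seen, so the previous group is closed
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


# Reconstructed from the scan; the real file's layout may differ.
SAMPLE = """\
User-agent: baiduspider
User-agent: baiduspider-video
User-agent: baiduspider-image
Disallow: /

User-agent: bytespider
Disallow: /
"""
groups = parse_groups(SAMPLE)
```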

bytespider

Rule Path
Disallow /

yandex

Rule Path
Disallow /

bubing

Rule Path
Disallow /

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5
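
Because slurp has its own group, it is governed by that group alone and does not inherit the Disallow rules of the `*` group. The following sketch illustrates this with Python's standard `urllib.robotparser`. The robots.txt lines are reconstructed from the scan and abbreviated, and `ExampleBot` is a hypothetical crawler name standing in for any agent without a group of its own.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan; the layout of the real file may differ.
LINES = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Crawl-delay: 10

User-agent: Slurp
Crawl-delay: 5
""".splitlines()

rp = RobotFileParser()
rp.parse(LINES)

# Slurp matches its own group, which has no path rules.
slurp_cart = rp.can_fetch("Slurp", "https://www.ep.de/cart")
slurp_delay = rp.crawl_delay("Slurp")

# ExampleBot (hypothetical) falls back to the '*' group.
other_cart = rp.can_fetch("ExampleBot", "https://www.ep.de/cart")
other_delay = rp.crawl_delay("ExampleBot")

print(slurp_cart, slurp_delay)  # True 5
print(other_cart, other_delay)  # False 10
```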

seznambot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

adscanner

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.ep.de/sitemap.xml
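
The sitemap record is listed here after the gptbot group, but a sitemap line is not tied to any user-agent group and applies to the whole file. As a sketch, Python's standard `urllib.robotparser` exposes such records through `site_maps()` (Python 3.8 or later). The input lines are reconstructed from the scan, not copied from the real file.

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: gptbot",
    "Disallow: /",
    "",
    "Sitemap: https://www.ep.de/sitemap.xml",
])

# Returns a list of sitemap URLs, or None when the file declares none.
sitemaps = rp.site_maps()
```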

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • dotbot without version
  • Block Gigabot
  • SoGou (CN) Info: http://www.sogou.com/docs/help/webmasters.htm#07
  • https://ahrefs.com/robot
  • http://law.di.unimi.it/BUbiNG.html
  • slow down Yahoo

Warnings

  • `request-rate` is not a known field.
  • `visit-time` is not a known field.
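
Warnings of this kind can be produced by comparing each field name in the file against a list of recognized fields. The sketch below assumes a recognized set of five fields; the scanner's real list is not given in the report. The `unknown_fields` helper is hypothetical, and the `Request-rate` and `Visit-time` values in the sample are invented, since the scan reports only the field names.

```python
# Assumed set of recognized fields; the scanner's actual list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def unknown_fields(text: str) -> list:
    """Return lowercased field names not in KNOWN_FIELDS, in first-seen order."""
    seen = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical values; the scan reports the field names only.
SAMPLE = """\
User-agent: slurp
Crawl-delay: 5
Request-rate: 1/5
Visit-time: 0600-0845
"""
flagged = unknown_fields(SAMPLE)
print(flagged)  # ['request-rate', 'visit-time']
```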