porta.de
robots.txt

Robots Exclusion Standard data for porta.de

Resource Scan

Scan Details

Site Domain porta.de
Base Domain porta.de
Scan Status Ok
Last Scan 2025-12-17T05:10:29+00:00
Next Scan 2025-12-31T05:10:29+00:00
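
The last and next scan timestamps are 14 days apart, which suggests a fortnightly rescan. A minimal sketch of that arithmetic, assuming a fixed 14-day interval (the interval is inferred from the two values and is not stated by the scanner; `RESCAN_INTERVAL` and `next_scan` are hypothetical names):

```python
from datetime import datetime, timedelta

# Assumption: the rescan interval is a fixed 14 days, inferred from the report.
RESCAN_INTERVAL = timedelta(days=14)


def next_scan(last_scan_iso: str) -> str:
    """Return the next scan time as an ISO 8601 string."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()


print(next_scan("2025-12-17T05:10:29+00:00"))
```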

Last Scan

Scanned 2025-12-17T05:10:29+00:00
URL https://porta.de/robots.txt
Domain IPs 34.111.37.79
Response IP 34.111.37.79
Found Yes
Hash 7aa35e3d5b93a84afec5e54143089c13f37e5394a6865c10f1f8d1419d08e1fe
SimHash 67d64b50d2c4
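
The Hash value is 64 hexadecimal characters long, which is consistent with SHA-256, although the report does not name the algorithm. A sketch, under that assumption, of how a fetched robots.txt body could be fingerprinted to detect changes between scans (`content_hash` is a hypothetical helper; any normalization the scanner applies before hashing is unknown, so this is not expected to reproduce the value above):

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex.

    Assumption: the scanner hashes the body with SHA-256. The report only
    shows a 64-character hex string and does not confirm the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


# Two scans can be compared by digest instead of by full content.
previous = content_hash(b"User-agent: *\nDisallow: /cart\n")
current = content_hash(b"User-agent: *\nDisallow: /cart\nDisallow: /search\n")
print("changed" if previous != current else "unchanged")
```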

Groups

*

Rule Path
Disallow /cart
Disallow /checkout
Disallow /mein-porta
Disallow /order-confirmation
Disallow /search

adsbot-google

Rule Path
Allow /search

adsbot-google-mobile

Rule Path
Allow /search

ai2bot
ai2bot-dolma
aihitbot
amazonbot
anthropic-ai
anthropicbot
applebot-extended
brightbot 1.0
bytedancespider
bytespider
ccbot
claude-web
claudebot
cohere-training-data-crawler
cotoyogi
crawlspace
diffbot
discordbot
facebookbot
factset_spyderbot
firecrawlagent
friendlycrawler
fullstorybot
google-cloudvertexbot
googleother
googleother-image
googleother-video
icc-crawler
imagesiftbot
img2dataset
imgproxy
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
novaact
omgili
omgilibot
pangubot
petalbot
qualifiedbot
scrapy
semrushbot-eo
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
tiktokspider
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /
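
The groups above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt from the listed rules and queries it. The file text is a reconstruction, not the original: only three of the blocked agents are included for brevity, and the helper `allowed` is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report. Assumption: the exact
# layout of the real file is unknown, and only a subset of the blocked
# agents is listed here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /mein-porta
Disallow: /order-confirmation
Disallow: /search

User-agent: adsbot-google
Allow: /search

User-agent: ccbot
User-agent: claudebot
User-agent: scrapy
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on porta.de under these rules."""
    return parser.can_fetch(agent, "https://porta.de" + path)


for agent, path in [
    ("Googlebot", "/search"),      # falls under the * group
    ("Googlebot", "/produkte"),    # hypothetical path with no matching rule
    ("AdsBot-Google", "/search"),  # has its own group
    ("ClaudeBot", "/"),            # in the fully blocked group
]:
    print(agent, path, allowed(agent, path))
```

Note that `urllib.robotparser` matches agent names by substring and applies the first matching rule in a group, which is simpler than the longest-match behavior described in RFC 9309. For a ruleset as plain as this one the two approaches should agree, but that is worth verifying before relying on it for more complex files.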

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap: https://porta.de/sitemap.xml
  • Ask bots, not related to search, to kindly go away.