web.de
robots.txt

Robots Exclusion Standard data for web.de

Resource Scan

Scan Details

Site Domain	web.de
Base Domain	web.de
Scan Status	Ok
Last Scan	2025-12-02T00:17:47+00:00
Next Scan	2025-12-09T00:17:47+00:00

Last Scan

Scanned	2025-12-02T00:17:47+00:00
URL	https://web.de/robots.txt
Domain IPs	82.165.229.138, 82.165.229.83
Response IP	82.165.229.138
Found	Yes
Hash	eb6a0824d451b64dbb4049184c98a166b92c099be75d92dcf252c9ed343ad041
SimHash	f11a8b22c136

Groups

*

Rule	Path
Disallow	/deals/
Disallow	/test/

googlebot-news

Rule	Path
Disallow	/
Disallow	/magazine/*/thema/
Allow	/magazine/
Allow	/amp/
Allow	/$
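
The googlebot-news rules above mix a blanket Disallow with narrower Allow entries and the `*` and `$` wildcards, so the outcome for a given path depends on rule precedence. The following is a minimal sketch, assuming the precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical helpers for illustration and are not part of the scan data; how a particular crawler actually evaluates the file may differ.

```python
import re

# Rules copied from the googlebot-news group above, as (kind, path pattern).
RULES = [
    ("disallow", "/"),
    ("disallow", "/magazine/*/thema/"),
    ("allow", "/magazine/"),
    ("allow", "/amp/"),
    ("allow", "/$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Assumed RFC 9309 precedence: longest matching pattern wins, Allow wins ties.

    A path that matches no rule is treated as allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as prefix matching requires.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


for sample in ["/", "/amp/news", "/magazine/politik/artikel",
               "/magazine/politik/thema/wahl", "/email/"]:
    print(sample, is_allowed(sample))
```

Under these assumptions, the home page, `/amp/`, and ordinary `/magazine/` pages are crawlable for this group, while topic pages under `/magazine/*/thema/` and everything else are not. The sample paths other than `/` are made up for the demonstration.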

ai2bot
amazonbot
applebot-extended
ccbot
cincraw
claudebot
cohere-ai
diffbot
friendlycrawler
gptbot
imagesiftbot
img2dataset
meta-externalagent
petalbot
semanticbot
timpibot
velenpublicwebcrawler
yandex

Rule	Path
Disallow	/magazine/
Allow	/magazine/in-eigener-sache/
Allow	/magazine/unicef/
Allow	/magazine/so-arbeitet-die-redaktion/
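
The scan shows this group only in tabular form. As a rough illustration of how such a group is usually written in a robots.txt file (several `User-agent` lines sharing one rule set), the sketch below renders the table back into that syntax. The names `AGENTS`, `RULES`, and `render_group` are hypothetical, the duplicated `ai2bot` entry is listed once, and the casing and line order of the actual file at https://web.de/robots.txt are not known from the scan.

```python
# User agents from the group above (lowercased as shown by the scan).
AGENTS = [
    "ai2bot", "amazonbot", "applebot-extended", "ccbot", "cincraw",
    "claudebot", "cohere-ai", "diffbot", "friendlycrawler", "gptbot",
    "imagesiftbot", "img2dataset", "meta-externalagent", "petalbot",
    "semanticbot", "timpibot", "velenpublicwebcrawler", "yandex",
]

# Rules from the table above, in the order the scan lists them.
RULES = [
    ("Disallow", "/magazine/"),
    ("Allow", "/magazine/in-eigener-sache/"),
    ("Allow", "/magazine/unicef/"),
    ("Allow", "/magazine/so-arbeitet-die-redaktion/"),
]


def render_group(agents, rules):
    """Render one robots.txt group: all User-agent lines first, then the shared rules."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines += [f"{kind}: {path}" for kind, path in rules]
    return "\n".join(lines)


GROUP_TEXT = render_group(AGENTS, RULES)
print(GROUP_TEXT)
```

Read this way, the listed crawlers are kept out of `/magazine/` except for the three explicitly allowed sections.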
