
Robots Exclusion Standard data for whic.de

Resource Scan

Scan Details

Site Domain whic.de
Base Domain whic.de
Scan Status Ok
Last Scan 2025-10-24T04:36:54+00:00
Next Scan 2025-11-23T04:36:54+00:00

Last Scan

Scanned 2025-10-24T04:36:54+00:00
URL https://whic.de/robots.txt
Domain IPs 104.26.6.212, 104.26.7.212, 172.67.73.157, 2606:4700:20::681a:6d4, 2606:4700:20::681a:7d4, 2606:4700:20::ac43:499d
Response IP 104.26.6.212
Found Yes
Hash 0be0a1db6077fd7cab78f5b81e93b3f6f600eff8d66ea59f5a91f63d8e863dd9
SimHash 0814d4d23dc3

Groups

*

Rule Path
Disallow *?*order=*
Disallow *?*q=*
Disallow *?*manufacturer=*
Disallow *?*properties=*
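
The four wildcard rules above target URLs whose query string carries an order, q, manufacturer, or properties parameter. A minimal sketch of how such patterns could be evaluated follows, assuming the common wildcard semantics in which `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `robots_pattern_to_regex` and `is_disallowed` are hypothetical and not part of the scan data, and individual crawlers may treat patterns that do not begin with `/` differently.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW = ["*?*order=*", "*?*q=*", "*?*manufacturer=*", "*?*properties=*"]


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Assumed semantics: '*' matches any run of characters, and a trailing
    '$' anchors the match at the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query):
    """Return True if any pattern in DISALLOW matches the path plus query."""
    return any(
        robots_pattern_to_regex(p).match(path_and_query) is not None
        for p in DISALLOW
    )


print(is_disallowed("/search?q=shoes"))           # query contains q=
print(is_disallowed("/list?page=2&order=price"))  # query contains order=
print(is_disallowed("/category/shoes"))           # no query string
```

Under these assumed semantics the patterns are plain substring checks after the `?`, so a URL such as `/page?faq=1` would also match `*?*q=*` because `faq=` contains `q=`.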

ia_archiver

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /
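
Taken together, the groups above mean that ia_archiver, chatgpt-user, and semrushbot are blocked from the whole site, while every other crawler falls back to the `*` group. The sketch below illustrates that selection step, assuming case-insensitive matching on the crawler's product token with a fallback to `*` when no named group matches. The `GROUPS` table and the `select_rules` helper are hypothetical names introduced for illustration only.

```python
# Groups and Disallow paths as reported in the scan above.
GROUPS = {
    "*": ["*?*order=*", "*?*q=*", "*?*manufacturer=*", "*?*properties=*"],
    "ia_archiver": ["/"],
    "chatgpt-user": ["/"],
    "semrushbot": ["/"],
}


def select_rules(product_token):
    """Pick the Disallow list for a crawler.

    Assumption: the product token is compared case-insensitively, and the
    "*" group applies only when no named group matches.
    """
    return GROUPS.get(product_token.lower(), GROUPS["*"])


print(select_rules("SemrushBot"))  # named group: whole site disallowed
print(select_rules("Googlebot"))   # no named group: falls back to "*"
```

A crawler that matches a named group follows only that group's rules in this sketch, so the query-string patterns of the `*` group are not applied to it in addition.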

Other Records

Field Value
sitemap https://whic.de/sitemap.xml

Comments

  • Crawlers Setup
  • Archive.org
  • Open.ai
  • Semrush