all-in.de
robots.txt

Robots Exclusion Standard data for all-in.de

Resource Scan

Scan Details

Site Domain all-in.de
Base Domain all-in.de
Scan Status Ok
Last Scan 2024-06-02T11:23:05+00:00
Next Scan 2024-06-09T11:23:05+00:00

Last Scan

Scanned 2024-06-02T11:23:05+00:00
URL https://all-in.de/robots.txt
Domain IPs 213.182.15.189
Response IP 213.182.15.189
Found Yes
Hash 600f538687e7c8545218eb7dbe9a7760d23d3a3186800bdb620a82e1a7f31b37
SimHash 28306550cdbd

Groups

*

Rule Path
Disallow /cms_addon
Disallow /cms_docs
Disallow /redFACT
Disallow /REST/frontend/itemstatistics

*

Rule Path
Disallow /index.php?pageid=1008
Disallow /index.php?pageid=1012
Disallow /index.php?pageid=1036
Disallow *costart%3D*
Disallow *seite%3D*
Disallow /extern/*
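The wildcard rules in this group (`*costart%3D*`, `*seite%3D*`, `/extern/*`) rely on `*` matching any run of characters. A minimal sketch of how a crawler might evaluate such patterns, assuming the common convention that `*` is a wildcard and a trailing `$` anchors the end of the path. The helper names `rule_to_regex` and `rule_matches` are hypothetical, the sample paths are made up for illustration, and the comparison is done on the literal, already percent-encoded path:

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return rule_to_regex(pattern).match(path) is not None


# Hypothetical request paths checked against the rules listed above.
print(rule_matches("*costart%3D*", "/index.php?costart%3D20"))
print(rule_matches("/extern/*", "/extern/partner/page.html"))
print(rule_matches("*seite%3D*", "/news/article.html"))
```

Rules without a wildcard, such as `/index.php?pageid=1008`, behave as plain prefix matches under this sketch.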

gptbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
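The effect of these groups can be checked with Python's standard `urllib.robotparser`. The sketch below is an illustration only: the robots.txt text is reconstructed from the groups listed above rather than copied from the live file, the wildcard group is left out because this parser matches rule paths as plain prefixes, and the helper `is_allowed` and the agent name `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan listing above; NOT the verbatim robots.txt.
ROBOTS_LINES = """\
User-agent: *
Disallow: /cms_addon
Disallow: /cms_docs
Disallow: /redFACT
Disallow: /REST/frontend/itemstatistics

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
""".splitlines()


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on all-in.de."""
    return parser.can_fetch(agent, "https://all-in.de" + path)


parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# The named AI crawlers are blocked everywhere; other agents only
# lose access to the paths listed in the '*' group.
for agent, path in [
    ("GPTBot", "/"),
    ("CCBot", "/news"),
    ("ExampleBot", "/cms_docs/manual.html"),
    ("ExampleBot", "/"),
]:
    print(agent, path, is_allowed(parser, agent, path))
```

A crawler pointed at the live site would instead call `set_url("https://all-in.de/robots.txt")` followed by `read()`. Note that the comments in the file state that automated access requires permission regardless of these rules.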

Other Records

Field Value
sitemap https://all-in.de/sitemap-index_4-Google_Sitemap.xml
sitemap https://all-in.de/sitemap-index_5-Google_News_Sitemap.xml

Comments

  • global live settings :
  • customised settings :
  • Legal notice: all-in.de expressly reserves the right to use its content for commercial text and data mining (§44b UrhG).
  • The use of robots or other automated means to access all-in.de or collect or mine data without the express permission of all-in.de is strictly prohibited.
  • If you would like to apply for permission to crawl all-in.de, collect or use data, please contact digitalteam@azv.de