haufe.de
robots.txt

Robots Exclusion Standard data for haufe.de

Resource Scan

Scan Details

Site Domain haufe.de
Base Domain haufe.de
Scan Status Ok
Last Scan 2024-05-04T00:10:54+00:00
Next Scan 2024-05-11T00:10:54+00:00

Last Scan

Scanned 2024-05-04T00:10:54+00:00
URL https://haufe.de/robots.txt
Redirect https://www.haufe.de/robots.txt
Redirect Domain www.haufe.de
Redirect Base haufe.de
Domain IPs 52.51.77.218, 54.220.239.54
Redirect IPs 52.51.77.218, 54.220.239.54
Response IP 54.220.239.54
Found Yes
Hash 583d2f78fbc14075ec28b27bff48d77d82664e9bd6aa9393dba6e58d8c4560f1
SimHash eb701856dc05
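
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Assuming SHA-256, a change check between two scans could be sketched as follows; the function names `content_hash` and `has_changed` are hypothetical and not part of the scanner.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256. The report only shows a 64-hex-digit value
    and does not state which algorithm produced it.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Report whether the file differs from the previously stored digest."""
    return content_hash(body) != previous_hash.lower()


# Example with placeholder content (not the real haufe.de file).
old_body = b"User-agent: *\nDisallow: /suche/*\n"
new_body = b"User-agent: *\nDisallow: /suche/*\nDisallow: /forms/*\n"
stored = content_hash(old_body)
print(has_changed(old_body, stored))
print(has_changed(new_body, stored))
```

An exact hash only signals that something changed. The separate SimHash field presumably serves to judge how similar two versions are, but its construction is not described in the report.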

Groups

slurp

Rule Path
Disallow /image/

firefly

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

vebidoo

Rule Path
Disallow /

unido-bot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

*

Rule Path
Allow /lazyloading/*
Disallow /haufe-personal-office-gold/*
Disallow /recht/deutsches-anwalt-office-start/*
Disallow /*/upload/*
Disallow /forms/*
Allow /forms/kontakt.html$
Disallow /*/Auftritte/*
Disallow /Auftritte/*
Disallow /*/DataCenter/*
Disallow /DataCenter/*
Disallow /Downloads/*
Disallow /*/PaidContent/*
Disallow *searchResult*
Disallow /steuer1*
Disallow /suche/*
Disallow *?print=*
Disallow *%26ghost%3Dtrue
Disallow /SEOShopData/media/pro/*
Allow *profirma-professional*
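
The `*` group mixes Allow and Disallow rules with `*` wildcards and a `$` end anchor, so the outcome for a given path depends on which rule wins. Under the longest-match convention described in RFC 9309 (the matching rule with the longest pattern wins, and Allow wins a tie), the evaluation might look like the sketch below. It uses only an excerpt of the rules above, the helper names are hypothetical, and individual crawlers may deviate from this logic.

```python
import re

# Excerpt of the "*" group above, as (kind, pattern) pairs.
STAR_RULES = [
    ("allow", "/lazyloading/*"),
    ("disallow", "/forms/*"),
    ("allow", "/forms/kontakt.html$"),
    ("disallow", "/Downloads/*"),
    ("disallow", "/steuer1*"),
    ("disallow", "/suche/*"),
    ("disallow", "*?print=*"),
    ("allow", "*profirma-professional*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Apply longest-match precedence; paths with no matching rule are allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


print(is_allowed("/forms/kontakt.html", STAR_RULES))      # True
print(is_allowed("/forms/kontakt.html?x=1", STAR_RULES))  # False
print(is_allowed("/suche/steuern", STAR_RULES))           # False
```

The contact form stays crawlable because `/forms/kontakt.html$` is longer than `/forms/*`, while the same URL with a query string no longer satisfies the `$` anchor and falls back to the Disallow rule.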

googlebot

Rule Path
Allow /lazyloading/*
Disallow /haufe-personal-office-gold/*
Disallow /recht/deutsches-anwalt-office-start/*
Disallow /*/upload/*
Disallow /forms/*
Allow /forms/kontakt.html$
Disallow /Auftritte/
Disallow /*/Auftritte/
Disallow /*/*/Auftritte/
Disallow /*/DataCenter/
Disallow /*/*/DataCenter/
Disallow /DataCenter/
Disallow /DataCenter/*
Disallow /Downloads/*
Disallow /*/PaidContent/*
Disallow *searchResult*
Disallow /steuer1*
Disallow /suche/*
Disallow *?print=*
Disallow *%26ghost%3Dtrue
Disallow /SEOShopData/media/pro/*
Allow *profirma-professional*
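
Because googlebot has its own group, a crawler following RFC 9309 would apply that group alone and ignore the `*` group, which is presumably why the shared rules are repeated in both. A minimal sketch of group selection follows, assuming a case-insensitive exact match on the product token; real crawlers may match more loosely, and `select_group` is a hypothetical helper.

```python
# Group names as listed in this report.
GROUPS = {
    "slurp", "firefly", "mj12bot", "vebidoo", "unido-bot", "turnitinbot",
    "chatgpt-user", "gptbot", "google-extended", "googlebot", "*",
}


def select_group(product_token, groups=GROUPS):
    """Pick the group for a crawler: its own if present, else the '*' fallback.

    Assumption: exact, case-insensitive comparison of the product token.
    """
    token = product_token.lower()
    return token if token in groups else "*"


print(select_group("Googlebot"))  # googlebot
print(select_group("GPTBot"))     # gptbot
print(select_group("bingbot"))    # *
```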

Other Records

Field Value
sitemap https://www.haufe.de/siteindex.xml
sitemap https://www.haufe.de/www.haufe.de.xml
sitemap https://www.haufe.de/haufe.de_HID_Norms.xml
sitemap https://www.haufe.de/hr/sitemap.xml
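
The sitemap records can also be read programmatically. The sketch below feeds a reconstructed excerpt to Python's standard `urllib.robotparser` (`site_maps()` requires Python 3.8 or later). The `Sitemap:` line syntax is an assumption about the raw file, since this report shows only the parsed field and value.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt; the raw file's exact formatting is assumed.
ROBOTS_EXCERPT = """\
Sitemap: https://www.haufe.de/siteindex.xml
Sitemap: https://www.haufe.de/www.haufe.de.xml
Sitemap: https://www.haufe.de/haufe.de_HID_Norms.xml
Sitemap: https://www.haufe.de/hr/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# site_maps() returns a list of URLs, or None if no sitemap lines were found.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```

Note that `urllib.robotparser` does not interpret `*` and `$` in paths, so it is suitable here for listing sitemaps but not for evaluating the wildcard rules in the groups above.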

Comments

  • cf. https://platform.openai.com/docs/plugins/bot
  • cf. https://platform.openai.com/docs/gptbot
  • Note: For the time being, OpenAI treats GPTBot as an alias of ChatGPT-User. Therefore, we should ensure the two entries are configured the same. We explicitly configure both agents because OpenAI might start treating them separately.
  • Crawler for Google's Bard and Vertex AI
  • cf. https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers?hl=en#google-extended
  • sitemap:
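
The note in the comments asks that the chatgpt-user and gptbot entries stay configured the same. A small consistency check along those lines could look like this; the rule data is taken from the Groups section of this report, and `same_rules` is a hypothetical helper.

```python
# Rules for the two OpenAI agents, as listed under Groups.
GROUP_RULES = {
    "chatgpt-user": [("disallow", "/")],
    "gptbot": [("disallow", "/")],
}


def same_rules(first, second, groups=GROUP_RULES):
    """Return True if both groups carry the same rules, ignoring order."""
    return sorted(groups[first]) == sorted(groups[second])


print(same_rules("chatgpt-user", "gptbot"))  # True
```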