hoermann-info.de
robots.txt

Robots Exclusion Standard data for hoermann-info.de

Resource Scan

Scan Details

Site Domain hoermann-info.de
Base Domain hoermann-info.de
Scan Status Ok
Last Scan 2026-01-10T20:58:59+00:00
Next Scan 2026-02-09T20:58:59+00:00

Last Scan

Scanned 2026-01-10T20:58:59+00:00
URL https://hoermann-info.de/robots.txt
Redirect https://www.hoermann-info.de/robots.txt
Redirect Domain www.hoermann-info.de
Redirect Base hoermann-info.de
Domain IPs 2a00:1169:103:4560::, 92.205.54.226
Redirect IPs 2a00:1169:103:4560::, 92.205.54.226
Response IP 92.205.54.226
Found Yes
Hash 100f48bf2047fba220dc0b785fe18baec75d647412cde65b988131ef0aaa818f
SimHash 6a1f1d5849e1

Groups

*

Rule Path
Disallow /administrator/
Disallow /api/
Disallow /bin/
Disallow /cache/
Disallow /cli/
Disallow /components/
Disallow /includes/
Disallow /images/
Disallow /installation/
Disallow /language/
Disallow /layouts/
Disallow /libraries/
Disallow /logs/
Disallow /modules/
Disallow /plugins/
Disallow /tmp/
Disallow /*?start=
Disallow /*%26start%3D
Disallow /*?view=
Disallow /*%26view%3D
Disallow /*%26article%3D
Disallow /*?article=
Disallow /*%26catid%3D
Disallow /*?catid=
Disallow /*?id=
Disallow /*%26id%3D
Disallow /*?limit=
Disallow /*%26limit%3D
Disallow /*?filter=
Disallow /*%26filter%3D
Disallow /*karriere-dinamyscheinhalt-sef
Disallow /*?sid=
Disallow /*%26sid%3D
Disallow /*?rCH=
Disallow /cwhire/
Disallow /karriere/bewerbungsformular-stellenangebot?jobtitel=*
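Several of the rules above rely on the `*` wildcard, which matches any run of characters. The following is a minimal sketch of how such patterns could be tested against a URL path and query string. It assumes plain literal comparison with no percent-decoding or other normalization, and it only handles Disallow rules, which is all this group contains. The names `rule_to_regex` and `is_disallowed` are hypothetical helpers, not part of any library.

```python
import re

# A few Disallow patterns copied from the "*" group above.
RULES = [
    "/administrator/",
    "/*?start=",
    "/*%26start%3D",
    "/*?sid=",
    "/cwhire/",
    "/karriere/bewerbungsformular-stellenangebot?jobtitel=*",
]


def rule_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is treated as a literal.
    """
    end_anchored = pattern.endswith("$")
    if end_anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if end_anchored else ""))


def is_disallowed(path_and_query, rules=RULES):
    """Return True if any Disallow pattern matches from the start of the path."""
    # re.match only matches at the beginning of the string, which mirrors
    # the prefix semantics of robots.txt rules.
    return any(rule_to_regex(rule).match(path_and_query) for rule in rules)
```

With these rules, `is_disallowed("/news?start=10")` is true because of `/*?start=`, while `is_disallowed("/news")` is false because no pattern matches the bare path.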

Other Records

Field Value
sitemap https://www.hoermann-info.de/sitemap_xml_de_0.xml
sitemap https://www.hoermann-info.de/sitemap_xml_en_0.xml
sitemap https://www.hoermann-info.de/sitemap_xml_ro_0.xml
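The sitemap records listed above can be read programmatically. A minimal sketch using Python's standard `urllib.robotparser` (Python 3.8 or later for `site_maps()`) follows. The `ROBOTS_TXT` string is a reconstructed fragment for illustration only, not the verbatim file, and `list_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment for illustration; the real file contains more rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cwhire/

Sitemap: https://www.hoermann-info.de/sitemap_xml_de_0.xml
Sitemap: https://www.hoermann-info.de/sitemap_xml_en_0.xml
Sitemap: https://www.hoermann-info.de/sitemap_xml_ro_0.xml
"""


def list_sitemaps(robots_text):
    """Return the Sitemap URLs declared in a robots.txt body, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []
```

Calling `list_sitemaps(ROBOTS_TXT)` yields the three language-specific sitemap URLs (de, en, ro) in the order they appear.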

Comments

  • If the Joomla site is installed within a folder, e.g. www.example.com/joomla/, then the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt, AND the joomla folder name MUST be prefixed to all of the paths.
  • E.g. the Disallow rule for the /administrator/ folder MUST be changed to read: Disallow: /joomla/administrator/
  • For more information about the robots.txt standard, see: https://www.robotstxt.org/orig.html
  • Disallow: /component/
  • Skips the typical URLs
  • Skips URLs with a session ID
  • cwhire disallow
  • Sitemap
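
The Joomla note in the comments says that when the site lives in a subfolder, the folder name must be prefixed to every path. A small sketch of that rewrite is shown here, assuming the paths are plain strings that start with `/`. The function name `prefix_disallow_paths` is hypothetical, and the folder name `joomla` is the example used in the comment itself.

```python
def prefix_disallow_paths(paths, folder):
    """Prefix each robots.txt path with the install folder.

    For example, "/administrator/" with folder "joomla" becomes
    "/joomla/administrator/".
    """
    # Normalize the folder so that "joomla", "/joomla" and "/joomla/"
    # all produce the same prefix.
    prefix = "/" + folder.strip("/")
    return [prefix + path for path in paths]


# Example with two of the paths from the "*" group.
prefixed = prefix_disallow_paths(["/administrator/", "/api/"], "joomla")
for path in prefixed:
    print("Disallow: " + path)
```

The first printed line is `Disallow: /joomla/administrator/`, matching the example given in the comment.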