itrheinland.de
robots.txt

Robots Exclusion Standard data for itrheinland.de

Resource Scan

Scan Details

Site Domain itrheinland.de
Base Domain itrheinland.de
Scan Status Ok
Last Scan 2024-06-08T07:39:38+00:00
Next Scan 2024-07-08T07:39:38+00:00

Last Scan

Scanned 2024-06-08T07:39:38+00:00
URL https://itrheinland.de/robots.txt
Redirect https://www.itrheinland.de/robots.txt
Redirect Domain www.itrheinland.de
Redirect Base itrheinland.de
Domain IPs 168.119.242.134
Redirect IPs 168.119.242.134
Response IP 168.119.242.134
Found Yes
Hash aa21fda7509fafec7dc50b25b1c81ed128e767d6106fa80ffde8c396995f72de
SimHash 793ddcb14408
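
The Hash value above is 64 hexadecimal digits long, which matches the length of a SHA-256 digest. The report does not name the algorithm, so treating it as SHA-256 over the raw response body is an assumption. Under that assumption, a minimal Python sketch of how a rescan could detect that the file changed might look like this (the names `content_hash` and `has_changed` are hypothetical, and the sample bodies are illustrative, not the real file):

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """True when a freshly fetched body no longer matches the stored hash."""
    return content_hash(body) != previous_hash.lower()


# Illustrative bodies only; these are NOT the contents of the scanned file.
old_body = b"User-agent: *\nDisallow: /auth\n"
new_body = b"User-agent: *\nDisallow: /auth\nDisallow: /widget\n"

stored = content_hash(old_body)          # what a previous scan would have saved
print(has_changed(stored, old_body))     # same bytes, so no change
print(has_changed(stored, new_body))     # one rule added, so the hash differs
```

An exact hash only says whether the bytes differ at all. The SimHash field is presumably a similarity fingerprint meant to show how much they differ, but its construction is not described in the report and is not reproduced here.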

Groups

*

Rule Path
Disallow /bewerbung
Disallow /merkliste
Disallow /feedback
Disallow /jobs/counter
Disallow /jobs/autocomplete
Disallow /apply
Disallow /datenschutz
Disallow /impressum
Disallow /agb
Disallow /widget
Disallow /auth
Disallow /auth/twitter
Disallow /auth/facebook
Disallow /auth/xing
Disallow /auth/linkedin
Disallow /job_subscriptions
Disallow /job_subscriptions/new
Disallow /arbeitgeber
Disallow /IT-jobs/search
Disallow /IT-jobs/search
Disallow /IT-jobs/search

Other Records

Field Value
crawl-delay 10
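
To illustrate how the `*` group above is applied, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from a subset of the rules listed in the report, so its exact layout is an assumption, and `ExampleBot` and the helper `allowed` are hypothetical names. Python's parser does plain prefix matching on the path; other crawlers may interpret rules differently.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from a subset of the "*" group above (layout is assumed).
ROBOTS_TXT = """\
User-agent: *
Disallow: /bewerbung
Disallow: /jobs/counter
Disallow: /jobs/autocomplete
Disallow: /auth
Disallow: /IT-jobs/search
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.itrheinland.de"


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check whether the given agent may fetch BASE + path."""
    return parser.can_fetch(agent, BASE + path)


print(allowed("/jobs"))           # not covered by any Disallow prefix
print(allowed("/jobs/counter"))   # matches "Disallow: /jobs/counter"
print(allowed("/authors"))        # also blocked: it starts with "/auth"
print(parser.crawl_delay("ExampleBot"))
```

The `/authors` check shows a side effect of prefix rules: `Disallow /auth` already covers `/auth/twitter`, `/auth/facebook` and the other `/auth/...` entries, and it would also cover any unrelated path that happens to begin with `/auth`.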

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

auskunftbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

blexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

amazonbot

Rule Path
Disallow /

auskunftbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.itrheinland.de/system/sitemap.xml.gz
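
The named groups and the sitemap record can be checked the same way. The sketch below again uses Python's `urllib.robotparser` on a reconstructed excerpt (the file layout is an assumption, `ExampleBot` is a hypothetical agent, and `site_maps()` requires Python 3.8 or later). It shows that an agent with its own group, such as blexbot, is not bound by the `*` rules, which is consistent with the report's "No rules defined. All paths allowed." note.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt of the groups listed above (layout is assumed).
ROBOTS_TXT = """\
User-agent: mj12bot
Disallow: /

User-agent: blexbot
Crawl-delay: 10

User-agent: *
Disallow: /bewerbung
Crawl-delay: 10

Sitemap: https://www.itrheinland.de/system/sitemap.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.itrheinland.de"

# mj12bot has "Disallow: /", so every path is blocked for it.
print(parser.can_fetch("mj12bot", BASE + "/"))

# blexbot has its own group with no rules, so the "*" rules do not apply.
print(parser.can_fetch("blexbot", BASE + "/bewerbung"))

# An agent without its own group falls back to the "*" group.
print(parser.can_fetch("ExampleBot", BASE + "/bewerbung"))

# Crawl-delay is read per group; sitemap records are global.
print(parser.crawl_delay("blexbot"))
print(parser.site_maps())
```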

Warnings

  • 3 invalid lines.