Robots Exclusion Standard data for jobs-in-berlin.info

Resource Scan

Scan Details

Site Domain jobs-in-berlin.info
Base Domain jobs-in-berlin.info
Scan Status Ok
Last Scan 2024-10-07T13:37:53+00:00
Next Scan 2024-11-06T13:37:53+00:00

Last Scan

Scanned 2024-10-07T13:37:53+00:00
URL https://jobs-in-berlin.info/robots.txt
Redirect https://www.jobs-in-berlin.info/robots.txt
Redirect Domain www.jobs-in-berlin.info
Redirect Base jobs-in-berlin.info
Domain IPs 157.90.49.3
Redirect IPs 157.90.49.3
Response IP 157.90.49.3
Found Yes
Hash 9cea5c274b9375216e5b1d48a611c530697d33660c451ed6b4fab53b2fc3156e
SimHash 5bf4f271c175

Groups

*

Rule Path
Disallow /impressum.html
Disallow /datenschutz.html
Disallow /agb.pdf
Disallow /agb.html
Disallow /erweiterte-suche.html
Disallow /suche.html
Disallow /job.php
Disallow /job.php*
Disallow /unternehmen/

msnbot

Rule Path
Disallow /impressum.html
Disallow /datenschutz.html
Disallow /agb.pdf
Disallow /agb.html
Disallow /erweiterte-suche.html
Disallow /suche.html
Disallow /job.php
Disallow /job.php*
Disallow /unternehmen/

Other Records

Field Value
crawl-delay 1

bingbot

Rule Path
Disallow /impressum.html
Disallow /datenschutz.html
Disallow /agb.pdf
Disallow /agb.html
Disallow /erweiterte-suche.html
Disallow /suche.html
Disallow /job.php
Disallow /job.php*
Disallow /unternehmen/

Other Records

Field Value
crawl-delay 1

ltx71

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

seekport

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

lcc

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

blexbot

Rule Path
Disallow /
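
Example

To illustrate how the groups listed above would be applied by a crawler, the following is a minimal sketch using Python's standard-library urllib.robotparser. The ROBOTS_TXT string is a partial reconstruction from this report (the *, bingbot and amazonbot groups only), not the verbatim file served by the site. The agent name "examplebot", the helper allowed(), and the paths /job.php?id=1 and /unternehmen/example are hypothetical and chosen only for illustration. Note that urllib.robotparser does plain prefix matching and does not interpret the * wildcard in /job.php*; the /job.php rule already covers those URLs by prefix.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the scan report above (assumption: the real
# file uses the usual "User-agent:" / "Disallow:" / "Crawl-delay:" syntax).
ROBOTS_TXT = """\
User-agent: *
Disallow: /impressum.html
Disallow: /datenschutz.html
Disallow: /agb.pdf
Disallow: /agb.html
Disallow: /erweiterte-suche.html
Disallow: /suche.html
Disallow: /job.php
Disallow: /job.php*
Disallow: /unternehmen/

User-agent: bingbot
Disallow: /impressum.html
Disallow: /datenschutz.html
Disallow: /agb.pdf
Disallow: /agb.html
Disallow: /erweiterte-suche.html
Disallow: /suche.html
Disallow: /job.php
Disallow: /job.php*
Disallow: /unternehmen/
Crawl-delay: 1

User-agent: amazonbot
Disallow: /
"""

# Host taken from the redirect target recorded in the report.
BASE = "https://www.jobs-in-berlin.info"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` under the rules above?"""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    checks = [
        ("examplebot", "/"),                     # falls back to the * group
        ("examplebot", "/job.php?id=1"),         # blocked by the /job.php prefix
        ("bingbot", "/unternehmen/example"),     # blocked by /unternehmen/
        ("amazonbot", "/"),                      # whole site disallowed
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
    print("bingbot crawl-delay:", parser.crawl_delay("bingbot"))
```

Agents without a dedicated group fall back to the * group, so they may fetch the start page but not the listed legal, search, job-detail and company pages. Agents such as amazonbot, with a single Disallow / rule, are excluded from the whole site.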