haribo.com
robots.txt

Robots Exclusion Standard data for haribo.com

Resource Scan

Scan Details

Site Domain	haribo.com
Base Domain	haribo.com
Scan Status	Ok
Last Scan	2024-04-28T07:21:11+00:00
Next Scan	2024-05-28T07:21:11+00:00

Last Scan

Scanned	2024-04-28T07:21:11+00:00
URL	https://www.haribo.com/robots.txt
Domain IPs	23.209.46.89, 23.209.46.92, 2600:1413:b000:14::b857:c14d, 2600:1413:b000:14::b857:c152
Response IP	42.99.140.137
Found	Yes
Hash	980f2f35f61d20c8c6eb4c55f0cbf21da595a3dba7fadbfafa2cc880a748c510
SimHash	61349d563591

Groups

*

Rule	Path
Disallow	/cpresources/
Disallow	/vendor/
Disallow	/.env
Disallow	/cache/
Disallow	/admin/
Disallow	/overview/
Disallow	/website-manual/
Disallow	/social-media-styleguide/
Disallow	/playbook-e-commerce/
Disallow	/robots.txt
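
The rules in this group can be checked against a path with Python's standard `urllib.robotparser`. The following is a minimal sketch, not part of the scan itself: the robots.txt text is rebuilt from the table above and the sitemap record, and the user agent `ExampleBot`, the helper names, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# robots.txt text rebuilt from the "*" group and the sitemap record in this
# report; the live file may differ in ordering, comments, and whitespace.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /admin/
Disallow: /overview/
Disallow: /website-manual/
Disallow: /social-media-styleguide/
Disallow: /playbook-e-commerce/
Disallow: /robots.txt
Sitemap: https://www.haribo.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) agent may fetch the given path."""
    url = "https://www.haribo.com" + path
    return build_parser(ROBOTS_TXT).can_fetch(agent, url)


if __name__ == "__main__":
    # Sample paths are illustrative only; they are not taken from the scan.
    for path in ("/", "/admin/login", "/.env", "/robots.txt"):
        print(path, is_allowed(path))
```

The standard-library parser matches rules by simple path prefix, so `/admin/login` is blocked by `Disallow: /admin/` while `/admin` without the trailing slash is not. Other crawlers may interpret the same rules differently.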

Other Records

Field	Value
sitemap	https://www.haribo.com/sitemap.xml

Comments

robots.txt for https://www.haribo.com/
live - don't allow web crawlers to index cpresources/ or vendor/
