soscroquettes.com
robots.txt

Robots Exclusion Standard data for soscroquettes.com

Resource Scan

Scan Details

Site Domain soscroquettes.com
Base Domain soscroquettes.com
Scan Status Ok
Last Scan 2026-02-15T01:09:31+00:00
Next Scan 2026-02-22T01:09:31+00:00

Last Scan

Scanned 2026-02-15T01:09:31+00:00
URL https://soscroquettes.com/robots.txt
Domain IPs 104.21.13.210, 172.67.133.29, 2606:4700:3032::6815:dd2, 2606:4700:3032::ac43:851d
Response IP 172.67.133.29
Found Yes
Hash 541fb69518bf8640dc99d71bfb02c5161386a3b567f63473b3bdbf89e2586d1d
SimHash 583d4c70fa83

Groups

*

Rule Path
Allow /
Allow /sitemap.xml
Disallow /wp-admin/
Disallow /admin/
Disallow /backup/
Disallow /cgi-bin/
Disallow /tmp/
Disallow /private/

baiduspider

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /
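How the groups above combine can be sketched in a few lines. The rule data below is copied from the listing. The matching logic is an assumption: it follows the longest-match precedence described in RFC 9309 and matches user agents by case-insensitive substring. It is not the scanner's or any particular crawler's implementation, and the names STAR_RULES, BLOCKED_AGENTS and is_allowed are hypothetical.

```python
# Rules of the "*" group, copied from the listing above.
STAR_RULES = [
    ("allow", "/"),
    ("allow", "/sitemap.xml"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/admin/"),
    ("disallow", "/backup/"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/tmp/"),
    ("disallow", "/private/"),
]

# Every named group in the listing contains the single rule "Disallow: /".
BLOCKED_AGENTS = {
    "baiduspider", "semrushbot", "ahrefsbot", "dotbot", "rogerbot",
    "mj12bot", "petalbot", "zoominfobot", "mail.ru_bot", "yandexbot",
    "seznambot", "blexbot", "chatgpt-user",
}


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if `path` may be crawled by `user_agent` (illustrative sketch)."""
    ua = user_agent.lower()
    # Assumption: a crawler that matches a named group uses only that group.
    if any(token in ua for token in BLOCKED_AGENTS):
        return False
    # Otherwise fall back to the "*" group: the longest matching prefix wins,
    # and Allow wins a tie. With no matching rule the path is allowed.
    best_kind, best_prefix = "allow", ""
    for kind, prefix in STAR_RULES:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > len(best_prefix)
        tie_allow = len(prefix) == len(best_prefix) and kind == "allow"
        if longer or tie_allow:
            best_kind, best_prefix = kind, prefix
    return best_kind == "allow"
```

Under these assumptions, is_allowed("Googlebot", "/wp-admin/options.php") is False because "/wp-admin/" is a longer match than "/", while is_allowed("AhrefsBot/7.0", "/") is False because that agent has its own Disallow group. Parsers that apply the first matching rule instead of the longest one would treat the leading "Allow /" as overriding every Disallow in the "*" group.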

Other Records

Field Value
crawl-delay 10
sitemap https://info-du-continent.com/sitemap.xml

Comments

  • Deny access to administrative folders
  • Block aggressive or useless bots
  • Limit crawl speed
  • Sitemap