guiacities.com
robots.txt

Robots Exclusion Standard data for guiacities.com

Resource Scan

Scan Details

Site Domain guiacities.com
Base Domain guiacities.com
Scan Status Ok
Last Scan 2026-03-24T03:07:05+00:00
Next Scan 2026-03-31T03:07:05+00:00

Last Scan

Scanned 2026-03-24T03:07:05+00:00
URL https://guiacities.com/robots.txt
Domain IPs 145.223.124.188, 2a02:4780:84:5b91:8b25:f538:5d88:791b, 2a02:4780:84:f2c2:6f7c:e78d:2ad2:66cd, 88.223.87.128
Response IP 77.37.75.254
Found Yes
Hash 75a9f11cafe7743a594ed84044e7160b8abdeaf3c526da078808bd93559818d1
SimHash 6d095263c620
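
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not state the algorithm or exactly which bytes are hashed, so the following is only a sketch under the assumption that the digest is SHA-256 over the raw response body. The function name content_hash and the sample body are hypothetical.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a SHA-256 hex digest of the raw robots.txt bytes.

    Assumption: the scanner hashes the unmodified response body.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical body, not the real file served by the site.
sample_body = b"User-agent: *\nAllow: /\n"
digest = content_hash(sample_body)
print(len(digest))  # a SHA-256 hex digest is always 64 characters
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed at all. A similarity hash such as the SimHash field serves a different purpose, estimating how much it changed.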

Groups

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

twitterbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

linkedinbot

Rule Path
Allow /

slurp

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

duckduckbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

*

Rule Path
Allow /
Disallow /admin/
Disallow /minha-conta/
Disallow /auth
Disallow /api/
Allow /explorar
Allow /autos
Allow /imoveis
Allow /servicos
Allow /comercio-local
Allow /eventos
Allow /trocas
Allow /blog
Allow /estado/
Allow /cidade/
Allow /anuncio/
Allow /comercio/

Other Records

Field Value
crawl-delay 2
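
The `*` group lists Allow / first and the Disallow rules after it, so the outcome for a path such as /admin/ depends on how a crawler resolves overlapping rules. Under longest-match precedence (the most specific matching rule wins, and Allow wins a tie), the Disallow entries still take effect. A first-match parser would instead stop at Allow / and permit everything. The sketch below illustrates longest-match evaluation for the rules listed above. It assumes plain prefix matching, which is sufficient here because none of the rules use wildcards. The names RULES and is_allowed are hypothetical.

```python
from typing import List, Tuple

# Rules of the `*` group, in the order the scan lists them.
RULES: List[Tuple[str, str]] = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/minha-conta/"),
    ("disallow", "/auth"),
    ("disallow", "/api/"),
    ("allow", "/explorar"),
    ("allow", "/autos"),
    ("allow", "/imoveis"),
    ("allow", "/servicos"),
    ("allow", "/comercio-local"),
    ("allow", "/eventos"),
    ("allow", "/trocas"),
    ("allow", "/blog"),
    ("allow", "/estado/"),
    ("allow", "/cidade/"),
    ("allow", "/anuncio/"),
    ("allow", "/comercio/"),
]


def is_allowed(path: str, rules: List[Tuple[str, str]] = RULES) -> bool:
    """Longest-match evaluation: the longest matching prefix decides.

    On equal length, an allow rule takes precedence over a disallow rule.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


for p in ["/", "/blog/post", "/admin/users", "/auth/login", "/api/v1", "/api"]:
    print(p, is_allowed(p))
```

Note that /api (no trailing slash) is allowed because the rule is /api/, while /auth has no trailing slash and therefore also covers any path that merely begins with those characters.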

Other Records

Field Value
sitemap https://guiacities.com/sitemap.xml

Comments

  • Sitemap locations
  • Block admin and private areas
  • Allow important pages
  • Host directive

Warnings

  • `host` is not a known field.
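
A warning like the one above can be produced by comparing each field name in the file against a list of recognised fields. The following is a minimal sketch of that check, not the scanner's actual implementation. The KNOWN_FIELDS set is an assumption based on the fields that appear in this report, and the sample text uses example.com as a placeholder because the report does not show the real value of the host line.

```python
from typing import List

# Assumption: fields the checker recognises (those seen in this report).
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def unknown_fields(text: str) -> List[str]:
    """Return field names not in KNOWN_FIELDS, in order of first appearance."""
    found: List[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue  # blank or malformed line
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical robots.txt fragment, not the site's real file.
SAMPLE = """\
# Host directive
Host: https://example.com
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

print(unknown_fields(SAMPLE))  # ['host']
```

Crawlers that do not recognise a field generally ignore it, so a warning of this kind is informational and does not mean the rest of the file failed to parse.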