goodhousebox.com
robots.txt

Robots Exclusion Standard data for goodhousebox.com

Resource Scan

Scan Details

Site Domain goodhousebox.com
Base Domain goodhousebox.com
Scan Status Ok
Last Scan 2026-02-02T19:25:22+00:00
Next Scan 2026-02-09T19:25:22+00:00

Last Scan

Scanned 2026-02-02T19:25:22+00:00
URL https://goodhousebox.com/robots.txt
Domain IPs 72.62.135.201
Response IP 72.62.135.201
Found Yes
Hash ee23f882673cf6a0e87c8209bdca0dc49c8189d24d5ce2e548b225aa728cf57d
SimHash 69480b30c5db
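
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not say which algorithm or which bytes were hashed, so the following is only a sketch of how such a fingerprint could be computed to detect changes between scans. The helper name sha256_hex and the sample body are hypothetical, and the sketch is not expected to reproduce the value reported above.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical robots.txt body, NOT the actual file served by the site.
sample_body = b"User-agent: *\nAllow: /\n"
digest = sha256_hex(sample_body)

# A SHA-256 hex digest always has 64 characters, like the Hash field above.
print(len(digest))
```

Comparing the digest from one scan with the digest from the next is a cheap way to tell whether the file changed at all; a similarity hash such as the SimHash field would be needed to judge how much it changed.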

Groups

*

Rule Path
Allow /
Disallow /cgi-bin/
Disallow /admin/
Disallow /*?*
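
How these four rules combine depends on the crawler. A minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), with "*" treated as a wildcard and a trailing "$" as an end anchor. The function names pattern_to_regex and is_allowed are illustrative, and the test paths are made up.

```python
import re

# Rules copied from the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/admin/"),
    ("disallow", "/*?*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if path may be crawled under longest-match precedence."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for directive, pattern in rules:
        # re.match anchors at the start of the path, as the patterns require.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            allow = directive == "allow"
            # The longer pattern wins; on a tie, allow wins.
            if length > best_len or (
                length == best_len and allow and not best_allow
            ):
                best_len, best_allow = length, allow
    return best_allow


# Hypothetical paths, chosen to exercise each rule.
for path in ["/", "/products/box", "/admin/login", "/cgi-bin/run", "/search?q=box"]:
    print(path, is_allowed(path))
```

Under this assumption, "Allow /" acts only as a default: any path that also matches one of the longer Disallow patterns is blocked, including every URL that carries a query string.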

Other Records

Field Value
sitemap https://goodhousebox.com/sitemap.xml

Comments

  • ======================================
  • robots.txt for goodhousebox.com
  • ======================================
  • Block useless or sensitive paths
  • Block URL parameters (duplicate content)
  • Sitemap