newark-de.claz.org
robots.txt

Robots Exclusion Standard data for newark-de.claz.org

Resource Scan

Scan Details

Site Domain newark-de.claz.org
Base Domain claz.org
Scan Status Ok
Last Scan 2024-09-20T13:33:50+00:00
Next Scan 2024-09-27T13:33:50+00:00

Last Scan

Scanned 2024-09-20T13:33:50+00:00
URL https://newark-de.claz.org/robots.txt
Domain IPs 69.162.68.146, 69.162.83.22, 74.63.201.106
Response IP 69.162.83.22
Found Yes
Hash b9e290cdcf414b97d8b0eb84549ad52dba0da322eda7e7ec45b76a0addea21a9
SimHash 3f015104e893
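
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say exactly which bytes were hashed. As a rough sketch under that assumption, a later copy of the file could be fingerprinted and compared like this (the function names `fingerprint` and `has_changed` are hypothetical, not part of any scanner API):

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "b9e290cdcf414b97d8b0eb84549ad52dba0da322eda7e7ec45b76a0addea21a9"


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a freshly fetched body against the recorded digest."""
    return fingerprint(body) != recorded.lower()
```

If the assumption holds, `has_changed` returning False for a newly fetched body would mean the file is byte-for-byte identical to the scanned copy. If the scanner normalizes the file first, or uses another algorithm, the digests will not match even for unchanged content.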

Groups

*

Rule Path
Disallow /user/
Disallow /guest/
Disallow /go/
Disallow /partner/
Disallow /*?*save=search
Disallow /*/flag$
Disallow /classifieds/*/analytics.svg
Disallow /classifieds/*/contact
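
Several of the paths above use the `*` wildcard and the `$` end anchor, which Python's standard `urllib.robotparser` treats as literal characters. The following is a minimal sketch of how such patterns could be evaluated, assuming the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule is matched from the start of the path. The helper names `pattern_to_regex` and `is_disallowed` are illustrative only. Because this group contains only Disallow rules, no Allow/Disallow precedence handling is needed here.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/user/",
    "/guest/",
    "/go/",
    "/partner/",
    "/*?*save=search",
    "/*/flag$",
    "/classifieds/*/analytics.svg",
    "/classifieds/*/contact",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    all other characters are matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if any Disallow pattern matches from the start of the path.

    `path` should include the query string, e.g. "/search?save=search".
    """
    return any(rule.match(path) for rule in RULES)
```

For example, `is_disallowed("/classifieds/123/contact")` is True, while `is_disallowed("/classifieds/123")` is False. The `$` anchor matters for the flag rule: `/item/42/flag` is blocked but `/item/42/flags` is not.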

Other Records

Field Value
sitemap https://newark-de.claz.org/sitemap.xml