gucaste.net
robots.txt

Robots Exclusion Standard data for gucaste.net

Resource Scan

Scan Details

Site Domain gucaste.net
Base Domain gucaste.net
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-06-22T10:01:16+00:00
Next Scan 2025-09-20T10:01:16+00:00

Last Successful Scan

Scanned 2023-05-08T16:17:21+00:00
URL https://gucaste.net/robots.txt
Domain IPs 118.69.80.47
Response IP 118.69.80.47
Found Yes
Hash f8d875b4847ec427d6ad90050201de26142b9d50dca0bd5d38bd441b9118339b
SimHash a715de4ad4d0

Groups

*

Rule Path
Disallow /admin
Disallow /cart
Disallow /carts
Disallow /orders
Disallow /checkout
Disallow /checkouts
Disallow /account
Disallow /collections/*%2B*
Disallow /blogs/*%2B*
Disallow /discount/*
Disallow /apple-app-site-association

adsbot-google

Rule Path
Disallow /checkout
Disallow /checkouts
Disallow /carts
Disallow /orders
Disallow /discount/*

nutch

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

pinterest

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

Other Records

Field Value
sitemap https://gucaste.com/sitemap.xml

Comments

  • we use Haravan as our ecommerce platform
  • Google adsbot ignores robots.txt unless specifically named!
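Example

The groups above can be read as a small lookup: a crawler uses the group that names it and falls back to the * group otherwise. The following is a minimal Python sketch of that behavior, not the scanner's or any crawler's actual implementation. The names GROUPS, CRAWL_DELAY, rule_to_regex, is_allowed and crawl_delay are hypothetical, user agents are matched by exact lowercase name (real crawlers match on their product token), paths are compared literally without percent-encoding normalization, and repeated rules are listed once. Because the file contains only Disallow rules, no Allow/Disallow precedence logic is needed here.

```python
import re

# Disallow rules per group, as listed in the last successful scan.
GROUPS = {
    "*": [
        "/admin", "/cart", "/carts", "/orders", "/checkout", "/checkouts",
        "/account", "/collections/*%2B*", "/blogs/*%2B*", "/discount/*",
        "/apple-app-site-association",
    ],
    "adsbot-google": ["/checkout", "/checkouts", "/carts", "/orders", "/discount/*"],
    "nutch": ["/"],
    "mj12bot": [],    # no rules defined, all paths allowed
    "pinterest": [],  # no rules defined, all paths allowed
}

# crawl-delay values from the "Other Records" of each group.
CRAWL_DELAY = {"mj12bot": 10, "pinterest": 1}


def rule_to_regex(rule):
    """Turn a robots.txt path rule into a regex: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if no Disallow rule of the selected group matches the start of the path."""
    # Assumption: exact (case-insensitive) group name match, else the '*' group.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    return not any(rule_to_regex(rule).match(path) for rule in rules)


def crawl_delay(user_agent):
    """Return the crawl-delay for the named group, or None if none is recorded."""
    return CRAWL_DELAY.get(user_agent.lower())


if __name__ == "__main__":
    for agent, path in [
        ("SomeBot", "/admin"),
        ("AdsBot-Google", "/admin"),
        ("AdsBot-Google", "/checkout"),
        ("nutch", "/products/shirt"),
        ("mj12bot", "/cart"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under this sketch a named group replaces the * group rather than adding to it, so adsbot-google is blocked only from the five paths in its own group, and mj12bot and pinterest are allowed everywhere while still carrying a crawl-delay.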