clurge.com
robots.txt

Robots Exclusion Standard data for clurge.com

Resource Scan

Scan Details

Site Domain clurge.com
Base Domain clurge.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't establish SSL connection.
Last Scan 2025-09-03T07:17:16+00:00
Next Scan 2025-11-02T07:17:16+00:00

Last Successful Scan

Scanned 2025-06-12T14:40:28+00:00
URL https://clurge.com/robots.txt
Redirect https://www.clurge.com/robots.txt
Redirect Domain www.clurge.com
Redirect Base clurge.com
Domain IPs 212.237.10.83
Redirect IPs 212.237.10.83
Response IP 212.237.10.83
Found Yes
Hash 3b2949754337decba44766bad0abfabce624004ef8d14c94c25efaced7c2e10d
SimHash 78193a71f7e8

Groups

*

Rule Path
Disallow

Other Records

Field Value
crawl-delay 60

*

Rule Path
Disallow *?*s=*
Disallow *?*search=*
Disallow *?*query=*
Disallow *?*sort=*
Disallow *?*filter=*
Disallow *?*price=*
Disallow *?*color=*
Disallow *?*size=*

*

Rule Path
Disallow *?*add=*
Disallow *?*add_to_cart=*
Disallow *?*add_to_wishlist=*

*

Rule Path
Disallow /*.pdf$
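
The wildcard rules in the `*` groups above can be checked against sample paths with a small matcher. The following is a minimal sketch, assuming the matching semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a prefix match against path plus query string). The function names `rule_to_regex` and `rule_matches` and the sample paths are hypothetical and not part of the scanned file.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern matches the path from its start."""
    return rule_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    samples = [
        ("/*.pdf$", "/docs/catalog.pdf"),
        ("/*.pdf$", "/docs/catalog.pdf?download=1"),
        ("*?*sort=*", "/shop?page=2&sort=asc"),
        ("*?*s=*", "/shop?colors=red"),
    ]
    for pattern, path in samples:
        print(pattern, path, rule_matches(pattern, path))
```

Under this reading, a broad pattern such as `*?*s=*` also matches any query parameter whose name merely ends in `s` (for example `colors=`), so it covers more URLs than the internal search parameter alone.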

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

archive.is_bot

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

wayback

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

wayback machine

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

waybackarchive.org

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

webarchive.nl

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

mementoweb.org

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

web.archive.org

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

gptbot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

magpie-crawler

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

ccbot

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

omgili

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

omgilibot

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

node/simplecrawler

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 120

ahrefsbot

Rule Path
Disallow /assets/

Other Records

Field Value
crawl-delay 100

semrushbot

Rule Path
Disallow /assets/
Disallow /cgi-bin/
Disallow /scripts/
Disallow /tmp/
Disallow /BK
Disallow backup_index

Other Records

Field Value
crawl-delay 100
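
To see how per-agent groups and `crawl-delay` records like those listed above are resolved, a few of the groups can be fed to Python's standard `urllib.robotparser`. This is only a sketch: the `EXCERPT` text is a hand-written reconstruction of three groups in robots.txt syntax, not the original file, and the tested URLs are hypothetical. The standard-library parser does not implement `*` or `$` wildcards in paths, so only plain prefix rules are used here.

```python
from urllib.robotparser import RobotFileParser

# Hand-written reconstruction of three groups from the listing
# (assumption: the original file's exact layout is not shown in the scan).
EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: ahrefsbot
Disallow: /assets/
Crawl-delay: 100

User-agent: *
Disallow:
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

if __name__ == "__main__":
    base = "https://www.clurge.com"
    # Fully blocked agent.
    print(parser.can_fetch("gptbot", base + "/"))
    # Agent blocked only under /assets/.
    print(parser.can_fetch("ahrefsbot", base + "/assets/site.css"))
    print(parser.can_fetch("ahrefsbot", base + "/products"))
    # Crawl delays: agent-specific value, then the '*' fallback.
    print(parser.crawl_delay("ahrefsbot"))
    print(parser.crawl_delay("examplebot"))
```

With this excerpt, `gptbot` is refused everywhere, `ahrefsbot` is refused only under `/assets/` with a delay of 100, and an agent with no group of its own (the hypothetical `examplebot`) falls back to the `*` group's delay of 60.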

Comments

  • robots.txt
  • Global settings for well-behaved bots
  • Block internal search parameters and faceted navigation
  • Block action URLs
  • Block PDF files
  • Block additional archive services
  • Block AI chatbots and training
  • Block scrapers
  • Standard protected directories

Warnings

  • `archive-control-allow` is not a known field.
  • `noindex` is not a known field.