jianshu.com
robots.txt

Robots Exclusion Standard data for jianshu.com

Resource Scan

Scan Details

Site Domain jianshu.com
Base Domain jianshu.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-05T22:08:41+00:00
Next Scan 2024-12-04T22:08:41+00:00

Last Successful Scan

Scanned 2024-05-09T22:07:13+00:00
URL https://jianshu.com/robots.txt
Redirect https://www.jianshu.com/robots.txt
Redirect Domain www.jianshu.com
Redirect Base jianshu.com
Domain IPs 47.251.58.39
Redirect IPs 163.181.81.231, 163.181.81.232, 163.181.81.233, 163.181.81.234, 163.181.81.235, 163.181.81.236, 163.181.81.237, 163.181.81.238
Response IP 163.181.81.231
Found Yes
Hash b1042569a3c6c7973ac0e2db51bb59dc8fe30f920185e6557e99c6cf81fbc159
SimHash 4cc40ad7f644
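
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. As a minimal sketch under that assumption, a change check between two scans might look like the following. The function names `robots_fingerprint` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against a stored digest."""
    return robots_fingerprint(body) != previous_hash.lower()
```

An exact digest like this only detects whether anything changed. The SimHash field serves a different purpose (approximate similarity) and is not reproduced here.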

Groups

*

Rule Path
Disallow /search
Disallow /convos/
Disallow /notes/
Disallow /admin/
Disallow /adm/
Disallow /p/0826cf4692f9
Disallow /p/d8b31d20a867
Disallow /collections/*/recommended_authors
Disallow /trial/*
Disallow /keyword_notes
Disallow /stats-2017/*
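
Several of the paths above contain `*`. Under the wildcard convention described in RFC 9309, `*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a prefix match. A rough sketch of that matching, with the hypothetical helper names `pattern_to_regex` and `path_matches`, could be:

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    all other characters are matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def path_matches(pattern: str, path: str) -> bool:
    """True if the URL path matches the pattern from its first character."""
    return pattern_to_regex(pattern).match(path) is not None
```

For instance, `path_matches("/collections/*/recommended_authors", "/collections/123/recommended_authors")` is true, while `path_matches("/notes/", "/notes")` is false because the rule requires the trailing slash. This sketch ignores percent-encoding normalization and the longest-match precedence between Allow and Disallow rules.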

trendkite-akashic-crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

yisouspider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

cliqzbot

Rule Path
Disallow /

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 10

mediapartners-google

Rule Path
Allow /
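
To illustrate how a crawler might interpret the groups listed above, the sketch below feeds a reconstruction of them to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is rebuilt from this report and is an assumption: the real file may differ in ordering, casing, and comments. The helper `build_parser` and the agent name `examplebot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups; not the verbatim file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /convos/
Disallow: /notes/
Disallow: /admin/
Disallow: /adm/
Disallow: /p/0826cf4692f9
Disallow: /p/d8b31d20a867
Disallow: /collections/*/recommended_authors
Disallow: /trial/*
Disallow: /keyword_notes
Disallow: /stats-2017/*

User-agent: trendkite-akashic-crawler
Crawl-delay: 60

User-agent: yisouspider
Crawl-delay: 60

User-agent: cliqzbot
Disallow: /

User-agent: googlebot
Allow: /
Crawl-delay: 10

User-agent: mediapartners-google
Allow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An agent with no group of its own falls back to the '*' group.
generic_search = parser.can_fetch("examplebot", "https://www.jianshu.com/search")
# googlebot has its own group, so the '*' rules do not apply to it.
google_search = parser.can_fetch("googlebot", "https://www.jianshu.com/search")
google_delay = parser.crawl_delay("googlebot")
```

Under this reconstruction, `generic_search` is false, `google_search` is true, and `google_delay` is 10. Note that `urllib.robotparser` treats `*` inside a path literally rather than as a wildcard, so the wildcard rules in the `*` group are not enforced by this particular parser.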

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:

Warnings

  • `request-rate` is not a known field.