douban.com
robots.txt

Robots Exclusion Standard data for douban.com

Resource Scan

Scan Details

Site Domain douban.com
Base Domain douban.com
Scan Status Ok
Last Scan 2024-04-30T19:07:09+00:00
Next Scan 2024-05-07T19:07:09+00:00

Last Scan

Scanned 2024-04-30T19:07:09+00:00
URL https://douban.com/robots.txt
Redirect https://www.douban.com/robots.txt
Redirect Domain www.douban.com
Redirect Base douban.com
Domain IPs 120.53.130.158, 140.143.177.206, 81.70.124.99
Redirect IPs 120.53.130.158, 140.143.177.206, 81.70.124.99
Response IP 81.70.124.99
Found Yes
Hash a44732778690f2f3d6e294188b6024a0105a8192c72978b19f2c6ce39e459db7
SimHash 2b44cc57e617
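
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. The sketch below shows how such a digest could be computed over the raw bytes of a fetched robots.txt, assuming SHA-256; the helper name `sha256_hex` and the sample body are hypothetical and are not taken from the scan.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real douban.com file.
sample_body = b"User-agent: *\nDisallow: /search\n"
digest = sha256_hex(sample_body)
print(len(digest), digest)
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed at all. The report does not describe how its SimHash field is produced, so that field is not reproduced here.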

Groups

*

Rule Path
Disallow /subject_search
Disallow /amazon_search
Disallow /search
Disallow /group/search
Disallow /event/search
Disallow /celebrities/search
Disallow /location/drama/search
Disallow /forum/
Disallow /new_subject
Disallow /service/iframe
Disallow /j/
Disallow /link2/
Disallow /recommend/
Disallow /doubanapp/card
Disallow /update/topic/
Disallow /share/
Disallow /people/*/collect
Disallow /people/*/wish
Disallow /people/*/all
Disallow /people/*/do
Allow /ads.txt

wandoujia spider

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /subject_search
Disallow /amazon_search
Disallow /search
Disallow /group/search
Disallow /event/search
Disallow /celebrities/search
Disallow /location/drama/search
Disallow /j/
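
The three groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`, with the robots.txt text reconstructed from the rules listed in this report (the live file may differ). `ExampleBot` is a hypothetical user agent that falls through to the `*` group, and `allowed` is a helper introduced only for illustration. As far as I know, the standard-library parser matches paths by prefix and does not expand the `*` wildcard, so the `/people/*/...` rules are included for fidelity but not exercised.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report; not fetched live.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subject_search
Disallow: /amazon_search
Disallow: /search
Disallow: /group/search
Disallow: /event/search
Disallow: /celebrities/search
Disallow: /location/drama/search
Disallow: /forum/
Disallow: /new_subject
Disallow: /service/iframe
Disallow: /j/
Disallow: /link2/
Disallow: /recommend/
Disallow: /doubanapp/card
Disallow: /update/topic/
Disallow: /share/
Disallow: /people/*/collect
Disallow: /people/*/wish
Disallow: /people/*/all
Disallow: /people/*/do
Allow: /ads.txt

User-agent: wandoujia spider
Disallow: /

User-agent: mediapartners-google
Disallow: /subject_search
Disallow: /amazon_search
Disallow: /search
Disallow: /group/search
Disallow: /event/search
Disallow: /celebrities/search
Disallow: /location/drama/search
Disallow: /j/
"""

BASE = "https://www.douban.com"

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


for agent, path in [
    ("ExampleBot", "/search"),
    ("ExampleBot", "/ads.txt"),
    ("ExampleBot", "/forum/1"),
    ("mediapartners-google", "/forum/1"),
    ("wandoujia spider", "/"),
]:
    print(f"{agent!r:26} {path:12} allowed={allowed(agent, path)}")
```

The comparison between `ExampleBot` and `mediapartners-google` on `/forum/1` illustrates that a crawler with its own group is governed by that group alone and does not inherit the `*` rules.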

Other Records

Field Value
sitemap https://www.douban.com/sitemap_index.xml
sitemap https://www.douban.com/sitemap_updated_index.xml

Comments

  • Crawl-delay: 5