picman.1688.com
robots.txt

Robots Exclusion Standard data for picman.1688.com

Resource Scan

Scan Details

Site Domain picman.1688.com
Base Domain 1688.com
Scan Status Ok
Last Scan 2024-05-20T21:50:15+00:00
Next Scan 2024-06-19T21:50:15+00:00

Last Scan

Scanned 2024-05-20T21:50:15+00:00
URL https://picman.1688.com/robots.txt
Domain IPs 2408:4001:f00::12b, 2408:4001:f00::1c8, 2408:4001:f00::1fd, 2408:4001:f00::20d, 2408:4001:f00::21, 2408:4001:f00::251, 2408:4001:f00::289, 2408:4001:f00::2a3, 2408:4001:f00::2ce, 2408:4001:f00::318, 2408:4001:f00::349, 2408:4001:f00::39f, 2408:4001:f00::82, 2408:4001:f00::8b, 2408:4001:f00::b1, 2408:4001:f00::dc, 59.82.23.146
Response IP 59.82.23.111
Found Yes
Hash 53df41b62003a8ccd03b030c1020e46b20b4bb00da39603bc4132459c8c3933c
SimHash 2a0e5960afd2

Groups

*

Rule Path
Disallow /bin/
Disallow /offer/turbine/template/offer%2CPost
Disallow /catalog/turbine/template/product%2CCreateProduct
Disallow /community/turbine/template/Index/action/community.friend.AddForOffer
Disallow /offer/turbine/template/offer%2CForward
Disallow /athena/bizref/rempost
Disallow /athena/myalibaba
Disallow /ali/news/
Disallow /member/
Disallow /buyer/turbine/template/
Disallow /seller/turbine/template/
Disallow /message

hl_ftien_spider

Rule Path
Disallow /
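The two groups above can be checked with Python's standard urllib.robotparser. This is a minimal sketch: the robots.txt text is rebuilt from the rule table rather than copied from the live file, so its exact layout is an assumption, and "ExampleBot" is a hypothetical crawler name used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rebuilt from the rule table above; the layout of the real file may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bin/
Disallow: /offer/turbine/template/offer%2CPost
Disallow: /catalog/turbine/template/product%2CCreateProduct
Disallow: /community/turbine/template/Index/action/community.friend.AddForOffer
Disallow: /offer/turbine/template/offer%2CForward
Disallow: /athena/bizref/rempost
Disallow: /athena/myalibaba
Disallow: /ali/news/
Disallow: /member/
Disallow: /buyer/turbine/template/
Disallow: /seller/turbine/template/
Disallow: /message

User-agent: hl_ftien_spider
Disallow: /
"""

BASE = "https://picman.1688.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is hypothetical; it falls under the "*" group.
member_allowed = parser.can_fetch("ExampleBot", BASE + "/member/profile")
index_allowed = parser.can_fetch("ExampleBot", BASE + "/index.html")

# hl_ftien_spider has its own group, which disallows the whole site.
spider_allowed = parser.can_fetch("hl_ftien_spider", BASE + "/index.html")
```

Under these rules a generic crawler is blocked only from the listed prefixes, while hl_ftien_spider is blocked from every path.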

Comments

  • file: robots.txt,v 1.0 2002/09/23 created by Tsing Kong
  • exodus.1688.com
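  • Following the standard robots.txt syntax, this file specifies pages and directories that crawlers are not allowed to crawl.
  • The robots.txt syntax follows <URL:http://www.robotstxt.org/wc/exclusion.html#robotstxt>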
  • Format is:
  • User-agent: <name of spider>
  • Disallow: <nothing> | <path>
  • -----------------------------------------------------------------------------
  • Tianjin Hailiang (天津海量) search
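The format described in the comments (a User-agent line followed by Disallow lines, where an empty Disallow value blocks nothing) can be read with a small hand-written parser. The sketch below is illustrative only: the function name parse_groups is hypothetical, and it simplifies the standard by treating each User-agent line as the start of its own group.

```python
from collections import defaultdict


def parse_groups(text: str) -> dict[str, list[str]]:
    """Group Disallow paths under the most recent User-agent line."""
    groups: dict[str, list[str]] = defaultdict(list)
    current = None
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            current = value
            groups[current]  # register the group even if it has no rules
        elif field == "disallow" and current is not None:
            # An empty value means "nothing is disallowed", so skip it.
            if value:
                groups[current].append(value)
    return dict(groups)


sample = """\
# Format is:
User-agent: *
Disallow: /bin/
Disallow:

User-agent: hl_ftien_spider
Disallow: /
"""

parsed = parse_groups(sample)
```

For real crawling decisions, a maintained parser such as urllib.robotparser is a safer choice than this simplified reader.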