sxl.cn
robots.txt

Robots Exclusion Standard data for sxl.cn

Resource Scan

Scan Details

Site Domain	sxl.cn
Base Domain	sxl.cn
Scan Status	Ok
Last Scan	2024-11-09T05:42:52+00:00
Next Scan	2024-11-16T05:42:52+00:00

Last Scan

Scanned	2024-11-09T05:42:52+00:00
URL	https://sxl.cn/robots.txt
Redirect	https://www.sxl.cn/robots.txt
Redirect Domain	www.sxl.cn
Redirect Base	sxl.cn
Domain IPs	47.89.2.255, 47.89.57.153
Redirect IPs	163.171.211.34
Response IP	163.171.211.34
Found	Yes
Hash	3f39c4f04c2bc8f4e847d5ff9a8f97ca0f50c7b436c959f72afb0cd6e87b3e52
SimHash	a295292d7514

Groups

*

Rule	Path
Disallow	/a/
Disallow	/r/
Disallow	/s/sites/
Disallow	/s/audience/
Disallow	/s/reseller/user/
Disallow	/s/analytics/
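
As a rough illustration of how a crawler might apply the `*` group above, the following sketch feeds the six Disallow rules to Python's standard `urllib.robotparser` and checks a few paths. The agent name `ExampleBot`, the helper `is_allowed`, and the sample paths are hypothetical; only the rules and the host come from the scan. Note that this parser does plain prefix matching, so it is only suitable for the wildcard-free rules shown here.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the "*" group in the scan above.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /a/",
    "Disallow: /r/",
    "Disallow: /s/sites/",
    "Disallow: /s/audience/",
    "Disallow: /s/reseller/user/",
    "Disallow: /s/analytics/",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines; no network access


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` (a hypothetical crawler name) may fetch `path`."""
    return parser.can_fetch(agent, "https://www.sxl.cn" + path)


for sample in ("/a/page", "/s/analytics/report", "/pricing"):
    print(sample, is_allowed(sample))
```

Any path that begins with one of the listed prefixes is reported as blocked; everything else falls through as allowed.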

baiduspider

Rule	Path
Allow	/article/blog/*.mip

baiduspider

Rule	Path
Allow	/article/blog/*.mip

*

Rule	Path
Disallow	/article/blog/*.mip
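
The path `/article/blog/*.mip` relies on the `*` wildcard, which many crawlers interpret as "any run of characters". A minimal sketch of that interpretation, assuming the common convention in which `*` matches any sequence and a trailing `$` anchors the end of the path, might look like this. The function names `pattern_to_regex` and `path_matches` and the sample paths are hypothetical, and individual crawlers may differ in the details.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    # A trailing '$' means the pattern must match through the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' in place of each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def path_matches(pattern, path):
    """Return True if `path` matches `pattern` from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


rule = "/article/blog/*.mip"
for sample in ("/article/blog/hello.mip", "/article/blog/hello.html"):
    print(sample, path_matches(rule, sample))
```

Under this reading, the same pattern is allowed for `baiduspider` and disallowed for all other agents covered by the `*` group.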

adsbot-google

Rule	Path
Disallow	/s/screenshot/

Other Records

Field	Value
sitemap	https://www.sxl.cn/sitemap.xml

Comments

See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
To ban all spiders from the entire site uncomment the next two lines:
User-Agent: *
Disallow: /
Google adsbot ignores robots.txt unless specifically named!
