Robots Exclusion Standard data for python.org

Resource Scan

Scan Details

Site Domain	python.org
Base Domain	python.org
Scan Status	Ok
Last Scan	2024-09-19T20:24:44+00:00
Next Scan	2024-10-19T20:24:44+00:00

Last Scan

Scanned	2024-09-19T20:24:44+00:00
URL	https://python.org/robots.txt
Redirect	https://www.python.org/robots.txt
Redirect Domain	www.python.org
Redirect Base	python.org
Domain IPs	151.101.0.223, 151.101.128.223, 151.101.192.223, 151.101.64.223, 2a04:4e42:200::223, 2a04:4e42:400::223, 2a04:4e42:600::223, 2a04:4e42::223
Redirect IPs	199.232.44.223, 2a04:4e42:48::223
Response IP	199.232.44.223
Found	Yes
Hash	18cb4cd525df8528491845e76f3af26c29c6795d02ea8133974d3b341a2ddd9f
SimHash	aa159b4a8570

Groups

httrack, puf, msiecrawler
Rule	Path
Disallow	/

krugle
Rule	Path
Allow	/
Disallow	/~guido/orlijn/
Disallow	/webstats/

nutch
Rule	Path
Disallow	/

*
Rule	Path
Disallow	/~guido/orlijn/
Disallow	/webstats/
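
The groups above can be checked offline with Python's standard `urllib.robotparser`. The sketch below is illustrative only: the `ROBOTS_TXT` text is reconstructed from the group listing in this scan (the live file's exact wording, casing, and comments may differ), and `examplebot`, `build_parser`, and `is_allowed` are hypothetical names introduced for the example.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's group listing (assumption): the real file
# may use different casing, ordering, or extra comment lines.
ROBOTS_TXT = """\
User-agent: httrack
User-agent: puf
User-agent: msiecrawler
Disallow: /

User-agent: krugle
Allow: /
Disallow: /~guido/orlijn/
Disallow: /webstats/

User-agent: nutch
Disallow: /

User-agent: *
Disallow: /~guido/orlijn/
Disallow: /webstats/
"""

# Redirect target reported by the scan.
BASE = "https://www.python.org"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(ROBOTS_TXT)

# "examplebot" is a hypothetical crawler that falls through to the "*" group.
for agent, path in [
    ("nutch", "/"),
    ("httrack", "/"),
    ("krugle", "/"),
    ("examplebot", "/"),
    ("examplebot", "/webstats/"),
]:
    verdict = "allowed" if is_allowed(parser, agent, path) else "blocked"
    print(f"{agent:<12} {path:<12} {verdict}")
```

Note that `urllib.robotparser` applies the first matching rule within a group, so for `krugle` the leading `Allow: /` matches every path. Parsers that follow the longest-match rule of RFC 9309 would instead treat `/webstats/` and `/~guido/orlijn/` as blocked for `krugle`, so results for those paths depend on the parser used.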

Comments

Directions for robots. See this URL:
http://www.robotstxt.org/robotstxt.html
for a description of the file format.
The Krugle web crawler (though based on Nutch) is OK.
No one should be crawling us with Nutch.
Hide old versions of the documentation and various large sets of files.
