nightlies.apache.org
robots.txt

Robots Exclusion Standard data for nightlies.apache.org

Resource Scan

Scan Details

Site Domain	nightlies.apache.org
Base Domain	apache.org
Scan Status	Ok
Last Scan	2025-03-03T10:02:32+00:00
Next Scan	2025-04-02T10:02:32+00:00

Last Scan

Scanned	2025-03-03T10:02:32+00:00
URL	https://nightlies.apache.org/robots.txt
Domain IPs	2a01:4f9:4a:23ec::2, 95.217.87.228
Response IP	95.217.87.228
Found	Yes
Hash	056733415abb87cdc34e95f2fa29427f32fb81026763a9bbe4af8083ba0c021b
SimHash	2c39ced2dfd5
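
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a minimal sketch, assuming the hash is SHA-256 over the raw response body, a freshly fetched copy of the file could be compared against the recorded value like this. The function names are hypothetical and used only for illustration.

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "056733415abb87cdc34e95f2fa29427f32fb81026763a9bbe4af8083ba0c021b"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Report whether the body hashes to the recorded value.

    Assumption: the scan hashes the unmodified response body with SHA-256.
    If it normalizes the text first or uses another algorithm, this check
    will report a mismatch even for an unchanged file.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch would suggest either that the file changed since the scan or that the assumption about the algorithm is wrong.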

Groups

googlebot

Rule	Path
Disallow

googlebot-image

Rule	Path
Disallow

googlebot-mobile

Rule	Path
Disallow

msnbot

Rule	Path
Disallow

slurp

Rule	Path
Disallow

nutch

Rule	Path
Disallow

ia_archiver

Rule	Path
Disallow

baiduspider

Rule	Path
Disallow	/

yahoo-mmcrawler

Rule	Path
Disallow

psbot

Rule	Path
Disallow

yahoo-blogs/v3.9

Rule	Path
Disallow

*

Rule	Path
Disallow	/

Other Records

Field	Value
crawl-delay	2

Comments

Bot operators will need to contact root@apache.org in order to be explicitly allowed.
Bots that do not respect crawl-delay instructions are not permitted.
Default action: don't allow. Global crawl-delay is set to two seconds.
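
To make the effect of these groups concrete, here is a minimal sketch that rebuilds a robots.txt from the groups listed in this scan and evaluates it with Python's standard `urllib.robotparser`. The reconstructed text is an assumption: the real file may order or combine its groups differently. An empty `Disallow` path permits everything, while `Disallow: /` blocks the whole site. The helper names are hypothetical, and the crawl delay is kept as a plain constant because the scan reports it as a separate record rather than as part of a group.

```python
from urllib.robotparser import RobotFileParser

# Groups with an empty Disallow path in the scan (everything permitted).
ALLOWED_AGENTS = [
    "googlebot", "googlebot-image", "googlebot-mobile", "msnbot", "slurp",
    "nutch", "ia_archiver", "yahoo-mmcrawler", "psbot", "yahoo-blogs/v3.9",
]
# Groups with "Disallow: /" in the scan (nothing permitted).
BLOCKED_AGENTS = ["baiduspider", "*"]
# Value of the crawl-delay record listed under "Other Records".
CRAWL_DELAY_SECONDS = 2


def build_robots_txt() -> str:
    """Rebuild an approximation of the file from the scanned groups."""
    blocks = [f"User-agent: {agent}\nDisallow:" for agent in ALLOWED_AGENTS]
    blocks += [f"User-agent: {agent}\nDisallow: /" for agent in BLOCKED_AGENTS]
    return "\n\n".join(blocks) + "\n"


def may_fetch(agent: str, url: str = "https://nightlies.apache.org/") -> bool:
    """Check whether the reconstructed rules let this agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("googlebot", "baiduspider", "examplebot"):
        print(agent, may_fetch(agent))
```

Under these assumptions, `googlebot` is permitted, `baiduspider` is blocked by its own group, and an unlisted agent such as the hypothetical `examplebot` falls through to the `*` group and is blocked, which matches the "Default action: don't allow" comment. A permitted crawler would still be expected to wait `CRAWL_DELAY_SECONDS` between requests.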
