lwn.net
robots.txt

Robots Exclusion Standard data for lwn.net

Resource Scan

Scan Details

Site Domain lwn.net
Base Domain lwn.net
Scan Status Ok
Last Scan 2024-09-11T07:39:21+00:00
Next Scan 2024-10-11T07:39:21+00:00

Last Scan

Scanned 2024-09-11T07:39:21+00:00
URL https://lwn.net/robots.txt
Domain IPs 173.255.236.65, 2600:3c03::f03c:93ff:febd:80f5
Response IP 173.255.236.65
Found Yes
Hash 98a68d147657b22abf33c89d3d1916fc55d5182dc1dd9ec8e84e32422373d297
SimHash 6f1c4840cb16
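
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The scan does not state which algorithm it uses or exactly which bytes it hashes, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_reported` are hypothetical.

```python
import hashlib

# Hash value as reported by the scan above.
REPORTED_HASH = "98a68d147657b22abf33c89d3d1916fc55d5182dc1dd9ec8e84e32422373d297"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes) -> bool:
    """Compare a fetched robots.txt body against the reported hash.

    Assumption: the scanner hashes the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would not necessarily mean the file changed; it could also mean the assumption about the algorithm or the hashed bytes is wrong.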

Groups

*

Rule Path
Disallow /Search
Disallow /ml
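
As a rough illustration of how a crawler might apply the two rules in the `*` group, the sketch below feeds a reconstructed fragment to Python's standard `urllib.robotparser`. The fragment's exact layout is an assumption (the scan shows parsed rules, not the raw file), and `examplebot` is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the "*" group; the raw file layout is not shown in the scan.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /Search
Disallow: /ml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# "examplebot" is a hypothetical agent with no group of its own,
# so it falls back to the "*" group.
for url in (
    "https://lwn.net/Search/",
    "https://lwn.net/ml/",
    "https://lwn.net/Articles/",
):
    verdict = "allowed" if parser.can_fetch("examplebot", url) else "disallowed"
    print(url, verdict)
```

Rules are matched as path prefixes, so `Disallow /ml` also covers any path that merely begins with `/ml`.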

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

scoutjet

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
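
The three groups above carry only a `crawl-delay` record and no path rules. A minimal sketch of how a client could read those delays with `urllib.robotparser` (its `crawl_delay` method is available from Python 3.6) follows. The file text is an assumed reconstruction from the parsed groups, and `delay_for` and `examplebot` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction from the groups listed in the scan.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /Search
Disallow: /ml

User-agent: Slurp
Crawl-delay: 10

User-agent: ScoutJet
Crawl-delay: 10

User-agent: AhrefsBot
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def delay_for(agent: str, default: int = 0) -> int:
    """Return the crawl delay in seconds for agent, or default if none is set."""
    delay = parser.crawl_delay(agent)
    return default if delay is None else delay


print(delay_for("Slurp"))       # group-specific delay
print(delay_for("AhrefsBot"))   # group-specific delay
print(delay_for("examplebot"))  # no delay in the "*" group, so the default applies
```

Because an agent that matches its own group does not also inherit the `*` group, this parser treats `/Search` as allowed for `Slurp`, consistent with the "All paths allowed" note in the scan.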

ccbot
mj12bot
mail.ru_bot
mail.ru_bot/2.0
megaindex
megaindex.ru
trendkite-akashic-crawler
jooblebot
httrack
yacybot
petalbot
gptbot

Rule Path
Disallow /
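
The group above lists several user agents that share a single `Disallow /` rule. The sketch below rebuilds such a group, assuming it is written as consecutive `User-agent` lines followed by one rule, and checks a few agents with `urllib.robotparser`. `examplebot` is a hypothetical agent that is not on the list.

```python
from urllib.robotparser import RobotFileParser

# Agents taken from the group listed in the scan.
BLOCKED_AGENTS = [
    "ccbot", "mj12bot", "mail.ru_bot", "mail.ru_bot/2.0", "megaindex",
    "megaindex.ru", "trendkite-akashic-crawler", "jooblebot", "httrack",
    "yacybot", "petalbot", "gptbot",
]

# Assumed layout: one group with many User-agent lines and a single rule,
# followed by the "*" group from the same scan.
lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS]
lines += ["Disallow: /", "", "User-agent: *", "Disallow: /Search", "Disallow: /ml"]

parser = RobotFileParser()
parser.parse(lines)

# Matching is case-insensitive, and a "/version" suffix on the requesting agent is ignored.
for agent in ("GPTBot/1.0", "CCBot", "examplebot"):
    allowed = parser.can_fetch(agent, "https://lwn.net/Articles/")
    print(agent, "allowed" if allowed else "disallowed")
```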

Comments

  • Throttle this bozo search engine
  • ...and another
  • ...and another (7/2013)
  • Robota non grata