aman.awiki.org
robots.txt

Robots Exclusion Standard data for aman.awiki.org

Resource Scan

Scan Details

Site Domain aman.awiki.org
Base Domain awiki.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-10-13T17:27:01+00:00
Next Scan 2026-01-11T17:27:01+00:00
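
The failure reason above is a client error, that is, an HTTP 4xx status. The sketch below shows roughly how a crawler that follows RFC 9309 might interpret the status of a robots.txt fetch: a 4xx response means the file is "unavailable" and the crawler may access any resource, while a 5xx response means the file is "unreachable" and the crawler must assume a complete disallow. The function name and the returned labels are hypothetical, chosen only for illustration. The scan does not say which 4xx code was returned, so 404 is used purely as an example.

```python
def robots_policy_for_status(status_code):
    """Map the HTTP status of a robots.txt fetch to a crawl policy label.

    The labels are illustrative names, not part of any standard API.
    """
    if 200 <= status_code < 300:
        # Successful fetch: parse the body and apply its rules.
        return "parse_rules"
    if 400 <= status_code < 500:
        # "Unavailable" in RFC 9309: the crawler may access any resource.
        return "allow_all"
    if 500 <= status_code < 600:
        # "Unreachable" in RFC 9309: assume complete disallow.
        return "disallow_all"
    # Redirects and other codes are assumed to be handled by the HTTP client.
    return "other"


if __name__ == "__main__":
    print(robots_policy_for_status(404))
    print(robots_policy_for_status(503))
```

Under this reading, a crawler that received the client error would not be bound by the rules recorded in the last successful scan below, although many crawlers keep using a cached copy for a while.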

Last Successful Scan

Scanned 2024-08-27T14:39:57+00:00
URL https://aman.awiki.org/robots.txt
Domain IPs 109.123.230.163, 2400:d320:2161:9775::1
Response IP 109.123.230.163
Found Yes
Hash c0aa00e2d5f20f9c4005787ca70f0d01d4468eb50f9539f8ab94828100e67bc6
SimHash 3247c85ac5d0

Groups

*

Rule Path
Allow /w/api.php?action=mobileview&
Allow /w/load.php?
Disallow /w/
Disallow /geoip$
Disallow /rest_v1/
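
The Allow and Disallow rules in the * group overlap: /w/load.php? falls under both an Allow rule and the broader Disallow /w/. A minimal sketch of how such overlaps are commonly resolved, assuming the longest-match rule from RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), is shown below. The helper names pattern_to_regex and is_allowed are hypothetical, and the sketch skips the percent-encoding normalization that a full implementation would need.

```python
import re

# Rules copied from the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("allow", "/w/api.php?action=mobileview&"),
    ("allow", "/w/load.php?"),
    ("disallow", "/w/"),
    ("disallow", "/geoip$"),
    ("disallow", "/rest_v1/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text; "*" in a pattern matches any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # No matching rule means the path is allowed.
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            is_allow = directive == "allow"
            # A longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and is_allow):
                best_len = length
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/w/index.php?title=Main_Page", "/w/load.php?modules=site",
              "/geoip", "/geoip/lookup", "/wiki/Main_Page"]:
        print(p, is_allowed(p))
```

The trailing $ on /geoip$ anchors the pattern to the end of the path, so /geoip is blocked while /geoip/lookup is not.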

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2.5

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

Other Records

Field Value
sitemap https://aman.awiki.org/sitemap.xml
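
The crawl-delay records above give each throttled bot a minimum pause, in seconds, between requests. Crawl-delay is a non-standard extension and not every crawler honors it. The sketch below shows one way a cooperative crawler might apply the values, assuming they are read as seconds and parsed as floats (some parsers accept only whole numbers, which would drop the 2.5 value). The names CRAWL_DELAYS, delay_for and fetch_offsets, and the example user-agent strings, are hypothetical.

```python
# Crawl-delay values (seconds) copied from the records above.
CRAWL_DELAYS = {"mj12bot": 10.0, "yandexbot": 2.5, "bingbot": 20.0}


def delay_for(user_agent, delays=CRAWL_DELAYS):
    """Return the delay for the first group token found in the user agent.

    Matching is a case-insensitive substring test; bots without a
    crawl-delay record get 0.0.
    """
    ua = user_agent.lower()
    for token, seconds in delays.items():
        if token in ua:
            return seconds
    return 0.0


def fetch_offsets(user_agent, request_count):
    """Earliest start offsets (seconds) for a run of consecutive requests."""
    delay = delay_for(user_agent)
    return [i * delay for i in range(request_count)]


if __name__ == "__main__":
    print(delay_for("bingbot/2.0"))
    print(fetch_offsets("YandexBot/3.0", 3))
```

A real crawler would sleep until each offset has elapsed, for example with time.sleep, rather than precomputing a schedule.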

Comments

  • robots.txt for Miraheze
  • Throttle access to certain pages
  • Do not include special pages and other pages where indexing is undesirable if they are likely to be linked to; use noindex instead.
  • That's because Google can still index pages in here without crawling them if the pages are linked to
  • See https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • Block SemrushBot
  • Block AhrefsBot
  • Block Bytespider
  • Block PetalBot
  • Block DotBot
  • Block MegaIndex
  • Block serpstatbot
  • Block Barkrowler
  • Block SeekportBot
  • Keep Crawl-Delay rules at the bottom
  • Bots that don't understand Crawl-Delay might break when encountering it
  • See https://github.com/otwcode/otwarchive/pull/4411#discussion_r1044351129 (English) and https://webtan.impress.co.jp/e/2022/11/04/43611 (Japanese)
  • Throttle MJ12Bot
  • Throttle YandexBot
  • TODO: Crawl-delay is not respected since 2018
  • Throttle BingBot
  • Dynamic sitemap url