sekaipedia.org
robots.txt

Robots Exclusion Standard data for sekaipedia.org

Resource Scan

Scan Details

Site Domain sekaipedia.org
Base Domain sekaipedia.org
Scan Status Ok
Last Scan 2024-05-12T11:12:42+00:00
Next Scan 2024-06-11T11:12:42+00:00

Last Scan

Scanned 2024-05-12T11:12:42+00:00
URL https://sekaipedia.org/robots.txt
Redirect https://www.sekaipedia.org/robots.txt
Redirect Domain www.sekaipedia.org
Redirect Base sekaipedia.org
Domain IPs 44.230.85.241, 52.33.207.7
Redirect IPs 109.123.230.163, 2400:d320:2161:9775::1
Response IP 109.123.230.163
Found Yes
Hash 1567666cecfd9e228a37f2a93f39b4e2ab184a9d9baea91363f9c63cd49e4784
SimHash 7857ac4a3d58

Groups

*

Rule Path
Allow /w/api.php?action=mobileview&
Allow /w/load.php?
Disallow /w/
Disallow /geoip$
Disallow /rest_v1/
Disallow /wiki/Special%3A
Disallow /wiki/Spezial%3A
Disallow /wiki/Spesial%3A
Disallow /wiki/Special%3A
Disallow /wiki/Spezial%3A
Disallow /wiki/Spesial%3A
Disallow /wiki/Property%3A
Disallow /wiki/Property%3A
Disallow /wiki/property%3A
Disallow /wiki/Especial%3A
Disallow /wiki/Especial%3A
Disallow /wiki/especial%3A
Disallow /wiki/Special%3A*
Disallow /wiki/Spezial%3A*
Disallow /wiki/Spesial%3A*
Disallow /wiki/Special%3A*
Disallow /wiki/Spezial%3A*
Disallow /wiki/Spesial%3A*
Disallow /wiki/Property%3A*
Disallow /wiki/Property%3A*
Disallow /wiki/property%3A*
Disallow /wiki/Especial%3A*
Disallow /wiki/Especial%3A*
Disallow /wiki/especial%3A*
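
To see how a crawler might resolve the overlapping Allow and Disallow entries in the `*` group, here is a minimal sketch in Python. It assumes the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie) and the `*` and `$` wildcards some crawlers honour. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a subset of the listed rules is included, and percent-encoding normalisation is deliberately left out.

```python
import re

# A subset of the "*" group rules listed above (duplicates omitted).
RULES = [
    ("allow", "/w/api.php?action=mobileview&"),
    ("allow", "/w/load.php?"),
    ("disallow", "/w/"),
    ("disallow", "/geoip$"),
    ("disallow", "/rest_v1/"),
    ("disallow", "/wiki/Special%3A"),
    ("disallow", "/wiki/Special%3A*"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/w/load.php?modules=site")` is true because the Allow pattern is longer than `Disallow /w/`, while `is_allowed("/w/index.php?title=Foo")` is false. A crawler that does not implement `$` would treat `/geoip$` as a literal path instead.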

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2.5

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5
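
The crawl-delay records for mj12bot, yandexbot, and bingbot could be applied by a polite client roughly as follows. This is a sketch only: Crawl-delay is a non-standard field, the values are interpreted here as seconds between requests, and the names `CRAWL_DELAYS`, `crawl_delay_for`, and `Throttle` are hypothetical. Matching a group by substring in the User-Agent string is also an assumption.

```python
import time

# Crawl-delay values copied from the groups above, assumed to be seconds.
CRAWL_DELAYS = {"mj12bot": 10.0, "yandexbot": 2.5, "bingbot": 5.0}


def crawl_delay_for(user_agent: str, default: float = 0.0) -> float:
    """Return the delay of the first group token found in the User-Agent."""
    ua = user_agent.lower()
    for token, delay in CRAWL_DELAYS.items():
        if token in ua:
            return delay
    return default


class Throttle:
    """Space out calls to wait() by at least `delay` seconds."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            # Too early: sleep until the next permitted request time.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.delay
```

A client would call `Throttle(crawl_delay_for(ua)).wait()` before each request. The clock and sleep functions are injectable so the behaviour can be checked without real waiting.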

Other Records

Field Value
sitemap https://www.sekaipedia.org/sitemap.xml

Comments

  • robots.txt for Miraheze
  • Throttle access to certain pages
  • Pattern matching is not officially supported by the robots.txt spec, but some crawlers, like Googlebot, support it
  • Block SemrushBot
  • Block AhrefsBot
  • Block Bytespider
  • Block PetalBot
  • Block DotBot
  • Block MegaIndex
  • Block serpstatbot
  • Block Barkrowler
  • Block SeekportBot
  • Keep Crawl-Delay rules at the bottom
  • Bots that don't understand Crawl-Delay might break when encountering it
  • See https://github.com/otwcode/otwarchive/pull/4411#discussion_r1044351129
  • Throttle MJ12Bot
  • Throttle YandexBot
  • TODO: Crawl-delay is not respected since 2018
  • Throttle BingBot
  • Dynamic sitemap url