polcompball.wikitide.org
robots.txt

Robots Exclusion Standard data for polcompball.wikitide.org

Resource Scan

Scan Details

Site Domain polcompball.wikitide.org
Base Domain wikitide.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-10-22T19:22:54+00:00
Next Scan 2025-11-21T19:22:54+00:00

Last Successful Scan

Scanned 2025-05-16T00:30:29+00:00
URL https://polcompball.wikitide.org/robots.txt
Domain IPs 2602:294:0:b13::110, 38.46.223.205
Response IP 38.46.223.205
Found Yes
Hash 0e5e8f41a755391e9e8c9b7053b10fbc63b49587db9021995dc51af4f5803ddf
SimHash 3207e85a85d0

Groups

*

Rule Path
Allow /w/api.php?action=mobileview&
Allow /w/load.php?
Disallow /w/
Disallow /geoip$
Disallow /rest_v1/
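
How the Allow and Disallow rules in the `*` group combine is easier to see with a worked example. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, Allow wins ties, `*` is a wildcard and a trailing `$` anchors the end of the path). The names `STAR_GROUP`, `pattern_to_regex` and `is_allowed` are hypothetical and exist only for illustration; a given crawler may evaluate the rules differently.

```python
import re

# Rules from the "*" group listed above, as (directive, path pattern) pairs.
STAR_GROUP = [
    ("allow", "/w/api.php?action=mobileview&"),
    ("allow", "/w/load.php?"),
    ("disallow", "/w/"),
    ("disallow", "/geoip$"),
    ("disallow", "/rest_v1/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape every character, then restore "*" as "any run of characters".
    regex = re.escape(body).replace(r"\*", ".*")
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` (path plus query string) may be crawled."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/wiki/Main_Page", "/w/index.php?title=Main_Page",
              "/w/load.php?modules=site", "/geoip", "/geoip/x"]:
        print(p, is_allowed(STAR_GROUP, p))
```

Under these assumptions, `/w/load.php?...` stays crawlable because its Allow pattern is longer than `/w/`, while `/geoip$` blocks only the exact path `/geoip`.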

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2.5

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
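
The per-agent groups above can be summarised as a lookup from a crawler's User-Agent string to the policy that applies to it. This is a rough sketch, not how any particular crawler works: it assumes a case-insensitive substring match on the group name, and that a crawler matching a named group uses only that group rather than `*`. `BLOCKED`, `CRAWL_DELAYS` and `policy_for` are hypothetical names. Crawl-delay is a non-standard field, so honouring it is up to each crawler.

```python
# Agents whose group is a blanket "Disallow: /" in the listing above.
BLOCKED = {
    "semrushbot", "ahrefsbot", "bytespider", "petalbot", "dotbot",
    "megaindex", "serpstatbot", "barkrowler", "seekportbot",
}

# Crawl-delay values (in seconds) from the throttled groups above.
CRAWL_DELAYS = {"mj12bot": 10.0, "yandexbot": 2.5, "bingbot": 20.0}


def policy_for(user_agent):
    """Return the group and policy that would apply to `user_agent`."""
    ua = user_agent.lower()
    for token in sorted(BLOCKED):
        if token in ua:
            return {"group": token, "blocked": True, "crawl_delay": None}
    for token, delay in CRAWL_DELAYS.items():
        if token in ua:
            # These groups define no path rules, only a delay.
            return {"group": token, "blocked": False, "crawl_delay": delay}
    # Anything else falls back to the "*" group and its path rules.
    return {"group": "*", "blocked": False, "crawl_delay": None}


if __name__ == "__main__":
    for ua in ["SemrushBot", "Mozilla/5.0 (compatible; bingbot/2.0)", "Googlebot"]:
        print(ua, policy_for(ua))
```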

Other Records

Field Value
sitemap https://polcompball.wikitide.org/sitemap.xml

Comments

  • robots.txt for Miraheze
  • Throttle access to certain pages
  • Do not include special pages and other pages where indexing is undesirable if they are likely to be linked to; use noindex instead.
  • That's because Google can still index pages in here without crawling them if the pages are linked to
  • See https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • Block SemrushBot
  • Block AhrefsBot
  • Block Bytespider
  • Block PetalBot
  • Block DotBot
  • Block MegaIndex
  • Block serpstatbot
  • Block Barkrowler
  • Block SeekportBot
  • Keep Crawl-Delay rules at the bottom
  • Bots that don't understand Crawl-Delay might break when encountering it
  • See https://github.com/otwcode/otwarchive/pull/4411#discussion_r1044351129 (English) and https://webtan.impress.co.jp/e/2022/11/04/43611 (Japanese)
  • Throttle MJ12Bot
  • Throttle YandexBot
  • TODO: Crawl-delay is not respected since 2018
  • Throttle BingBot
  • ----------------------------------------------------------
  • Dynamic sitemap url
  • ----------------------------------------------------------