portalsofphereon.com
robots.txt

Robots Exclusion Standard data for portalsofphereon.com

Resource Scan

Scan Details

Site Domain portalsofphereon.com
Base Domain portalsofphereon.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-11-16T00:32:06+00:00
Next Scan 2025-11-30T00:32:06+00:00

Last Successful Scan

Scanned 2025-10-23T03:09:36+00:00
URL https://www.portalsofphereon.com/robots.txt
Domain IPs 104.21.70.31, 172.67.218.208, 2606:4700:3031::ac43:dad0, 2606:4700:3037::6815:461f
Response IP 104.21.70.31
Found Yes
Hash 6e75715ac9f255d6b4782fc240b7fc5a2fd175b06462b4f87a66d518d4df10aa
SimHash 3007c85ac5c0

Groups

*

Rule Path
Allow /w/api.php?action=mobileview&
Allow /w/load.php?
Disallow /w/
Disallow /geoip$
Disallow /rest_v1/
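
The rules in the `*` group overlap: `/w/` is disallowed, but two longer patterns under `/w/` are allowed, and `/geoip$` ends with an end-of-path anchor. The report does not say which matching algorithm a given crawler applies. The sketch below assumes the longest-match rule described in RFC 9309 (the most specific pattern wins, Allow wins ties, no match means allowed) and measures specificity by pattern length in characters. The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and not part of the scan.

```python
import re

# Rules copied from the "*" group listed above.
RULES = [
    ("allow", "/w/api.php?action=mobileview&"),
    ("allow", "/w/load.php?"),
    ("disallow", "/w/"),
    ("disallow", "/geoip$"),
    ("disallow", "/rest_v1/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under the assumed longest-match rule."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            # A longer pattern is more specific; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for path in [
        "/w/load.php?modules=site",
        "/w/index.php?title=Main_Page",
        "/geoip",
        "/geoip/lookup",
        "/wiki/Main_Page",
    ]:
        print(path, is_allowed(path))
```

Under these assumptions, `/w/load.php?modules=site` is allowed because the 12-character Allow pattern outranks the 3-character `Disallow /w/`, while `/geoip` is blocked but `/geoip/lookup` is not, since `$` restricts the rule to the exact path.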

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2.5

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
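
The three throttled groups above carry only a `crawl-delay` value, one of which (2.5) is not an integer. `Crawl-delay` is not part of RFC 9309, so how a crawler interprets it is up to that crawler. As a rough sketch, assuming the value means "seconds to wait between consecutive requests" and that a group applies when its token appears in the User-Agent string, a polite client could look up its delay as follows. `CRAWL_DELAYS`, `crawl_delay_for`, and `min_fetch_seconds` are hypothetical names used only for this illustration.

```python
# Crawl-delay values as listed in the scan, keyed by lowercase group token.
# Stored as floats because the yandexbot value is fractional.
CRAWL_DELAYS = {"mj12bot": 10.0, "yandexbot": 2.5, "bingbot": 20.0}


def crawl_delay_for(user_agent, default=0.0):
    """Return the delay (assumed to be seconds) for the first token found
    in the User-Agent string, or `default` if no throttled group applies."""
    ua = user_agent.lower()
    for token, delay in CRAWL_DELAYS.items():
        if token in ua:
            return delay
    return default


def min_fetch_seconds(user_agent, n_requests):
    """Lower bound on the time needed for n sequential requests,
    waiting the crawl delay between each pair of requests."""
    return crawl_delay_for(user_agent) * max(n_requests - 1, 0)


if __name__ == "__main__":
    print(crawl_delay_for("Mozilla/5.0 (compatible; bingbot/2.0)"))
    print(min_fetch_seconds("MJ12bot/v1.4.8", 100))
```

With these values, 100 sequential requests from an agent matching `mj12bot` would take at least 990 seconds, which gives a sense of how strongly each delay limits crawl throughput.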

Other Records

Field Value
sitemap https://www.portalsofphereon.com/sitemap.xml

Comments

  • robots.txt for Miraheze
  • Throttle access to certain pages
  • Do not include special pages and other pages where indexing is undesirable if they are likely to be linked to; use noindex instead.
  • That's because Google can still index pages in here without crawling them if the pages are linked to
  • See https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • Block SemrushBot
  • Block AhrefsBot
  • Block Bytespider
  • Block PetalBot
  • Block DotBot
  • Block MegaIndex
  • Block serpstatbot
  • Block Barkrowler
  • Block SeekportBot
  • Keep Crawl-Delay rules at the bottom
  • Bots that don't understand Crawl-Delay might break when encountering it
  • See https://github.com/otwcode/otwarchive/pull/4411#discussion_r1044351129 (English) and https://webtan.impress.co.jp/e/2022/11/04/43611 (Japanese)
  • Throttle MJ12Bot
  • Throttle YandexBot
  • TODO: Crawl-delay is not respected since 2018
  • Throttle BingBot
  • ----------------------------------------------------------
  • Dynamic sitemap url
  • ----------------------------------------------------------