isogg.org
robots.txt

Robots Exclusion Standard data for isogg.org

Resource Scan

Scan Details

Site Domain isogg.org
Base Domain isogg.org
Scan Status Ok
Last Scan 2025-04-26T14:34:33+00:00
Next Scan 2025-05-26T14:34:33+00:00

Last Scan

Scanned 2025-04-26T14:34:33+00:00
URL https://isogg.org/robots.txt
Domain IPs 104.21.18.104, 172.67.181.149, 2606:4700:3035::6815:1268, 2606:4700:3037::ac43:b595
Response IP 104.21.18.104
Found Yes
Hash 60ef2af08370a7278e52f370f9fde66d4532a03da7c893003485a6fffcbeef1e
SimHash ba0273336752

Groups

*

Rule Path
Disallow /w/
Allow /w/skins/
Allow /w/images/
Allow /w/cache/
Allow /w/resources/
Allow /w/load.php*
Allow /tree/
Allow /images/
Allow /icons/
Allow /icon/
Allow /resources/
Allow /newsletters/
Allow /scripts/
Allow /styles/
Disallow /wiki/Special%3ASearch
Disallow /wiki/Special%3ARandom
Disallow /wiki/Special%3A*
Disallow /wiki/Widget%3ACalFrame
Disallow /pub/
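
The "*" group mixes a broad Disallow on /w/ with narrower Allow rules beneath it, so the result for a given path depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie); crawlers that apply rules in file order would behave differently. The names RULES, pattern_to_regex and is_allowed are illustrative and not part of the scan data, and the comparison is done on the literal path without percent-encoding normalization, which is a simplification.

```python
import re

# Rules copied from the "*" group above, in file order.
RULES = [
    ("disallow", "/w/"),
    ("allow", "/w/skins/"),
    ("allow", "/w/images/"),
    ("allow", "/w/cache/"),
    ("allow", "/w/resources/"),
    ("allow", "/w/load.php*"),
    ("allow", "/tree/"),
    ("allow", "/images/"),
    ("allow", "/icons/"),
    ("allow", "/icon/"),
    ("allow", "/resources/"),
    ("allow", "/newsletters/"),
    ("allow", "/scripts/"),
    ("allow", "/styles/"),
    ("disallow", "/wiki/Special%3ASearch"),
    ("disallow", "/wiki/Special%3ARandom"),
    ("disallow", "/wiki/Special%3A*"),
    ("disallow", "/wiki/Widget%3ACalFrame"),
    ("disallow", "/pub/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally as a path prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow and not best_allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for sample in ["/w/index.php", "/w/skins/common.css", "/wiki/Main_Page"]:
        print(sample, is_allowed(sample))
```

Under these assumptions, /w/index.php is blocked by the 3-character /w/ rule, while /w/skins/common.css is permitted because the longer /w/skins/ Allow rule is more specific.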

youdaobot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

linkscrawler

Rule Path
Disallow /

easouspider

Rule Path
Disallow /
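
Each named group above carries a single "Disallow /" rule, so the only question for a crawler is which group applies to it. A minimal sketch of that selection follows, assuming a case-insensitive exact comparison of the crawler's name against the group names and a fallback to the "*" group; real crawlers may match on a product token substring instead. NAMED_GROUPS, select_group and is_fully_blocked are hypothetical names introduced for illustration.

```python
# Group names copied from the report; each one has the single rule "Disallow /".
NAMED_GROUPS = [
    "youdaobot",
    "sogou spider",
    "yisouspider",
    "linkscrawler",
    "easouspider",
]


def select_group(crawler_name, named_groups=NAMED_GROUPS):
    """Return the name of the group that applies to `crawler_name`.

    Comparison is case-insensitive; crawlers without a named group
    fall back to the wildcard group "*".
    """
    name = crawler_name.strip().lower()
    for group in named_groups:
        if name == group:
            return group
    return "*"


def is_fully_blocked(crawler_name):
    """True if the crawler lands in one of the named 'Disallow /' groups."""
    return select_group(crawler_name) != "*"


if __name__ == "__main__":
    for bot in ["YoudaoBot", "Sogou spider", "Googlebot"]:
        print(bot, select_group(bot), is_fully_blocked(bot))
```

A crawler that does not match any named group is governed by the "*" group and its path rules instead.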

Comments

  • www.robotstxt.org/
  • http://code.google.com/web/controlcrawlindex/

Warnings

  • 2 invalid lines.