simplementegenial.cc
robots.txt

Robots Exclusion Standard data for simplementegenial.cc

Resource Scan

Scan Details

Site Domain simplementegenial.cc
Base Domain simplementegenial.cc
Scan Status Ok
Last Scan 2024-11-11T14:55:52+00:00
Next Scan 2024-11-18T14:55:52+00:00

Last Scan

Scanned 2024-11-11T14:55:52+00:00
URL https://simplementegenial.cc/robots.txt
Domain IPs 104.21.79.226, 172.67.149.73, 2606:4700:3030::6815:4fe2, 2606:4700:3033::ac43:9549
Response IP 172.67.149.73
Found Yes
Hash 392eabe55bf7ef2c9768648c21e6c8b08cbcf6a24c6551ba1202ca278c8210ce
SimHash 424182520334

Groups

ia_archiver-web.archive.org

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin
Disallow /wp-
Disallow /trackback
Disallow */trackback
Disallow */*/trackback
Disallow /*?*
Disallow /xmlrpc.php
Allow */uploads
Allow *.js
Allow *.css

yandex

Rule Path
Disallow /cgi-bin
Disallow /wp-
Disallow /trackback
Disallow */trackback
Disallow */*/trackback
Disallow /*?*
Disallow /xmlrpc.php
Allow */uploads
Allow *.js
Allow *.css
Allow */feed/zen/
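
How the overlapping Allow and Disallow rules in a group interact is easier to see with a small evaluator. The following is a minimal sketch, not the scanner's or any crawler's actual implementation. It assumes the longest-match precedence described in RFC 9309 (the matching rule with the longest pattern wins, and Allow wins a tie) and uses the rules of the `*` group above. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and percent-encoding normalization is left out.

```python
import re

# Rules copied from the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("disallow", "/cgi-bin"),
    ("disallow", "/wp-"),
    ("disallow", "/trackback"),
    ("disallow", "*/trackback"),
    ("disallow", "*/*/trackback"),
    ("disallow", "/*?*"),
    ("disallow", "/xmlrpc.php"),
    ("allow", "*/uploads"),
    ("allow", "*.js"),
    ("allow", "*.css"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any run of characters and a trailing `$` anchors the end.
    Everything else is literal; matching starts at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under `rules`.

    Assumed precedence: the longest matching pattern wins; Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for path in ["/wp-admin/", "/wp-content/uploads/photo.jpg",
                 "/?s=query", "/blog/post/"]:
        print(path, is_allowed(path))
```

Under these assumptions, `/wp-content/uploads/photo.jpg` is fetchable because `*/uploads` is a longer pattern than `/wp-`, while `/wp-admin/` and any URL with a query string stay blocked.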

Other Records

Field Value
sitemap https://simplementegenial.cc/sitemap_index.xml

Comments

  • Disallow: */*/feed/*/
  • Disallow: */feed
  • Disallow: /tag
  • Disallow: */*/feed/*/
  • Disallow: */feed
  • Disallow: /tag

Warnings

  • `host` is not a known field.
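
A warning of this kind can be produced by checking each field name in the file against a list of recognized fields. The sketch below is only an illustration of that idea, not the scanner's code. The set `KNOWN_FIELDS`, the function `unknown_field_warnings`, and the sample text (including its `Host` value) are all assumptions, since the raw robots.txt is not reproduced in this report.

```python
# Assumed set of recognized field names; a real scanner may accept more.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_field_warnings(text):
    """Return one warning per distinct unrecognized field name in `text`."""
    warnings = []
    seen = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue  # blank or malformed line, nothing to check
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.add(field)
            warnings.append(f"`{field}` is not a known field.")
    return warnings


# Hypothetical sample input, not the site's actual file.
SAMPLE = """\
User-agent: *
Disallow: /cgi-bin
# Disallow: /tag
Host: example.com
Sitemap: https://example.com/sitemap_index.xml
"""

if __name__ == "__main__":
    for warning in unknown_field_warnings(SAMPLE):
        print(warning)
```

Commented-out lines such as `# Disallow: /tag` are skipped here, which matches how the report lists them under Comments and not as rules.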