Robots Exclusion Standard data for onretrieval.com

Resource Scan

Scan Details

Site Domain onretrieval.com
Base Domain onretrieval.com
Scan Status Ok
Last Scan 2025-11-19T12:44:51+00:00
Next Scan 2025-12-19T12:44:51+00:00

Last Scan

Scanned 2025-11-19T12:44:51+00:00
URL https://onretrieval.com/robots.txt
Domain IPs 104.26.10.75, 104.26.11.75, 172.67.70.177, 2606:4700:20::681a:a4b, 2606:4700:20::681a:b4b, 2606:4700:20::ac43:46b1
Response IP 104.26.10.75
Found Yes
Hash d7301f158d4e1646fd3f2d61c99c2cfd492b61fa1b4df0033d720823e03e3267
SimHash 351e57c1cef2
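
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so the sketch below assumes SHA-256 over the raw response bytes; the function name `fingerprint` is hypothetical. It shows how a change in robots.txt between two scans could be detected by comparing digests. The SimHash field uses a different, similarity-preserving scheme that is not reproduced here.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return fingerprint(body) != previous_digest
```

A scanner could store the digest from one scan and call `has_changed` with the body fetched on the next one; identical bytes always produce an identical digest.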

Groups

*

Rule Path
Disallow /wp-admin
Allow /wp-admin/admin-ajax.php
Disallow /*feed/
Disallow /page/
Disallow /tag/
Disallow /category/
Disallow /*comments/
Disallow /*trackback/
Disallow /*attachment/
Disallow /*?s=
Disallow /?attachment_id*
Disallow /wp-content/plugins/link-juice-optimizer/public/js/link-juice-optimizer.js
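
The `*` group mixes plain prefixes, `*` wildcards, and one Allow rule that overrides a shorter Disallow. The following is a minimal sketch of how such rules might be evaluated, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and pattern length is approximated by character count.

```python
import re

# Rules of the `*` group, copied from the table above.
RULES = [
    ("disallow", "/wp-admin"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*feed/"),
    ("disallow", "/page/"),
    ("disallow", "/tag/"),
    ("disallow", "/category/"),
    ("disallow", "/*comments/"),
    ("disallow", "/*trackback/"),
    ("disallow", "/*attachment/"),
    ("disallow", "/*?s="),
    ("disallow", "/?attachment_id*"),
    ("disallow", "/wp-content/plugins/link-juice-optimizer/public/js/link-juice-optimizer.js"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt requires.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/wp-admin/admin-ajax.php")` is true because the Allow pattern is longer than `/wp-admin`, while `is_allowed("/wp-admin/options.php")` and `is_allowed("/blog/feed/")` are false. Parsers that apply the first matching rule instead of the longest one would treat the admin-ajax.php path differently, since the Disallow line comes first.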

orthogaffe

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /
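
Each named group above carries a single `Disallow /`, so the only question for a crawler is which group applies to it. The sketch below shows one way to make that choice, assuming the crawler compares the part of its User-Agent string before the first `/` against the group names, case-insensitively, and falls back to `*` otherwise. The helper names and the sample User-Agent strings are hypothetical; real parsers differ in how loosely they match.

```python
# Group names that carry "Disallow /", copied from the listing above.
BLOCKED_AGENTS = {
    "orthogaffe", "ubicrawler", "doc", "zao", "sitecheck.internetseer.com",
    "zealbot", "msiecrawler", "sitesnagger", "webstripper", "webcopier",
    "fetch", "offline explorer", "teleport", "teleportpro", "webzip",
    "linko", "httrack", "microsoft.url.control", "xenu", "larbin",
    "libwww", "zyborg", "download ninja", "wget", "grub-client",
}


def select_group(user_agent):
    """Return the name of the group that applies to `user_agent`.

    Assumption: the product token is the text before the first "/".
    """
    token = user_agent.split("/", 1)[0].strip().lower()
    return token if token in BLOCKED_AGENTS else "*"


def is_fully_blocked(user_agent):
    """True if the selected group disallows the whole site."""
    return select_group(user_agent) != "*"
```

For example, `select_group("Wget/1.21.3")` returns `"wget"`, so that client is excluded from the whole site, while an unlisted agent such as `"ExampleBot/1.0"` falls through to the `*` group and its path-level rules.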

Other Records

Field Value
sitemap https://onretrieval.com/page-sitemap.xml
sitemap https://onretrieval.com/post-sitemap.xml
sitemap https://onretrieval.com/category-sitemap.xml

Comments

  • Blocks
  • Sitemaps
  • Toxic bots
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.