grad-busovaca.com
robots.txt

Robots Exclusion Standard data for grad-busovaca.com

Resource Scan

Scan Details

Site Domain grad-busovaca.com
Base Domain grad-busovaca.com
Scan Status Ok
Last Scan 2024-11-02T14:14:24+00:00
Next Scan 2024-11-09T14:14:24+00:00

Last Scan

Scanned 2024-11-02T14:14:24+00:00
URL https://grad-busovaca.com/robots.txt
Domain IPs 159.69.104.135, 2a01:4f8:c0c:8ce::2
Response IP 159.69.104.135
Found Yes
Hash 5cdf7511c64569d559fe9841c97f15bd719cc40e781e62da33f4e9f4d61bd320
SimHash e822d092adc3

Groups

facebookexternalhit

Rule Path
Disallow (empty path; nothing is blocked)

*

Rule Path
Disallow (empty path; nothing is blocked)

*

Rule Path
Disallow /wp-admin/
Disallow /wp-includes/
Allow /wp-admin/admin-ajax.php
Disallow /?s=
Disallow /*?utm_*
Disallow /*?replytocom
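
To illustrate how a crawler might evaluate the rules in the "*" group, here is a minimal sketch of longest-match evaluation with `*` wildcards, roughly as described in RFC 9309. The rule list is copied from the group above; the function names and the sample paths are hypothetical, and real crawlers may differ in details such as percent-encoding and tie handling.

```python
import re

# Rules copied from the "*" group above, as (kind, path) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/?s="),
    ("disallow", "/*?utm_*"),
    ("disallow", "/*?replytocom"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' pins the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (including any query string) may be crawled."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:  # an empty Disallow matches nothing
            continue
        # re.match anchors at the start, so patterns match as path prefixes.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # The longest pattern wins; on a tie, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


# Hypothetical sample paths, not taken from the scan.
for sample in ["/wp-admin/admin-ajax.php", "/wp-admin/options.php",
               "/?s=test", "/news/?utm_source=fb", "/"]:
    print(sample, is_allowed(sample))
```

The longest-match rule is what lets `Allow /wp-admin/admin-ajax.php` override the shorter `Disallow /wp-admin/`. Parsers that apply the first matching rule instead would block that path, because the Disallow line comes first.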

Other Records

Field Value
sitemap Sitemap: https://grad-busovaca.com/sitemap_index.xml

Comments

  • Allow Facebook Scraper
  • Allow All Bots Full Access (Optional)
  • Block wp-admin and wp-includes (common practice to restrict non-public areas)
  • Allow access to Ajax requests for functionality
  • Block search result pages to prevent duplicate content issues
  • Block specific URL parameters if needed (adjust as per your site)
  • Sitemap link to help bots discover pages
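
The comments and groups listed in this report suggest a file laid out roughly as in the sketch below. The `ROBOTS_TXT` string is a hypothetical reconstruction (the real file may differ in layout and wording), and `parse_groups` is an illustrative helper, not part of any library. It shows how a parser might collect rules per user agent, combining the two "*" groups into one as RFC 9309 requires for repeated user agents.

```python
from collections import defaultdict

# Hypothetical reconstruction from the groups and comments in this report.
ROBOTS_TXT = """\
# Allow Facebook Scraper
User-agent: facebookexternalhit
Disallow:

# Allow All Bots Full Access (Optional)
User-agent: *
Disallow:

# Block wp-admin and wp-includes
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-admin/admin-ajax.php
Disallow: /?s=
Disallow: /*?utm_*
Disallow: /*?replytocom

Sitemap: https://grad-busovaca.com/sitemap_index.xml
"""


def parse_groups(text):
    """Return ({user_agent: [(kind, path), ...]}, [sitemap_urls])."""
    groups = defaultdict(list)
    sitemaps = []
    agents = []       # user agents named by the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                # Repeated groups for one agent accumulate in one list.
                groups[agent].append((field, value))
        elif field == "sitemap":
            sitemaps.append(value)
    return dict(groups), sitemaps


groups, sitemaps = parse_groups(ROBOTS_TXT)
print(groups["*"])
print(sitemaps)
```

Under this reading, the empty `Disallow` in the first "*" group adds no restriction, so the effective policy for generic crawlers is the six path rules from the second "*" group.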