sportscourant.com
robots.txt

Robots Exclusion Standard data for sportscourant.com

Resource Scan

Scan Details

Site Domain sportscourant.com
Base Domain sportscourant.com
Scan Status Ok
Last Scan 2025-03-09T17:41:12+00:00
Next Scan 2025-03-16T17:41:12+00:00

Last Scan

Scanned 2025-03-09T17:41:12+00:00
URL https://sportscourant.com/robots.txt
Domain IPs 2a02:4780:84:9d82:e40e:4516:4191:2cb2, 77.37.115.91
Response IP 77.37.66.46
Found Yes
Hash ae1f59f1034444512a83198bdfc13f282f8507b76528d340b87c28c37e57cbad
SimHash 4636c94166b3

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /xmlrpc.php
Disallow /trackback/
Disallow /feed/
Disallow /comments/
Disallow /author/
Disallow /search/
Disallow /*?
Disallow /?rest_route=
Allow /wp-content/uploads/
Allow /wp-admin/admin-ajax.php
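
The Allow and Disallow rows in this group interact. Under the Robots Exclusion Protocol (RFC 9309), the rule with the longest matching path wins, an Allow wins a tie, and `*` in a path matches any run of characters. The sketch below is a minimal illustration of that precedence, not the scanner's own logic. The names STAR_GROUP, pattern_to_regex and is_allowed are hypothetical, the sample paths are invented, and pattern length is used as an approximation of "most specific".

```python
import re

# Rules copied from the "*" group table, as (directive, path pattern) pairs.
STAR_GROUP = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/themes/"),
    ("disallow", "/xmlrpc.php"),
    ("disallow", "/trackback/"),
    ("disallow", "/feed/"),
    ("disallow", "/comments/"),
    ("disallow", "/author/"),
    ("disallow", "/search/"),
    ("disallow", "/*?"),
    ("disallow", "/?rest_route="),
    ("allow", "/wp-content/uploads/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str, rules=STAR_GROUP) -> bool:
    """Return True if the path (including any query string) may be crawled."""
    best_len, allowed = -1, True  # no matching rule means allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed
```

With these rules, is_allowed("/wp-admin/admin-ajax.php") is True even though /wp-admin/ is disallowed, because the Allow path is longer. Likewise is_allowed("/article?page=2") is False because of the /*? rule, while is_allowed("/wp-content/uploads/photo.jpg?w=300") is True because the uploads Allow is the longer match.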

googlebot

Rule Path
Allow /wp-content/uploads/
Disallow /tag/

bingbot

Rule Path
Allow /wp-content/uploads/
Disallow /tag/

badbot

Rule Path
Disallow /

anotherbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /
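
A crawler obeys only the group that names it and falls back to the `*` group when no group does, so the googlebot and bingbot groups listed here do not inherit the `*` rules. The sketch below illustrates this with Python's standard urllib.robotparser. ROBOTS_TXT is a partial reconstruction from the tables in this report (the real file's ordering, comments and remaining groups are not reproduced), and examplebot, BASE and the can_fetch helper are hypothetical names. Note that urllib.robotparser does plain prefix matching and does not interpret the `/*?` wildcard, so that rule is left out.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the scanned file from the group tables.
# Assumption: only three groups and a few rules are included for brevity.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /feed/

User-agent: googlebot
Allow: /wp-content/uploads/
Disallow: /tag/

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://sportscourant.com"


def can_fetch(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on the scanned site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [
        ("googlebot", "/tag/football/"),
        ("googlebot", "/wp-admin/"),
        ("examplebot", "/wp-admin/"),
        ("gptbot", "/"),
    ]:
        print(agent, path, can_fetch(agent, path))
```

In this sketch googlebot is refused /tag/football/ but permitted /wp-admin/, because its own group has no rule for that path, whereas the unnamed examplebot falls back to `*` and is refused /wp-admin/.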

Other Records

Field Value
crawl-delay 10
sitemap https://www.sportscourant.com/sitemap_index.xml

Comments

  • Sitemap Location
  • General directives for all bots
  • Specific directives for Googlebot
  • Specific directives for Bingbot
  • Block problematic bots
  • Crawl delay for all bots to prevent server overload