anews.mx
robots.txt

Robots Exclusion Standard data for anews.mx

Resource Scan

Scan Details

Site Domain anews.mx
Base Domain anews.mx
Scan Status Ok
Last Scan 2024-09-22T04:48:14+00:00
Next Scan 2024-09-29T04:48:14+00:00

Last Scan

Scanned 2024-09-22T04:48:14+00:00
URL https://anews.mx/robots.txt
Domain IPs 104.21.36.40, 172.67.184.184, 2606:4700:3032::6815:2428, 2606:4700:3033::ac43:b8b8
Response IP 104.21.36.40
Found Yes
Hash 70847ae88e895125a4445f956868aa3381e1ef1baa528bcc233facaccdd4e493
SimHash 563ccd7357aa
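
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not say how the hash is computed, so the following is only a sketch under the assumption that it is the SHA-256 of the raw robots.txt response body. The helper names `body_digest` and `matches_report` are hypothetical.

```python
import hashlib

# Hash value copied from the scan report.
REPORTED_HASH = "70847ae88e895125a4445f956868aa3381e1ef1baa528bcc233facaccdd4e493"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check a fetched body against the reported hash.

    ASSUMPTION: the report hashes the raw, unmodified response bytes
    with SHA-256. If the scanner normalizes the body first, this
    comparison will not match.
    """
    return body_digest(body) == REPORTED_HASH
```

If the assumption holds, a later fetch of https://anews.mx/robots.txt whose digest differs from the reported value would indicate that the file changed after the scan.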

Groups

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

*

Rule Path
Disallow /*.pdf$
Disallow /*.Pdf$
Disallow /*.PDF$
Disallow /*.zip$
Disallow /*.Zip$
Disallow /*.ZIP$
Disallow /wp-admin/
Disallow /cgi-bin/
Disallow /tmp/
Disallow /private/
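
The groups above can be read as follows: a crawler uses the group that names it, and falls back to the `*` group otherwise. In the path patterns, `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the URL path. The sketch below illustrates that reading in Python. It is not the scanner's implementation: the `GROUPS` table is transcribed by hand from this report, the function names are hypothetical, and user-agent matching is simplified to a case-insensitive substring test. Python's standard `urllib.robotparser` is not used here because it does not interpret the `*` and `$` wildcards.

```python
import re

# Groups transcribed from the scan report; every rule listed is a Disallow.
GROUPS = {
    "ahrefsbot": ["/"],
    "semrushbot": ["/"],
    "mj12bot": ["/"],
    "dotbot": ["/"],
    "yandexbot": ["/"],
    "*": [
        "/*.pdf$", "/*.Pdf$", "/*.PDF$",
        "/*.zip$", "/*.Zip$", "/*.ZIP$",
        "/wp-admin/", "/cgi-bin/", "/tmp/", "/private/",
    ],
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match at the end of
    the path. Without '$' the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def select_group(user_agent: str) -> str:
    """Pick the group for a user agent (simplified substring match)."""
    token = user_agent.lower()
    for name in GROUPS:
        if name != "*" and name in token:
            return name
    return "*"


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if no Disallow rule in the selected group matches."""
    rules = GROUPS[select_group(user_agent)]
    # re.match anchors at the start of the path, as robots.txt rules do.
    return not any(pattern_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    print(is_allowed("AhrefsBot/7.0", "/news/story"))       # named bot, fully blocked
    print(is_allowed("Googlebot", "/news/story"))           # falls back to '*'
    print(is_allowed("Googlebot", "/files/report.pdf"))     # blocked file type
    print(is_allowed("Googlebot", "/wp-admin/options.php")) # blocked directory
```

Under this reading, the three case variants per extension (`pdf`, `Pdf`, `PDF`) are needed because path matching is case-sensitive; a path ending in `.pDf` would not be covered. Likewise, the `$` anchor means a URL such as `/files/report.pdf?x=1` does not match the PDF rules.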

Other Records

Field Value
sitemap https://anews.mx/sitemap.xml

Comments

  • Blocking known aggressive bots
  • Apply rules to all other bots
  • Crawl-delay directive is commented out to allow compliant bots to crawl faster
  • Crawl-delay: 60.00
  • Blocking specific file types
  • Blocking specific directories (add or remove as necessary)
  • General guidance for bots
  • Please respect these rules to avoid server overload.