noticiaalminuto.com
robots.txt

Robots Exclusion Standard data for noticiaalminuto.com

Resource Scan

Scan Details

Site Domain noticiaalminuto.com
Base Domain noticiaalminuto.com
Scan Status Ok
Last Scan 2024-06-10T04:17:04+00:00
Next Scan 2024-06-17T04:17:04+00:00

Last Scan

Scanned 2024-06-10T04:17:04+00:00
URL https://noticiaalminuto.com/robots.txt
Domain IPs 104.26.0.55, 104.26.1.55, 172.67.68.73, 2606:4700:20::681a:137, 2606:4700:20::681a:37, 2606:4700:20::ac43:4449
Response IP 172.67.68.73
Found Yes
Hash 75589527551ea2dcc84d635bc724bf60634bf5b6c2e1d4e187a5535b0fb3a8da
SimHash 2a1dcd13c650

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

scrapy

Rule Path
Allow /

googlebot-image

Rule Path Comment
Disallow (empty) -
Allow / It is not a standard use of this directive but Google prefers it
Disallow /*.php$ -
Disallow /*.js$ -
Disallow /*.inc$ -
Disallow /*.css$ -
Disallow /*.txt$ -
Disallow /*?* -
Disallow /wp-*/ -

ia_archiver

Rule Path
Disallow /
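
The groups above can be checked against sample URL paths with a small matcher. The following is a minimal sketch, assuming the longest-match semantics described in RFC 9309 (the most specific matching pattern wins, Allow wins ties, `*` matches any run of characters, and a trailing `$` anchors the end of the path). The helper names `pattern_to_regex` and `is_allowed` and the sample paths are hypothetical, and real crawlers may resolve these rules differently.

```python
import re

def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, url_path):
    """Return True if url_path may be fetched under the given rules.

    rules is a list of (directive, path) pairs from one group.
    The longest matching pattern wins; Allow wins ties; rules with an
    empty path are ignored; a path matching no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, path in rules:
        if not path:
            continue  # an empty Disallow restricts nothing
        if pattern_to_regex(path).match(url_path):
            is_allow = directive.lower() == "allow"
            if len(path) > best_len or (len(path) == best_len and is_allow):
                best_len = len(path)
                allowed = is_allow
    return allowed

# Rules copied from the groups listed above.
STAR = [("Disallow", "/wp-admin/"), ("Allow", "/wp-admin/admin-ajax.php")]
GOOGLEBOT_IMAGE = [
    ("Disallow", ""), ("Allow", "/"),
    ("Disallow", "/*.php$"), ("Disallow", "/*.js$"), ("Disallow", "/*.inc$"),
    ("Disallow", "/*.css$"), ("Disallow", "/*.txt$"),
    ("Disallow", "/*?*"), ("Disallow", "/wp-*/"),
]
IA_ARCHIVER = [("Disallow", "/")]

# Sample paths are made up for illustration.
print(is_allowed(STAR, "/wp-admin/admin-ajax.php"))              # True
print(is_allowed(STAR, "/wp-admin/options.php"))                 # False
print(is_allowed(GOOGLEBOT_IMAGE, "/images/photo.jpg"))          # True
print(is_allowed(GOOGLEBOT_IMAGE, "/wp-content/uploads/a.jpg"))  # False
print(is_allowed(IA_ARCHIVER, "/"))                              # False
```

Python's standard `urllib.robotparser` is not used here because it applies the first matching rule in file order and does not interpret `*` or `$` in paths, so it would report `/wp-admin/admin-ajax.php` as disallowed for the `*` group.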

Other Records

Field Value
sitemap https://noticiaalminuto.com/sitemap_index.xml

Comments

  • disallow all files in these directories
  • disallow all files ending with these extensions
  • disallow all files with ? in url
  • disallow all files in /wp- directories
  • disallow archiving site