gazetasp.com.br
robots.txt

Robots Exclusion Standard data for gazetasp.com.br

Resource Scan

Scan Details

Site Domain gazetasp.com.br
Base Domain gazetasp.com.br
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-11-12T03:11:55+00:00
Next Scan 2024-11-19T03:11:55+00:00

Last Successful Scan

Scanned 2024-11-03T19:20:24+00:00
URL https://gazetasp.com.br/robots.txt
Redirect https://www.gazetasp.com.br/robots.txt
Redirect Domain www.gazetasp.com.br
Redirect Base gazetasp.com.br
Domain IPs 104.21.41.161, 172.67.148.57, 2606:4700:3031::6815:29a1, 2606:4700:3035::ac43:9439
Redirect IPs 104.21.41.161, 172.67.148.57, 2606:4700:3031::6815:29a1, 2606:4700:3035::ac43:9439
Response IP 172.67.148.57
Found Yes
Hash f7e9522083b2e9795d187a46dc8ec93d5926afe452d346179dad94ea4e28c64b
SimHash b96d9a44e812

Groups

*

Rule Path
Disallow /noticia/versao_impressa/
Disallow /noticia/json/
Disallow /noticia/noticia_redirect/
Disallow /busca/
Disallow /frame_ultima_edicao/
Disallow /banner/clica_banner/
Disallow /cdn-cgi/

Other Records

Field Value
sitemap https://www.gazetasp.com.br/sitemap/
sitemap https://www.gazetasp.com.br/sitemap/news/
sitemap https://www.gazetasp.com.br/sitemap/categories/
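
To illustrate how a crawler might interpret the group rules and sitemap records listed above, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt text is reconstructed from this report rather than copied from the site (the original file, including its one invalid line, is not shown here), and the user agent name "ExampleBot" and the test paths are hypothetical. Python's parser uses simple prefix matching, so other crawlers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report above (assumption: the real file may differ
# in ordering, comments, and the one invalid line the scanner flagged).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /noticia/versao_impressa/",
    "Disallow: /noticia/json/",
    "Disallow: /noticia/noticia_redirect/",
    "Disallow: /busca/",
    "Disallow: /frame_ultima_edicao/",
    "Disallow: /banner/clica_banner/",
    "Disallow: /cdn-cgi/",
    "Sitemap: https://www.gazetasp.com.br/sitemap/",
    "Sitemap: https://www.gazetasp.com.br/sitemap/news/",
    "Sitemap: https://www.gazetasp.com.br/sitemap/categories/",
]

BASE = "https://www.gazetasp.com.br"


def build_parser(lines):
    """Parse robots.txt lines locally, without any network request."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
for path in ("/busca/?q=teste", "/noticia/versao_impressa/2024", "/noticia/"):
    allowed = parser.can_fetch("ExampleBot", BASE + path)
    print(path, "allowed" if allowed else "disallowed")

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

Parsing from a local list of lines avoids depending on the live file, which matters here because the most recent scan failed with a client error.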

Warnings

  • 1 invalid line.