saocarlosagora.com.br
robots.txt

Robots Exclusion Standard data for saocarlosagora.com.br

Resource Scan

Scan Details

Site Domain saocarlosagora.com.br
Base Domain saocarlosagora.com.br
Scan Status Ok
Last Scan 2024-05-24T22:21:48+00:00
Next Scan 2024-05-31T22:21:48+00:00

Last Scan

Scanned 2024-05-24T22:21:48+00:00
URL https://saocarlosagora.com.br/robots.txt
Redirect https://www.saocarlosagora.com.br/robots.txt
Redirect Domain www.saocarlosagora.com.br
Redirect Base saocarlosagora.com.br
Domain IPs 104.21.93.71, 172.67.206.102, 2606:4700:3035::6815:5d47, 2606:4700:3035::ac43:ce66
Redirect IPs 104.21.93.71, 172.67.206.102, 2606:4700:3035::6815:5d47, 2606:4700:3035::ac43:ce66
Response IP 172.67.206.102
Found Yes
Hash f9dd3848d5b8baad30216770fdd5e27120ea5a28d37c739b2cdc13e2f2e0f19c
SimHash ad6c5644f990

Groups

*

Rule Path
Disallow /noticia/versao_impressa/
Disallow /noticia/json/
Disallow /noticia/noticia_redirect/
Disallow /busca/
Disallow /frame_ultima_edicao/
Disallow /banner/clica_banner/
Disallow /cdn-cgi/
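
The `*` group above can be checked against candidate URLs with Python's standard `urllib.robotparser`. This is a minimal sketch: the `RULES` text is reconstructed from the listed rule/path pairs and is an assumption, not the verbatim file behind the recorded hash. The helper name `is_allowed` and the sample path `/noticia/cidade/` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group listed in the scan (assumed layout,
# not the verbatim robots.txt body).
RULES = """\
User-agent: *
Disallow: /noticia/versao_impressa/
Disallow: /noticia/json/
Disallow: /noticia/noticia_redirect/
Disallow: /busca/
Disallow: /frame_ultima_edicao/
Disallow: /banner/clica_banner/
Disallow: /cdn-cgi/
"""


def is_allowed(url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the reconstructed rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "https://www.saocarlosagora.com.br"
    # /busca/ is disallowed; /noticia/cidade/ is a hypothetical path
    # that matches none of the listed prefixes.
    for path in ("/busca/", "/noticia/json/", "/noticia/cidade/"):
        print(path, is_allowed(base + path))
```

Matching is by path prefix, so any URL under a listed directory is blocked for every user agent, while paths outside those prefixes stay fetchable.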

Other Records

Field Value
sitemap https://www.saocarlosagora.com.br/sitemap/
sitemap https://www.saocarlosagora.com.br/sitemap/news/
sitemap https://www.saocarlosagora.com.br/sitemap/categories/
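
The sitemap records can be read back with the same standard-library parser. This is a sketch that assumes the records appear in the file as ordinary `Sitemap:` lines; the `SITEMAP_LINES` text and the helper name `list_sitemaps` are illustrative, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed file layout for the three sitemap records listed in the scan.
SITEMAP_LINES = """\
Sitemap: https://www.saocarlosagora.com.br/sitemap/
Sitemap: https://www.saocarlosagora.com.br/sitemap/news/
Sitemap: https://www.saocarlosagora.com.br/sitemap/categories/
"""


def list_sitemaps(robots_text: str = SITEMAP_LINES) -> list:
    """Return the sitemap URLs declared in `robots_text`, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap records are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps():
        print(url)
```

Sitemap records are independent of any user-agent group, so they apply regardless of which crawler reads the file.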