jconline.com.br
robots.txt

Robots Exclusion Standard data for jconline.com.br

Resource Scan

Scan Details

Site Domain jconline.com.br
Base Domain jconline.com.br
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-10-05T19:32:33+00:00
Next Scan 2025-01-03T19:32:33+00:00

Last Successful Scan

Scanned 2022-02-18T00:10:51+00:00
URL http://jconline.com.br/robots.txt
Redirect https://jc.ne10.uol.com.br/robots.txt
Redirect Domain jc.ne10.uol.com.br
Redirect Base uol.com.br
Response IP 200.147.36.53
Found Yes
Hash c8f9fb4d1bb0e7558886f73ef19c7e2293e669aff7deb54f3dc26455fb5d757c
SimHash e8485ea3d1f3

Groups

*

Rule Path
Disallow /_temp/
Disallow /src/
Disallow /cdn/
Disallow /assets/
Disallow /*.pdf$
Disallow /*.json$
Disallow /search/*
Disallow /tags/*/page/*
Allow /*.jpg
Allow /*.JPG
Allow /*.jpeg
Allow /*.JPEG
Allow /*.png
Allow /*.PNG
Allow /*.gif
Allow /*.GIF
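
The rules in the `*` group rely on the `*` and `$` wildcards, which are easier to follow with a worked example. The sketch below is one possible reading of them, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names `STAR_GROUP`, `pattern_to_regex` and `is_allowed` are illustrative and are not part of the scan; real crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the "*" group listed above, as (directive, pattern) pairs.
STAR_GROUP = [
    ("disallow", "/_temp/"), ("disallow", "/src/"), ("disallow", "/cdn/"),
    ("disallow", "/assets/"), ("disallow", "/*.pdf$"), ("disallow", "/*.json$"),
    ("disallow", "/search/*"), ("disallow", "/tags/*/page/*"),
    ("allow", "/*.jpg"), ("allow", "/*.JPG"), ("allow", "/*.jpeg"),
    ("allow", "/*.JPEG"), ("allow", "/*.png"), ("allow", "/*.PNG"),
    ("allow", "/*.gif"), ("allow", "/*.GIF"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt pattern: '*' is any run of characters,
    a trailing '$' anchors the end of the path; otherwise prefix match."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(path, rules):
    """Assumed precedence: longest matching pattern wins, Allow wins ties.
    A path that matches no rule is allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

print(is_allowed("/noticias/foto.jpg", STAR_GROUP))   # matches only Allow /*.jpg
print(is_allowed("/docs/report.pdf", STAR_GROUP))     # matches Disallow /*.pdf$
print(is_allowed("/assets/logo.png", STAR_GROUP))     # /assets/ is longer than /*.png
```

Under this assumed precedence, an image below `/assets/` stays blocked, because `/assets/` (8 characters) is a longer pattern than `/*.png` (6 characters). The paths used here are hypothetical examples, not URLs taken from the scan.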

facebot

Rule Path
Allow /imagens/

facebookexternalhit

Rule Path
Allow /imagens/

googlebot-news

Rule Path
Allow *

yandex

Rule Path
Disallow /

slurp

Rule Path
Disallow /

baidoospider

Rule Path
Disallow /
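
Which of the groups above applies depends on the crawler's user-agent. The sketch below shows one common way to pick a group, assuming a case-insensitive substring match on the group token with a fallback to `*`; the function name `select_group` and the sample user-agent strings are illustrative, and individual crawlers may match tokens differently.

```python
# Group tokens exactly as recorded in the scan above ("*" is the fallback).
GROUP_TOKENS = [
    "facebot", "facebookexternalhit", "googlebot-news",
    "yandex", "slurp", "baidoospider",
]

def select_group(user_agent):
    """Return the most specific (longest) token found in the user-agent,
    or "*" when no named group matches. Matching is case-insensitive."""
    ua = user_agent.lower()
    matches = [token for token in GROUP_TOKENS if token in ua]
    return max(matches, key=len) if matches else "*"

print(select_group("Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"))
print(select_group("Googlebot-News"))
print(select_group("Mozilla/5.0 (compatible; Baiduspider/2.0)"))
```

With this matching, a user-agent containing `Baiduspider` does not match the recorded token `baidoospider` and falls back to the `*` group. A crawler that matches a named group would normally follow only that group's rules and not the `*` rules as well.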

Comments

  • Sitemap: https://jc.ne10.uol.com.br/sitemap.xml