impresso.jc.ne10.uol.com.br
robots.txt

Robots Exclusion Standard data for impresso.jc.ne10.uol.com.br

Resource Scan

Scan Details

Site Domain impresso.jc.ne10.uol.com.br
Base Domain uol.com.br
Scan Status Ok
Last Scan 2024-04-25T13:08:34+00:00
Next Scan 2024-05-25T13:08:34+00:00
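
The two timestamps above are exactly 30 days apart, which suggests a fixed rescan interval, although the report does not say so. A minimal Python sketch of that arithmetic follows; the 30-day interval and the name RESCAN_INTERVAL are assumptions inferred from these two values only.

```python
from datetime import datetime, timedelta

# Timestamps copied from the scan details above.
last_scan = datetime.fromisoformat("2024-04-25T13:08:34+00:00")
next_scan = datetime.fromisoformat("2024-05-25T13:08:34+00:00")

# Assumption: the scanner reschedules a fixed 30 days after each scan.
RESCAN_INTERVAL = timedelta(days=30)

# Difference between the two reported timestamps.
interval = next_scan - last_scan
print(interval == RESCAN_INTERVAL)
```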

Last Scan

Scanned 2024-04-25T13:08:34+00:00
URL https://impresso.jc.ne10.uol.com.br/robots.txt
Domain IPs 104.18.4.8, 104.18.5.8
Response IP 104.18.4.8
Found Yes
Hash 307f5755be84c6186f531d0b7ec74b38312b26bcea9924431473fee34f9396fa
SimHash 4868ce8311e7
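
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal characters match the length of a SHA-256 digest, so the sketch below assumes SHA-256 purely for illustration. It shows how a content hash can be used to detect whether a robots.txt body changed between two scans. The function names and the sample bodies are hypothetical, and the SimHash field is not reproduced here.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """True when the body no longer matches the hash stored from the last scan."""
    return content_hash(body) != previous_hash


# Hypothetical bodies, not the real file contents.
old_body = b"User-agent: *\nDisallow: /_temp/\n"
new_body = b"User-agent: *\nDisallow: /_temp/\nDisallow: /cdn/\n"

stored = content_hash(old_body)
print(has_changed(stored, old_body), has_changed(stored, new_body))
```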

Groups

*

Rule Path
Disallow /_temp/
Disallow /custom/
Disallow /exemplos/
Disallow /imagens/
Disallow /includes/
Disallow /static/
Disallow /src/
Disallow /cdn/
Disallow /_files/
Disallow /*.pdf$
Disallow /*.json$
Disallow /*.pdf
Disallow /*.jpg
Disallow /*.JPG
Disallow /*.jpeg
Disallow /*.JPEG
Disallow /*.png
Disallow /*.PNG
Disallow /*.gif
Disallow /*.GIF
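
The rules in this group mix plain path prefixes with the two special characters described in RFC 9309: `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the path. Matching is case-sensitive, which is presumably why both lowercase and uppercase extensions are listed. The following Python sketch translates the patterns into regular expressions to show how they would apply. It is one possible implementation rather than the behaviour of any particular crawler, the helper names are hypothetical, and the sample paths in the usage note are invented for illustration.

```python
import re

# Disallow patterns copied from the "*" group above.
RULES = [
    "/_temp/", "/custom/", "/exemplos/", "/imagens/", "/includes/",
    "/static/", "/src/", "/cdn/", "/_files/",
    "/*.pdf$", "/*.json$", "/*.pdf",
    "/*.jpg", "/*.JPG", "/*.jpeg", "/*.JPEG",
    "/*.png", "/*.PNG", "/*.gif", "/*.GIF",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without "$" the pattern is a prefix match; with "$" it must reach the end.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, patterns=RULES) -> bool:
    """True if any Disallow pattern matches the path from its first character."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under this sketch, `is_disallowed("/static/app.js")` and `is_disallowed("/edicao/capa.pdf")` are true, while `is_disallowed("/edicao/foto.Jpg")` is false because no rule covers that mixed-case extension. `/data.json?x=1` is also not blocked, since `/*.json$` only matches paths that end in `.json`. The `/*.pdf$` rule is redundant here because the unanchored `/*.pdf` already covers every path it matches.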

facebot

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

googlebot-news

Rule Path
Disallow /

yandex

Rule Path
Disallow /

slurp

Rule Path
Disallow /

baidoospider

Rule Path
Disallow /

googlebot

Rule Path
Disallow /

msnbot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /
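
Each group above applies only to crawlers whose user-agent matches its token, and every other crawler falls back to the `*` group. The Python sketch below illustrates one way to pick a group: a case-insensitive substring match that prefers the longest matching token. This selection strategy is an assumption made for illustration, since real crawlers differ in how they resolve multiple matches, and the function name is hypothetical.

```python
# User-agent tokens copied from the groups in this report, as listed.
GROUP_TOKENS = [
    "*", "facebot", "facebookexternalhit", "googlebot-news", "yandex",
    "slurp", "baidoospider", "googlebot", "msnbot", "googlebot-image",
    "yahoo-mmcrawler", "psbot",
]


def select_group(user_agent: str) -> str:
    """Return the token of the group that would apply to this user-agent."""
    ua = user_agent.lower()
    # Collect every named token that appears in the user-agent string.
    matches = [t for t in GROUP_TOKENS if t != "*" and t in ua]
    # Prefer the longest (most specific) token; otherwise use the wildcard group.
    return max(matches, key=len) if matches else "*"


print(select_group("Googlebot-News"))
print(select_group("Mozilla/5.0 (compatible; bingbot/2.0)"))
```

With this logic, a crawler identifying as `Googlebot-News` lands in the `googlebot-news` group rather than `googlebot`, and a crawler with no matching token, such as the `bingbot` string above, is governed by the `*` group. Because every named group here carries the single rule `Disallow /`, any crawler that matches one of them is blocked from the whole site.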