cnn.com.br
robots.txt

Robots Exclusion Standard data for cnn.com.br

Resource Scan

Scan Details

Site Domain cnn.com.br
Base Domain cnn.com.br
Scan Status Ok
Last Scan 2024-11-14T20:36:33+00:00
Next Scan 2024-11-21T20:36:33+00:00

Last Scan

Scanned 2024-11-14T20:36:33+00:00
URL https://www.cnn.com.br/robots.txt
Redirect https://www.cnnbrasil.com.br/robots.txt
Redirect Domain www.cnnbrasil.com.br
Redirect Base cnnbrasil.com.br
Domain IPs 18.230.212.161
Redirect IPs 192.0.66.182, 2a04:fa87:fffd::c000:42b6
Response IP 192.0.66.182
Found Yes
Hash 03e2895062d179f44cd1a09a56eec24e286094587df40fcd90aee087a4e59cb5
SimHash 1e31f0683c30

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /component/
Disallow /obrigado-newsletter-business
Disallow /component
Disallow *amp.cnnbrasil.com.br/*
Disallow /youtube/video
Disallow /author
Disallow /author/
Disallow /author/*
Disallow /?s=
Disallow /?hidemenu=true*
Disallow *api.cnnbrasil.com.br/*
Disallow *mediastorage.cnnbrasil.com.br/*
Disallow /esportes/agenda/2025
Disallow /esportes/agenda/2026
Disallow /esportes/agenda/2027
Disallow /esportes/agenda/2028
Disallow /esportes/agenda/2029
Disallow /esportes/agenda/2030
Disallow /esportes/agenda/2031
Disallow /esportes/agenda/2032
Disallow /esportes/agenda/2033
Disallow /esportes/agenda/2034
Disallow /esportes/agenda/2035
Disallow /esportes/agenda/2036
Disallow /esportes/agenda/2037
Disallow /esportes/agenda/2038
Disallow /esportes/agenda/2039
Disallow /esportes/agenda/2040
Disallow /esportes/agenda/2041
Disallow /esportes/agenda/2042
Disallow /esportes/agenda/2043
Disallow /esportes/agenda/2044
Disallow /esportes/agenda/2045
Disallow /esportes/agenda/2046
Disallow /esportes/agenda/2047
Disallow /esportes/agenda/2048
Disallow /esportes/agenda/2049
Disallow /esportes/agenda/2050
Disallow /esportes/agenda/2051

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /
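The groups above can be checked programmatically. The following is a minimal sketch, assuming Python's standard urllib.robotparser and a hand-typed subset of the scanned rules. The ROBOTS_TXT string is a reconstruction in robots.txt syntax from the tables above, not the verbatim file, and "ExampleBot" is a hypothetical crawler name. Wildcard rules such as *amp.cnnbrasil.com.br/* are left out because the standard-library parser treats rule paths as plain prefixes.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a subset of the scanned rules, re-typed in robots.txt syntax.
# Wildcard rules are omitted; urllib.robotparser does plain prefix matching.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /author
Disallow: /youtube/video

User-agent: gptbot
Disallow: /

User-agent: chatgpt-user
Disallow: /

User-agent: anthropic-ai
Disallow: /
"""

# Host the scan was redirected to.
BASE = "https://www.cnnbrasil.com.br"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" (hypothetical) has no group of its own, so the * group applies.
print(is_allowed(parser, "ExampleBot", "/politica/"))
print(is_allowed(parser, "ExampleBot", "/wp-admin/"))
# The named groups disallow the whole site for those agents.
print(is_allowed(parser, "GPTBot", "/politica/"))
```

Agent names are matched case-insensitively by this parser, so "GPTBot" falls under the gptbot group. A crawler that honors wildcard rules would need a different matcher for the *amp.cnnbrasil.com.br/* style entries.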

Comments

  • robots.txt