cien.org.gt
robots.txt

Robots Exclusion Standard data for cien.org.gt

Resource Scan

Scan Details

Site Domain cien.org.gt
Base Domain cien.org.gt
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-07-26T06:37:47+00:00
Next Scan 2024-10-24T06:37:47+00:00

Last Successful Scan

Scanned 2023-03-12T00:57:45+00:00
URL https://cien.org.gt/robots.txt
Domain IPs 192.145.232.220
Response IP 192.145.232.220
Found Yes
Hash 53e86de92ed2976c7f181e98fcb69d53215ff552fb14a1f34a5ec59b6a0bffc2
SimHash 0155ee22ae42

Groups

dotbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

lcc

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

*

Rule Path
Disallow /wp-admin
Disallow /wp-includes
Disallow /wp-content/plugins
Disallow /wp-content/cache
Disallow /wp-content/themes
Disallow /wp-includes/js
Disallow /trackback
Disallow /category/*/*
Disallow */trackback
Disallow /*?*
Disallow /*?
Disallow /*~*
Disallow /*~
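
The wildcard rules in the "*" group can be hard to read at a glance. The following is a minimal sketch of how such patterns might be evaluated, assuming the wildcard semantics described in RFC 9309 ("*" matches any run of characters, a trailing "$" anchors the end of the path, and everything else is a prefix match). The names DISALLOW, pattern_to_regex and is_disallowed are hypothetical helpers introduced for illustration; they are not part of the scanned site or of any crawler. Because this group lists no Allow rules, the sketch treats any matching Disallow pattern as a block and skips the longest-match tie-breaking that a full parser would need.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/wp-admin", "/wp-includes", "/wp-content/plugins", "/wp-content/cache",
    "/wp-content/themes", "/wp-includes/js", "/trackback", "/category/*/*",
    "*/trackback", "/*?*", "/*?", "/*~*", "/*~",
]

def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the end of the path; otherwise the pattern
    only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

RULES = [pattern_to_regex(p) for p in DISALLOW]

def is_disallowed(path):
    """Return True if any Disallow pattern matches the start of the path."""
    return any(rx.match(path) for rx in RULES)

if __name__ == "__main__":
    for path in ["/about", "/?p=1", "/category/news/2023", "/post/trackback"]:
        print(path, "blocked" if is_disallowed(path) else "allowed")
```

Under these assumptions, any URL with a query string falls under "/*?*", while a single-level path such as "/category/news" is not covered by "/category/*/*" because that pattern requires a second slash after the category name.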