cafsti.org
robots.txt

Robots Exclusion Standard data for cafsti.org

Resource Scan

Scan Details

Site Domain cafsti.org
Base Domain cafsti.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-08-08T09:50:50+00:00
Next Scan 2024-11-06T09:50:50+00:00

Last Successful Scan

Scanned 2023-04-05T16:04:35+00:00
URL https://cafsti.org/robots.txt
Domain IPs 138.197.14.230
Response IP 138.197.14.230
Found Yes
Hash 07fc206b0f1be18ec7daeea1c7980f73dd2c5c7a6660bbc147d1988c038d04df
SimHash 4260434a33b8

Groups

adidxbot
ahrefsbot
aihitbot
alphaseobot
alphaseobot-sa
baiduspider
bingpreview
blexbot
careerbot
cliqzbot
dotbot
grapeshot
ichiro
icjobs
linkdexbot
magpie-crawler
megaindex
mj12bot
moget
naverbot
owlin
owlin bot
owlin bot v. 3.0
proximic
queryseekerspider
scrapy
scrapybot
semrushbot
sentibot
seokicks-robot
sogou
sogou spider
tkbot
trendkite-akashic-crawler
vagabondo
wbsearchbot
yandex
yandexbot
yeti
youdaobot

Rule Path
Disallow /

*

Rule Path
Allow /wp-includes/js/
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /xmlrpc.php
Disallow /profile
Disallow /cgi-bin/
Disallow /wp-content/cache/
Disallow /trackback/
Disallow /comments/
Disallow /administrator/
Disallow */trackback/
Disallow */comments/
Disallow /license.txt
Disallow /*.php$
Disallow *?filter
Disallow /wp-content/themes/
Disallow /readme.html

Comments

  • Horrible bandwidth eating robots
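
The groups and rules listed above can be exercised with Python's standard-library `urllib.robotparser`. The following is a minimal sketch, not the site's actual file: `ROBOTS_TXT` is a hand-written reconstruction that uses only three of the listed blocked agents and four of the `*` rules, `examplebot` is a hypothetical crawler name, and the helper names `build_parser` and `allowed` are made up for illustration. The wildcard rules (`*/trackback/`, `/*.php$`, `*?filter`) are left out on purpose, because the standard-library parser only does plain prefix matching and would not interpret them as patterns.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of part of the scanned file; the real robots.txt
# layout is not shown in the scan record, only its parsed groups and rules.
ROBOTS_TXT = """\
User-agent: ahrefsbot
User-agent: mj12bot
User-agent: semrushbot
Disallow: /

User-agent: *
Allow: /wp-includes/js/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


PARSER = build_parser(ROBOTS_TXT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on cafsti.org under ROBOTS_TXT."""
    return PARSER.can_fetch(agent, "https://cafsti.org" + path)


if __name__ == "__main__":
    checks = [
        ("ahrefsbot", "/"),                           # named in the blocked group
        ("examplebot", "/"),                          # hypothetical agent, falls under *
        ("examplebot", "/wp-admin/"),
        ("examplebot", "/wp-includes/js/jquery.js"),
        ("examplebot", "/wp-includes/css/style.css"),
    ]
    for agent, path in checks:
        print(f"{agent:12} {path:32} allowed={allowed(agent, path)}")
```

Under these assumptions, an agent named in the first group is refused everywhere, while any other agent falls through to the `*` group. `urllib.robotparser` applies the first matching rule in file order, so `Allow /wp-includes/js/` takes effect only because it appears before `Disallow /wp-includes/`; crawlers that use longest-match precedence would reach the same result for these two paths.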