cimorelli.com
robots.txt

Robots Exclusion Standard data for cimorelli.com

Resource Scan

Scan Details

Site Domain cimorelli.com
Base Domain cimorelli.com
Scan Status Ok
Last Scan 2024-11-16T01:03:41+00:00
Next Scan 2024-12-16T01:03:41+00:00

Last Scan

Scanned 2024-11-16T01:03:41+00:00
URL https://cimorelli.com/robots.txt
Domain IPs 65.254.231.125
Response IP 65.254.231.125
Found Yes
Hash 3dc5e440f000839d748aa46cbc7919f566d0fcb5df4e188cce7983f7a3cba34d
SimHash 5894fd13cf78

Groups

*

Rule Path
Allow /misc/*.css$
Allow /misc/*.css?
Allow /misc/*.js$
Allow /misc/*.js?
Allow /misc/*.gif
Allow /misc/*.jpg
Allow /misc/*.jpeg
Allow /misc/*.png
Allow /modules/*.css$
Allow /modules/*.css?
Allow /modules/*.js$
Allow /modules/*.js?
Allow /modules/*.gif
Allow /modules/*.jpg
Allow /modules/*.jpeg
Allow /modules/*.png
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /themes/*.css$
Allow /themes/*.css?
Allow /themes/*.js$
Allow /themes/*.js?
Allow /themes/*.gif
Allow /themes/*.jpg
Allow /themes/*.jpeg
Allow /themes/*.png
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /cgi-bin/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
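
The Allow and Disallow rows above overlap (for example, `/misc/*.js$` is allowed while `/misc/` is disallowed), so the outcome depends on how a crawler resolves conflicts. The sketch below assumes the longest-match convention described in RFC 9309, where `*` matches any run of characters, a trailing `$` anchors the end of the path, and Allow wins a tie. The names `pattern_to_regex`, `is_allowed`, and `RULES` are illustrative, and `RULES` holds only a small subset of the table. Individual crawlers may resolve conflicts differently.

```python
import re

# A small subset of the "*" group above, as (rule, path pattern) pairs.
RULES = [
    ("allow", "/misc/*.js$"),
    ("allow", "/misc/*.js?"),
    ("allow", "/themes/*.png"),
    ("disallow", "/misc/"),
    ("disallow", "/themes/"),
    ("disallow", "/admin/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Everything except '*' is literal, including '?' and '.'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Apply the longest matching rule; Allow wins ties; no match means allowed."""
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumptions, `is_allowed("/misc/drupal.js", RULES)` is true because the Allow pattern is longer than `/misc/`, while `is_allowed("/misc/README.txt", RULES)` is false because only the Disallow rule matches.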

Other Records

Field Value
crawl-delay 10
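
As a rough illustration of how a client might read this record, the sketch below feeds a hand-written excerpt to Python's standard `urllib.robotparser`. The excerpt in `SAMPLE` is an assumption reconstructed from the tables in this report, not the raw file, and `SomeBot` is a hypothetical crawler name. Note that `crawl-delay` is a non-standard extension, so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt modeled on the groups in this report (not the raw file).
SAMPLE = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 10

User-agent: applebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Delay (in seconds) for crawlers that fall back to the "*" group.
default_delay = parser.crawl_delay("*")

# A crawler with its own group is governed by that group's rules.
applebot_ok = parser.can_fetch("applebot", "https://cimorelli.com/")

# "SomeBot" is a hypothetical agent with no dedicated group.
somebot_admin_ok = parser.can_fetch("SomeBot", "https://cimorelli.com/admin/")
somebot_about_ok = parser.can_fetch("SomeBot", "https://cimorelli.com/about")
```

A polite client would wait `default_delay` seconds between requests when the value is not `None`.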

a6-indexer

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

alphaseobot-sa

Rule Path
Disallow /

applebot

Rule Path
Disallow /

blackboard safeassign

Rule Path
Disallow /

blexbot/1.0

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

liebaofast

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mauibot (crawler.feedback+wc@gmail.com)

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

mqqbrowser

Rule Path
Disallow /

nimbostratus-bot/v1.3.2

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

sputnikbot/2.3

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ucbrowser

Rule Path
Disallow /

yacybot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

cliqzbot/3.0

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

paracrawl

Rule Path
Disallow /

scrapy/1.5.0

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

velenpublicwebcrawler (velen.io)

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

semrushbot/2~bl

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

pcore-http

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

crawler.feedback+wc@gmail.com

Rule Path
Disallow /

cyotekwebcopy/1.0

Rule Path
Disallow /

centurybot9@gmail.com

Rule Path
Disallow /

crawler (crawler.feedback@gmail.com)

Rule Path
Disallow /

crawler

Rule Path
Disallow /

barkrowler/0.7 (+http://www.exensa.com/crawl)

Rule Path
Disallow /

go-http-client/1.1

Rule Path
Disallow /

test crawl

Rule Path
Disallow /

scalaj-http/1.0

Rule Path
Disallow /

bubing

Rule Path
Disallow /

wotbox/2.01

Rule Path
Disallow /

ccbot/2.0

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

ebibot

Rule Path
Disallow /

pcore-http/v0.24.5

Rule Path
Disallow /

testitest1

Rule Path
Disallow /

vegi bot

Rule Path
Disallow /

istellabot/t.1

Rule Path
Disallow /

istellabot/t.1.13

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

ltx71 - (http://ltx71.com/)

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

booglebot2

Rule Path
Disallow /

booglebot

Rule Path
Disallow /

booglebot 2.0

Rule Path
Disallow /

booglebot/2.0

Rule Path
Disallow /

mj12bot/v1.0.5

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider/2.0

Rule Path
Disallow /

influencebo

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

acoonbot

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

businessdbbot

Rule Path
Disallow /

superfeedr

Rule Path
Disallow /

flipboardproxy

Rule Path
Disallow /

flipboard

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

rogerbot/1.0

Rule Path
Disallow /

flipboardproxy

Rule Path
Disallow /

swebot

Rule Path
Disallow /

swebot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

www.80legs.com

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

comodospider

Rule Path
Disallow /

comodospider/nutch-1.2

Rule Path
Disallow /

daumoa

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

beetlebot

Rule Path
Disallow /

niki-bot

Rule Path
Disallow /

riddler

Rule Path
Disallow /

spbot

Rule Path
Disallow /

icarus6

Rule Path
Disallow /

icarus6

Rule Path
Disallow /

icarus

Rule Path
Disallow /

icarus

Rule Path
Disallow /

icarus6j

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

knelson

Rule Path
Disallow /

knelson/0.9

Rule Path
Disallow /

wotbox/2.01

Rule Path
Disallow /

blexbot/1.0

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

*

Rule Path
Disallow /chiasmus
Disallow /chiasmus/chiasmus.cgi
Disallow /synonymer/synonym_path.cgi
Disallow /synonymer/synonymer.cgi
Disallow /isomorphic_words/isomorphic_words.cgi
Disallow /vowels_away
Disallow /ordo/
Disallow /simple_rhyme/
Disallow /new_markov_words/bloggtraffsammanfattningar.cgi
Disallow /palin_gen/
Disallow /prefix_meld/
Disallow /minizinc/word_len3.mzn
Disallow /minizinc/word_len4.mzn
Disallow /minizinc/word_len5.mzn
Disallow /minizinc/word_golf_n3.mzn
Disallow /reading_scrambled_words/sim_0_0.swe
Disallow /spelling_out_words/swe_not_full_spelling.txt
Disallow /webblogg/mt-comments_x.cgi

Other Records

Field Value
crawl-delay 4.5

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "robots" where not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/robotstxt.html
  • CSS, JS, Images
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Block bots
  • RDH, 08.19.19: I really don't want to block Applebot, but for now, I am. It is crawling us too much
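
The comments above note that robots.txt is only honored at the root of the host. A minimal sketch of how a crawler might derive that location from any page URL follows. The helper name `robots_url` is hypothetical, and the example URL reuses the one from the comments.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_url("http://example.com/site/page.html")
```

Here `location` is `http://example.com/robots.txt`, so a file placed at `/site/robots.txt` would never be requested.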

Warnings

  • 10 invalid lines.