casa39.it
robots.txt

Robots Exclusion Standard data for casa39.it

Resource Scan

Scan Details

Site Domain casa39.it
Base Domain casa39.it
Scan Status Ok
Last Scan 2024-06-09T16:54:54+00:00
Next Scan 2024-07-09T16:54:54+00:00

Last Scan

Scanned 2024-06-09T16:54:54+00:00
URL https://casa39.it/robots.txt
Domain IPs 142.132.252.172
Response IP 142.132.252.172
Found Yes
Hash b1571d38252a3a1b500b2fcec7bf1c002ede83e49502887490a9441b3d8c3f19
SimHash 3b784150ba6a

Groups

*

Rule Path
Disallow /catalogsearch/
Disallow /catalog/product_compare/
Disallow /customer/
Disallow /catalog/category/view/
Disallow /catalog/product/view/
Disallow /manufacturer
Disallow /piastrelle
Disallow /materiale
Disallow /request_quote/cart/
Disallow /checkout/
Disallow /wishlist/
Disallow /cgi-bin/
Disallow /cleanup.php
Disallow /apc.php
Disallow /memcache.php
Disallow /phpinfo.php
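
As a rough illustration of how the `*` group above is evaluated, the sketch below feeds a subset of its rules to Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the table rather than copied from the original file, and the agent name `examplebot`, the helper `is_allowed`, and the sample paths are hypothetical. This parser uses plain prefix matching, so a rule without a trailing slash (such as /piastrelle) also covers longer paths that begin with it.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group above; a subset, not the original file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalogsearch/
Disallow: /customer/
Disallow: /piastrelle
Disallow: /checkout/
Disallow: /phpinfo.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="examplebot"):
    """Return True if `agent` may fetch `path` on casa39.it under these rules."""
    return parser.can_fetch(agent, "https://casa39.it" + path)


# Sample paths are hypothetical and chosen only to exercise the rules.
for path in ("/", "/checkout/cart/", "/piastrelle-gres.html", "/phpinfo.php"):
    print(path, is_allowed(path))
```

Other crawlers may implement matching differently (for example, longest-match precedence between Allow and Disallow), so results from this parser are only indicative.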

googlebot-image

Rule Path
Allow /

googlebot

Rule Path
Allow /

twengabot-2.0

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
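
A group with only a crawl-delay record, like the one above, restricts no paths but asks the crawler to pause between requests. The sketch below shows how such a record can be read with Python's `urllib.robotparser`. The robots.txt text is reconstructed from the table, and `otherbot` and the helper `polite_delay` are hypothetical names used for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the group above; not the original file.
ROBOTS_TXT = """\
User-agent: Twengabot-2.0
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_delay(agent, default=0):
    """Seconds to wait between requests for `agent`, or `default` if unset."""
    delay = parser.crawl_delay(agent)
    return default if delay is None else delay


print(polite_delay("Twengabot-2.0"))  # delay declared for this agent
print(polite_delay("otherbot"))       # no matching group, falls back to default
```

Crawl-delay is a non-standard extension, so not every crawler honors it.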

blexbot

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

jobs.de-robot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

seodiver

Rule Path
Disallow /

spbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

obot

Rule Path
Disallow /

fr-crawler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

cloudservermarketspider

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

lipperhey-kaus-australis

Rule Path
Disallow /

seoscanners.net

Rule Path
Disallow /

metajobbot

Rule Path
Disallow /

spiderbot

Rule Path
Disallow /

linkstats

Rule Path
Disallow /

jobboersebot

Rule Path
Disallow /

iccrawler

Rule Path
Disallow /

plista

Rule Path
Disallow /

domain re-animator bot

Rule Path
Disallow /

lipperhey-kaus-australis

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

um-ic

Rule Path
Disallow /

mindupbot

Rule Path
Disallow /

sg-orbiter

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

kraken

Rule Path
Disallow /

plukkie

Rule Path
Disallow /

safednsbot

Rule Path
Disallow /

haosouspider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

openhosebot

Rule Path
Disallow /

thumbsniper

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

implisensebot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

yandex

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

baiduspider

Rule Path
Disallow /

xovibot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.casa39.es/pub/google_sitemap_5_index.xml
sitemap https://www.arredo39.cn.com/pub/google_sitemap_arredo39_cn_com.xml
sitemap https://www.arredo39.fr/pub/google_sitemap_arredo39_fr.xml
sitemap https://www.arredo39.de/pub/google_sitemap_arredo39_de.xml
sitemap https://www.arredo39.com/pub/google_sitemap_arredo39_com.xml
sitemap https://www.arredo39.it/pub/google_sitemap_arredo39_ita.xml
sitemap https://www.casa39.fr/pub/google_sitemap_4_index.xml
sitemap https://www.casa39.com/pub/google_sitemap_3_index.xml
sitemap https://www.casa39.de/pub/google_sitemap_2_index.xml
sitemap https://www.casa39.it/pub/google_sitemap_1_index.xml
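
The sitemap records above point to several hosts other than the scanned domain. As a minimal sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available), the entries can be extracted and grouped by host as follows. Only three of the ten records are reproduced here, and the variable names are illustrative.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Three of the sitemap records listed above; not the full original file.
ROBOTS_TXT = """\
Sitemap: https://www.casa39.es/pub/google_sitemap_5_index.xml
Sitemap: https://www.arredo39.it/pub/google_sitemap_arredo39_ita.xml
Sitemap: https://www.casa39.it/pub/google_sitemap_1_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap records were found.
sitemaps = parser.site_maps() or []

# Group sitemap URLs by the host they point to.
by_host = defaultdict(list)
for url in sitemaps:
    by_host[urlsplit(url).hostname].append(url)

for host in sorted(by_host):
    print(host, by_host[host])
```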

Comments

  • Do not crawl search pages and not-SEO optimized catalog links
  • SERVER SETTINGS
  • Do not crawl common server technical folders and files
  • IMAGE CRAWLERS SETTINGS
  • Extra: Uncomment if you do not wish Google and Bing to index your images
  • User-agent: msnbot-media
  • Disallow: /
  • Disallow: Sistrix
  • Disallow: Sistrix
  • Disallow: Sistrix
  • Disallow: SEOkicks-Robot
  • Disallow: jobs.de-Robot
  • Backlink Analysis
  • Bot of the Leipzig-based Unister Holding GmbH
  • http://moz.com/products
  • http://www.searchmetrics.com
  • http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
  • http://www.domaintools.com/webmasters/surveybot.php
  • http://www.seodiver.com/bot
  • http://openlinkprofiler.org/bot
  • http://www.wotbox.com/bot/
  • http://www.meanpath.com/meanpathbot.html
  • http://www.backlinktest.com/crawler.html
  • http://www.brandwatch.com/magpie-crawler/
  • http://filterdb.iss.net/crawler/
  • http://webmeup-crawler.com
  • https://megaindex.com/crawler
  • http://www.cloudservermarket.com
  • http://www.trendiction.de/de/publisher/bot
  • http://www.exalead.com
  • http://www.career-x.de/bot.html
  • https://www.lipperhey.com/en/about/
  • https://www.lipperhey.com/en/about/
  • https://turnitin.com/robot/crawlerinfo.html
  • http://help.coccoc.com/
  • ubermetrics-technologies.com
  • datenbutler.de
  • http://searchgears.de/uber-uns/crawling-faq.html
  • http://commoncrawl.org/faq/
  • https://www.qwant.com/
  • http://linkfluence.net/
  • http://www.botje.com/plukkie.htm
  • https://www.safedns.com/searchbot
  • http://www.haosou.com/help/help_3_2.html
  • http://www.haosou.com/help/help_3_2.html
  • http://www.moz.com/dp/rogerbot
  • http://www.openhose.org/bot.html
  • http://www.screamingfrog.co.uk/seo-spider/
  • User-agent: Screaming Frog SEO Spider
  • Disallow: /
  • http://thumbsniper.com
  • http://www.radian6.com/crawler
  • http://cliqz.com/company/cliqzbot
  • https://www.aihitdata.com/about
  • http://www.trendiction.com/en/publisher/bot
  • https://yandex.com/support/webmaster/controlling-robot/robots-txt.xml#crawl-delay
  • http://help.baidu.com/question?prod_en=master&class=Baiduspider
  • http://www.xovibot.net/
  • http://fulltext.sblog.cz/
  • http://www.semrush.com/bot.html
  • User-agent: SemrushBot
  • Disallow: /

Warnings

  • 2 invalid lines.