cdkitchen.com
robots.txt

Robots Exclusion Standard data for cdkitchen.com

Resource Scan

Scan Details

Site Domain cdkitchen.com
Base Domain cdkitchen.com
Scan Status Ok
Last Scan 2024-04-22T14:59:22+00:00
Next Scan 2024-04-29T14:59:22+00:00

Last Scan

Scanned 2024-04-22T14:59:22+00:00
URL https://cdkitchen.com/robots.txt
Redirect https://www.cdkitchen.com/robots.txt
Redirect Domain www.cdkitchen.com
Redirect Base cdkitchen.com
Domain IPs 67.225.162.120
Redirect IPs 67.225.162.120
Response IP 67.225.162.120
Found Yes
Hash 690ed831225c323ac9d088fb8008e01ecd5f4bd46102ccce73b26648efe82d00
SimHash 5439dd514520

Groups

gptbot

Rule Path
Disallow /

daum

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

uipbot/1.0 (uipbot@semasio.net)

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

awariorssbot
awariosmartbot

Rule Path
Disallow /

femtosearchbot

Rule Path
Disallow /

mediapartners-google

Rule Path
Allow /

yahoo-mmcrawler

Rule Path
Disallow /

timpibot/0.8

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

semrushbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

siteauditbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

semrushbot-ba

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

siteauditbot/0.97

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

semrushbot-si

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120
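
The crawl-delay records above can be read back with Python's standard urllib.robotparser. This is a minimal sketch: the FRAGMENT text is reconstructed from the groups listed in this report, not copied from the live file, and it covers only two of the five agents. Crawl-delay is a non-standard extension, so whether a crawler honors it is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups (an assumption, not the raw file).
FRAGMENT = """\
User-agent: semrushbot
Crawl-delay: 120

User-agent: siteauditbot
Crawl-delay: 120
"""

parser = RobotFileParser()
parser.parse(FRAGMENT.splitlines())

# Delay in seconds requested for this agent, or None if no record exists.
delay = parser.crawl_delay("semrushbot")

# No Disallow rules in the group, so every path is allowed.
allowed = parser.can_fetch("semrushbot", "https://www.cdkitchen.com/recipes/")

print(delay, allowed)
```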

lcc

Rule Path
Disallow /

ltx71 - (http://ltx71.com/)

Rule Path
Disallow /

barkrowler/0.9

Rule Path
Disallow /

ptd-crawler

Rule Path
Disallow /

linguee

Rule Path
Disallow /

covario-ids

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

covario-ids/1.0

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

ahrefssiteaudit

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

mixnode

Rule Path
Disallow /

aboundexbot

Rule Path
Disallow /

bingbot

Rule Path
Disallow /search/

ntentbot

Rule Path
Disallow /

*

Rule Path
Disallow /loadcontent/
Disallow /show/
Disallow /cgi-bin/
Disallow /cgi-bin/admin/
Disallow /php/test/
Disallow /cooking/questions/
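
As a rough illustration of how the groups in this report apply, the sketch below feeds a small fragment to Python's standard urllib.robotparser. The fragment is reconstructed from three of the groups listed here (gptbot, the first bingbot group, and `*`), and "examplebot" is a hypothetical agent name used only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups (an assumption, not the raw file).
ROBOTS_FRAGMENT = """\
User-agent: gptbot
Disallow: /

User-agent: bingbot
Disallow: /search/

User-agent: *
Disallow: /loadcontent/
Disallow: /show/
Disallow: /cgi-bin/
Disallow: /cgi-bin/admin/
Disallow: /php/test/
Disallow: /cooking/questions/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_FRAGMENT.splitlines())

BASE = "https://www.cdkitchen.com"

# A named group takes precedence over the "*" group for that agent.
checks = {
    ("gptbot", "/recipes/"): rules.can_fetch("gptbot", BASE + "/recipes/"),
    ("bingbot", "/search/x"): rules.can_fetch("bingbot", BASE + "/search/x"),
    ("examplebot", "/recipes/"): rules.can_fetch("examplebot", BASE + "/recipes/"),
    ("examplebot", "/show/1"): rules.can_fetch("examplebot", BASE + "/show/1"),
}

for (agent, path), ok in checks.items():
    print(agent, path, "allowed" if ok else "disallowed")
```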

yandex

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

msnbot

Rule Path
Disallow /*login.html*$

msnbot

Rule Path
Disallow /*addNewLink*.*$

bingbot

Rule Path
Disallow /search/*.*$
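
The msnbot and bingbot paths above use the `*` and `$` special characters. Python's urllib.robotparser matches paths by plain prefix and does not interpret them, so the sketch below shows one way to evaluate such patterns by translating them to regular expressions. The helper names pattern_to_regex and rule_matches are hypothetical, and the sample paths are made up for illustration.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def rule_matches(pattern, path):
    """Return True if the URL path is covered by the pattern."""
    # re.match anchors at the start of the path, like a prefix rule.
    return pattern_to_regex(pattern).match(path) is not None

# Patterns taken from the groups above; paths are illustrative only.
print(rule_matches("/*login.html*$", "/members/login.html"))
print(rule_matches("/*addNewLink*.*$", "/links/addNewLink.php"))
print(rule_matches("/search/*.*$", "/search/results.php"))
print(rule_matches("/search/*.*$", "/search/results"))
```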

netresearchserver

Rule Path
Disallow /

aboutusbot

Rule Path
Disallow /

psbot

Rule Path
Disallow /

zeusbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

java 1.1

Rule Path
Disallow /

lwp-trivial

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

python-webchecker

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

wget

Rule Path
Disallow /

zeus link scout

Rule Path
Disallow /

turnitinbot/1.4 http://www.turnitin.com/robot/crawlerinfo.html

Rule Path
Disallow /

slysearch/1.0 http://www.plagiarism.org/crawler/robotinfo.html

Rule Path
Disallow /

flashget

Rule Path
Disallow /

Comments

  • This is just for bots that actually obey robots.txt files. We block bots in other ways as well.
  • Disallow: /search/search.php
  • Disallow: /sponsored/
  • User-agent: SemrushBot
  • Disallow: /