ca.discount.wave-base.com
robots.txt

Robots Exclusion Standard data for ca.discount.wave-base.com

Resource Scan

Scan Details

Site Domain ca.discount.wave-base.com
Base Domain wave-base.com
Scan Status Ok
Last Scan 2024-10-22T12:20:45+00:00
Next Scan 2024-11-21T12:20:45+00:00

Last Scan

Scanned 2024-10-22T12:20:45+00:00
URL https://ca.discount.wave-base.com/robots.txt
Domain IPs 2600:9000:2022:1600:d:47fe:7bc0:93a1, 2600:9000:2022:2400:d:47fe:7bc0:93a1, 2600:9000:2022:5000:d:47fe:7bc0:93a1, 2600:9000:2022:5600:d:47fe:7bc0:93a1, 2600:9000:2022:9200:d:47fe:7bc0:93a1, 2600:9000:2022:9400:d:47fe:7bc0:93a1, 2600:9000:2022:d600:d:47fe:7bc0:93a1, 2600:9000:2022:f000:d:47fe:7bc0:93a1, 54.230.112.117, 54.230.112.20, 54.230.112.21, 54.230.112.92
Response IP 3.164.206.22
Found Yes
Hash 066a5d9435fa475c0d0221c3a4f2a202eb22ac41506be27fadc32a1002e73fc8
SimHash 9b42d5526d60

Groups

*

Rule Path
Disallow /favicon.ico
Disallow /msg-pic/
Disallow /sale/redirect/
Disallow /26297917/
Disallow /*.php$
Disallow /*.php?*
Disallow /*.asp$
Disallow /*.jsp$
Disallow /.well-known/
Disallow /.env
Disallow /.*
Disallow /service/sync-with
Disallow /wordpress/
Disallow /vendor/
Disallow /wp-*/
Disallow /wp-*/
Disallow /cgi-bin/
Disallow /jp
Disallow /jp?*
Disallow /jp/
Disallow /tw
Disallow /tw?*
Disallow /tw/
Disallow /hk
Disallow /hk?*
Disallow /hk/
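
Several of the paths above use the `*` and `$` wildcards. The sketch below shows how such patterns are commonly interpreted (as described in RFC 9309 and Google's robots.txt documentation): `*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a literal prefix match. The helper names `robots_pattern_to_regex` and `is_disallowed` and the `RULES` subset are illustrative assumptions, not part of the scan data, and the sketch ignores `Allow` precedence and percent-encoding.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (hypothetical helper).

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    All other characters, including '.', are matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches the path from its first character."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)


# A subset of the Disallow paths listed for the '*' group above.
RULES = ["/*.php$", "/*.php?*", "/.env", "/.*", "/wp-*/", "/jp", "/jp/"]

for candidate in ["/index.php", "/index.php?id=1", "/.git/config", "/jpeg-sale", "/sale/item"]:
    print(candidate, is_disallowed(candidate, RULES))
```

Under this interpretation the `.` in `/.*` is a literal dot, so that rule already covers `/.env` and `/.well-known/`, and the bare `/jp` prefix also covers any path that merely starts with those characters, such as `/jpeg-sale`.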

chatglm

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

paqlebot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

babbar

Rule Path
Disallow /

muckrack

Rule Path
Disallow /

criteobot

Rule Path
Disallow /

censysinspect

Rule Path
Disallow /

cincraw

Rule Path
Disallow /

go-http-client

Rule Path
Disallow /

unknown

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

bytedance

Rule Path
Disallow /

spider-feedback@bytedance.com

Rule Path
Disallow /

bytedance.com

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

coccocbot-web

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

geedobot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

siteauditbot

Rule Path
Disallow /

semrushbot-ba

Rule Path
Disallow /

semrushbot-si

Rule Path
Disallow /

semrushbot-swa

Rule Path
Disallow /

semrushbot-ct

Rule Path
Disallow /

splitsignalbot

Rule Path
Disallow /

semrushbot-coub

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.1

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

etaospider

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

new-sogou-spider

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sogou mobile spider

Rule Path
Disallow /

sogou inst spider

Rule Path
Disallow /

sogou pic spider

Rule Path
Disallow /

sogou head spider

Rule Path
Disallow /

sogou orion spider

Rule Path
Disallow /

sogou-test-spider

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou pic agent

Rule Path
Disallow /

sogou spider2

Rule Path
Disallow /

sogou blog

Rule Path
Disallow /

sogou news spider

Rule Path
Disallow /

sogou orion spider

Rule Path
Disallow /

chinasospider

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

etaospider

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

proximic

Rule Path
Disallow /

mozilla/5.0 (compatible;picmole/1.0 +http://www.picmole.com)

Rule Path
Disallow /

lexxebot/1.0

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

mozilla/5.0 (compatible; spbot/2.0; http://www.seoprofiler.com/bot/ )

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

sosospider+(+http://help.soso.com/webspider.htm)

Rule Path
Disallow /

sitebot/0.1

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

netseer crawler

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /

lexxebot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

discobot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm

Product Comment
sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm 07)
Rule Path
Disallow /

sistrix

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

garlikcrawler/1.1 (http://garlik.com/, crawler@garlik.com)

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

mozilla/4.0 (compatible; msie 5.0; windows nt; digext; dts agent

Rule Path
Disallow /

psbot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

addthis.com robot tech.support@clearspring.com

Rule Path
Disallow /

addthis.com

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

proximic

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

bl.uk_lddc_bot

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

bender

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

yasni

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

exabot

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

linguee

Rule Path
Disallow /

integromedb

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bdcbot

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

wesee:search

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

admantx

Rule Path
Disallow /

spbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

sogou develop spider

Rule Path
Disallow /

sogou head spider

Rule Path
Disallow /

sogou js robot

Rule Path
Disallow /

sogou orion spider

Rule Path
Disallow /

sogou pic agent

Rule Path
Disallow /

sogou pic spider

Rule Path
Disallow /

sogou push spider

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sogou-test-spider

Rule Path
Disallow /

powermapper

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

ezooms robot

Rule Path
Disallow /

perl lwp

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

turnitin robot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

turnitin bot

Rule Path
Disallow /

turnitinbot/3.0 (http://www.turnitin.com/robot/crawlerinfo.html)

Rule Path
Disallow /

turnitinbot/3.0

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

eccp/1.0 (search@eniro.com)

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

gsa-crawler (enterprise; t4-knhh62cdkc2w3; gsa_manage@nikon-sys.co.jp)

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

yandex

Rule Path
Disallow /
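
As a rough illustration of how a crawler might pick its rules from the groups listed above, the sketch below looks up a lower-cased product token and falls back to the `*` group when there is no specific entry. This is a simplification under stated assumptions: the function `select_group` and the `GROUPS` mapping are hypothetical, only a few groups from the scan are included, and real parsers may merge repeated groups or match tokens differently.

```python
def select_group(product_token: str, groups: dict[str, list[str]]) -> list[str]:
    """Return the Disallow paths for a crawler's product token (hypothetical helper).

    Matching is case-insensitive and exact; when no group names the token,
    the wildcard '*' group applies.
    """
    token = product_token.lower()
    if token in groups:
        return groups[token]
    return groups.get("*", [])


# A small subset of the groups shown in this scan, keyed by lower-cased name.
GROUPS = {
    "*": ["/favicon.ico", "/msg-pic/", "/sale/redirect/"],
    "petalbot": ["/"],
    "ahrefsbot": ["/"],
}

print(select_group("PetalBot", GROUPS))
print(select_group("Googlebot", GROUPS))
```

With this lookup, a crawler named in its own group is blocked from the whole site by `Disallow /`, while an unlisted crawler is subject only to the path rules of the `*` group.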

Comments

  • https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt
  • Disallow: //
  • Disallow: /nz
  • Disallow: /nz?*
  • Disallow: /nz/
  • Disallow: /doc/
  • Crawl-delay: 1
  • User-agent: Mediapartners-Google
  • Allow: /nz
  • Allow: /nz?*
  • Allow: /nz/
  • User-agent: Google-Display-Ads-Bot
  • Allow: /nz
  • Allow: /nz?*
  • Allow: /nz/
  • To block Coccocbot from crawling:
  • To block Dotbot from crawling:
  • To block Common Crawl Bot from crawling:
  • To block Common Crawl Bot from crawling:
  • To block GeedoBot from crawling:
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • To block PetalBot from crawling:
  • To block AhrefsBot from crawling:
  • Blocked SemrushBot since it is creating lot of requests
  • To block SemrushBot from crawling your site for different SEO and technical issues:
  • To block SemrushBot from crawling your site for Backlink Audit tool:
  • To block SemrushBot from crawling your site for On Page SEO Checker tool and similar tools:
  • To block SemrushBot from checking URLs on your site for SWA tool:
  • To block SemrushBot from crawling your site for Content Analyzer and Post Tracking tools:
  • To block SplitSignalBot from crawling your site for SplitSignal tool:
  • To block SemrushBot-COUB from crawling your site for Content Outline Builder tool:
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block Etaospider
  • Sogou Bot Blocking
  • Block MJ12bot as it is just noise
  • Block Ahrefs
  • Block Sogou
  • Block SEOkicks
  • Block BlexBot
  • Block SISTRIX
  • Block Uptime robot
  • Block Ezooms Robot
  • Block Perl LWP
  • Block BlexBot
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • Block WiseGuys Robot
  • Block Turnitin Robot
  • Block Heritrix
  • Block pricepi
  • Block Searchmetrics Bot
  • Block Eniro
  • Block SoGou
  • Block Youdao
  • Block Nikon JP Crawler
  • Block MegaIndex.ru
  • User-agent: Baiduspider
  • Disallow: /
  • Block YandexBot

Warnings

  • 4 invalid lines.