afiliadoscpa.com.br
robots.txt

Robots Exclusion Standard data for afiliadoscpa.com.br

Resource Scan

Scan Details

Site Domain	afiliadoscpa.com.br
Base Domain	afiliadoscpa.com.br
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2024-08-28T10:02:27+00:00
Next Scan	2024-11-26T10:02:27+00:00
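
The rescan interval implied by the two timestamps above can be checked with a short sketch. It only subtracts the listed values; the variable names are illustrative and nothing here is taken from the scanner itself.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-08-28T10:02:27+00:00")
next_scan = datetime.fromisoformat("2024-11-26T10:02:27+00:00")

# Difference between the scheduled next scan and the last attempt.
interval = next_scan - last_scan
print(interval.days)  # → 90
```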

Last Successful Scan

Scanned	2023-08-05T06:22:51+00:00
URL	https://afiliadoscpa.com.br/robots.txt
Domain IPs	184.174.36.112
Response IP	184.174.36.112
Found	Yes
Hash	9100d3da44985b6e0415f0755eadcb87674871fab269a7edafe3b0d31e8da285
SimHash	f8953111e640

Groups

mauibot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

seo spider

Rule	Path
Disallow	/

yandexbot

Rule	Path
Disallow	/

alexawebsearchplatform

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

betabot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

screaming frog seo spider

Rule	Path
Disallow	/

crawl

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

infoseek sidewinder

Rule	Path
Disallow	/

linkchecker

Rule	Path
Disallow	/

netsongbot

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

dataforseo-bot

Rule	Path
Disallow	/

mj12bot/

Rule	Path
Disallow	/

linkdexbot

Rule	Path
Disallow	/

adsbot

Rule	Path
Disallow	/

awariosmartbot

Rule	Path
Disallow	/

feedspot

Rule	Path
Disallow	/

sogou web spider

Rule	Path
Disallow	/

feedspot/1.0

Rule	Path
Disallow	/

keybot

Rule	Path
Disallow	/

zoominfobot

Rule	Path
Disallow	/

feedspot

Rule	Path
Disallow	/

checkhost

Rule	Path
Disallow	/

orbbot

Rule	Path
Disallow	/

yandeximages

Rule	Path
Disallow	/

wellknownbot

Rule	Path
Disallow	/

surdotlybot

Rule	Path
Disallow	/

gdnplus.com

Rule	Path
Disallow	/

online-webceo-bot

Rule	Path
Disallow	/

scrapy

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

alexibot

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

xenu

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

blekkobot

Rule	Path
Disallow	/

mecrawler

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

turnitinbot

Rule

Path

Disallow

python-requests

Rule

Path

Disallow

python-urllib/2.7

Rule

Path

Disallow

curl/7.35.0

Rule

Path

Disallow

wp_is_mobile

Rule

Path

Disallow

java/11.0.10

Rule

Path

Disallow

photon

Rule

Path

Disallow

photon/1.0

Rule

Path

Disallow

serendeputybot

Rule

Path

Disallow

webpage-inspector.com

Rule

Path

Disallow

twingly recon

Rule

Path

Disallow

bidtellect/0.0.958.0

Rule

Path

Disallow

slackbot-linkexpanding 1.0

Rule

Path

Disallow

df bot 1.0

Rule

Path

Disallow

who.is bot

Rule

Path

Disallow

sem rush bot

Rule

Path

Disallow

yandex bot

Rule

Path

Disallow

sogou bot

Rule

Path

Disallow

majestic bot

Rule

Path

Disallow

exalead bot

Rule

Path

Disallow

baidu bot

Rule

Path

Disallow

mediamathbot/1.0

Rule

Path

Disallow

trendictionbot0

Rule

Path

Disallow

omgili/0.5

Rule

Path

Disallow

iframely

Rule

Path

Disallow

cortex/1.0

Rule

Path

Disallow

trendictionbot0.5.0

Rule

Path

Disallow

domains project/1.3.7

Rule

Path

Disallow

trendictionbot

Rule

Path

Disallow

mediamathbot

Rule

Path

Disallow

omgili

Rule

Path

Disallow

universalfeedparser/5.2.1

Rule

Path

Disallow

woorankreview/2.0

Rule

Path

Disallow

feedbot/1.0

Rule

Path

Disallow

seekportbot

Rule

Path

Disallow

siteauditbot

Rule

Path

Disallow

daum/4.1

Rule

Path

Disallow

paperlibot/2.1

Rule

Path

Disallow

aboutusbot

Rule

Path

Disallow

neevabot/1.0

Rule

Path

Disallow

feedly/1.0

Rule

Path

Disallow

heritrix/3.3.0

Rule

Path

Disallow

wget

Rule

Path

Disallow

microsoft.url.control

Rule

Path

Disallow

ubicrawler

Rule

Path

Disallow

icc-crawler

Rule

Path

Disallow

sitecheck.internetseer.com

Rule

Path

Disallow

zealbot

Rule

Path

Disallow

webstripper

Rule

Path

Disallow

webcopier

Rule

Path

Disallow

httrack

Rule

Path

Disallow

libwww

Rule

Path

Disallow

baiduspider-image

Rule

Path

Disallow

k2spider

Rule

Path

Disallow

megaindex.ru

Rule

Path

Disallow

megaindex.com

Rule

Path

Disallow

linguee bot

Rule

Path

Disallow

mappy

Rule

Path

Disallow

garlikcrawler

Rule

Path

Disallow

feedbot

Rule

Path

Disallow

feedlyapp

Rule

Path

Disallow

feedly

Rule

Path

Disallow

wordpress

Rule

Path

Disallow

heritrix

Rule

Path

Disallow

fullstorybot

Rule

Path

Disallow

ias-va/3.1

Rule

Path

Disallow

xpanse-bot

Rule

Path

Disallow

phxbot/0.1

Rule

Path

Disallow

http://2ip.io

Rule

Path

Disallow

semrushbot

Rule

Path

Disallow

ahrefsbot

Rule

Path

Disallow

mojeebot

Rule

Path

Disallow

siteexplorer

Rule

Path

Disallow

uptimebot

Rule

Path

Disallow

screaming frog seo spider

Rule

Path

Disallow

phxbot

Rule

Path

Disallow

zgrab/0.x

Rule

Path

Disallow

wappalyzer

Rule

Path

Disallow

checkmarknetwork/1.0

Rule

Path

Disallow

duckduckgo-favicons-bot/1.0;

Rule

Path

Disallow

semrushbot

Rule

Path

Disallow

gptbot

Rule

Path

Disallow

*

Rule	Path
Disallow	/admin/
Disallow	/auth/
Disallow	/assets/browser-update*.js
Disallow	/email/
Disallow	/session
Disallow	/user-api-key
Disallow	/*?api_key*
Disallow	/*?*api_key*
Disallow	/badges
Disallow	/u/
Disallow	/my
Disallow	/search
Disallow	/tag/*/l
Disallow	/t/*/*.rss
Disallow	/c/*.rss
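
Several paths in this group use the `*` wildcard. The sketch below shows one way such patterns can be matched against a request path, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. The helper names `pattern_to_regex` and `is_disallowed` and the sample paths are hypothetical; this is not the scanner's own implementation.

```python
import re

# A subset of the Disallow paths listed for the "*" group above.
PATTERNS = [
    "/admin/",
    "/assets/browser-update*.js",
    "/*?api_key*",
    "/search",
    "/tag/*/l",
    "/t/*/*.rss",
    "/c/*.rss",
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, patterns=PATTERNS):
    """Return True if any pattern matches the path from its start."""
    return any(pattern_to_regex(p).match(path) for p in patterns)

print(is_disallowed("/t/some-topic/42.rss"))  # → True
print(is_disallowed("/t/some-topic/42"))      # → False
print(is_disallowed("/latest?api_key=abc"))   # → True
```

Note that Python's standard `urllib.robotparser` does plain prefix matching and does not interpret `*` inside a path, which is why the sketch uses its own matcher.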

googlebot

Rule	Path
Disallow	/admin/
Disallow	/auth/
Disallow	/assets/browser-update*.js
Disallow	/email/
Disallow	/session
Disallow	/user-api-key
Disallow	/*?api_key*
Disallow	/*?*api_key*
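
Because googlebot has its own group, a crawler following the Robots Exclusion Protocol applies that group instead of the `*` group, so paths such as /badges that appear only under `*` are not blocked for it. A minimal sketch with Python's `urllib.robotparser`, using a hand-written fragment reconstructed from the groups above (not the original file) and a hypothetical agent name `somebot`:

```python
from urllib.robotparser import RobotFileParser

# Reduced fragment reconstructed from the groups listed above;
# only plain prefix rules are used, since robotparser ignores wildcards.
ROBOTS_TXT = """\
User-agent: mauibot
Disallow: /

User-agent: *
Disallow: /admin/
Disallow: /badges

User-agent: googlebot
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent, path):
    """Check a path on the scanned host for the given user agent."""
    return parser.can_fetch(agent, "https://afiliadoscpa.com.br" + path)

print(allowed("mauibot", "/"))          # → False (own group blocks everything)
print(allowed("googlebot", "/badges"))  # → True  (own group has no /badges rule)
print(allowed("somebot", "/badges"))    # → False (falls back to the * group)
```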

Other Records

Field	Value
sitemap	https://afiliadoscpa.com.br/sitemap.xml

Comments

See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file

Warnings

2 invalid lines.
