
Robots Exclusion Standard data for acp.it

Resource Scan

Scan Details

Site Domain acp.it
Base Domain acp.it
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-06-12T04:42:49+00:00
Next Scan 2024-06-19T04:42:49+00:00

Last Successful Scan

Scanned 2024-05-28T02:32:56+00:00
URL https://acp.it/robots.txt
Domain IPs 195.35.24.150
Response IP 195.35.24.150
Found Yes
Hash e9dbe0f51f2df87562164247a5c3529a227c2ec6a7021aebe8241ef494cef872
SimHash b65e5ae284b3

Groups

baiduspider
yandex
uptimebot
dataprovider.com
mj12bot
ahrefsbot
ccbot
petalbot
buckyohare
buck
gptbot
blexbot
seznambot
sogou spider
seokicks-robot
seokicks
discobot
blekkobot
blexbot
sistrix crawler
ezooms robot
netestate ne crawler
wiseguys robot
turnitin robot
babya discoverer
exabot
zealbot
sitesnagger
webstripper
webcopier
fetch
offline explorer
teleport
teleportpro
acunetix
webzip
linko
httrack
larbin
libwww
zyborg
download ninja
k2spider
webreaper
woorank
checkmarknetwork/1.0 (+http://www.checkmarknetwork.com/spider.html)

Rule Path
Disallow /
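
The group above denies the whole site (Disallow /) to every listed user agent. A minimal sketch of how a crawler might decide whether it falls into this group is shown below. It assumes case-insensitive substring matching of the product token against the User-Agent header, which is a common but not universal interpretation; the helper name and the shortened token list are hypothetical and only a subset of the group is reproduced.

```python
# Hypothetical subset of the product tokens listed in the blanket-deny group.
BLOCKED_TOKENS = ["gptbot", "ccbot", "ahrefsbot", "mj12bot", "httrack"]


def is_fully_blocked(user_agent):
    """Return True if the User-Agent string contains a blocked product token.

    Assumption: tokens are compared case-insensitively as substrings.
    """
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


if __name__ == "__main__":
    print(is_fully_blocked("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(is_fully_blocked("Googlebot/2.1"))
```

An agent that does not match any token in this group falls through to the * group that follows.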

*

Rule Path
Disallow /*login
Disallow /*/*login
Disallow /*/accesso_negato
Disallow /*/image_captcha
Disallow /*.py
Disallow /*.ini
Disallow /sito.cfg
Disallow /cgi-bin/*
Disallow /old_contatti
Disallow /asset*
Disallow /blog/*
Disallow /content*
Disallow /event*
Disallow /file*
Disallow /glossary*
Disallow /link*
Disallow /forum*
Disallow /misc*
Disallow /old*
Disallow /sites*
Disallow /system*
Disallow /taxonomy*
Disallow /tmp/*
Disallow /user/*
Disallow /wp-*

Other Records

Field Value
crawl-delay 10
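
The * group relies on wildcard patterns, which Python's standard urllib.robotparser does not interpret. The sketch below shows one way the patterns could be evaluated, assuming the usual semantics: * matches any run of characters, a trailing $ anchors the end of the path, and every rule is otherwise a prefix match. The function names are hypothetical, only a few of the rules above are reproduced, and because this group contains no Allow rules, no longest-match precedence is implemented.

```python
import re

# A few of the Disallow patterns from the * group above.
DISALLOW = ["/*login", "/*.py", "/cgi-bin/*", "/blog/*", "/wp-*", "/sito.cfg"]

# Crawl delay in seconds, taken from the crawl-delay record of this group.
CRAWL_DELAY_SECONDS = 10


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" where the pattern had "*".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_disallowed(path):
    """Return True if any Disallow pattern matches the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    for path in ["/it/login", "/blog/post", "/blog", "/index.html"]:
        print(path, is_disallowed(path))
```

A client honoring the crawl-delay record would additionally wait CRAWL_DELAY_SECONDS between consecutive requests.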

Other Records

Field Value
sitemap https://acp.it/sitemap_index.xml

Comments

  • FROM https://en.wikipedia.org/robots.txt
  • ALL GOOD SPIDER
  • PATH DENIED

Warnings

  • `host` is not a known field.
  • `request-rate` is not a known field.
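
Warnings of this kind can be reproduced with a simple field check. The sketch below is an illustration only: the set of known fields is an assumption about what the scanner accepts, the function name is hypothetical, and the sample text is invented for the example rather than copied from the scanned file.

```python
# Assumed set of recognized field names; the scanner's actual list is not given.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt):
    """Return unrecognized field names in order of first appearance."""
    seen = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Invented sample input, not the contents of https://acp.it/robots.txt.
SAMPLE = """\
User-agent: *
Disallow: /tmp/
Host: example.com
Request-rate: 1/5
Crawl-delay: 10
"""

if __name__ == "__main__":
    for name in unknown_fields(SAMPLE):
        print("`%s` is not a known field." % name)
```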