Robots Exclusion Standard data for abuse.shaunc.com

Resource Scan

Scan Details

Site Domain abuse.shaunc.com
Base Domain shaunc.com
Scan Status Ok
Last Scan 2024-05-04T06:27:16+00:00
Next Scan 2024-05-18T06:27:16+00:00

Last Scan

Scanned 2024-05-04T06:27:16+00:00
URL https://abuse.shaunc.com/robots.txt
Domain IPs 172.93.52.73
Response IP 172.93.52.73
Found Yes
Hash 782b69c66b6a88eaa036d043222784846a35461a318ee32408d40cfaac90bcba
SimHash 5155407665d1

Groups

twitterbot

Rule Path
Allow *

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

ltx71 - (http://ltx71.com/)

Rule Path
Disallow /

mixnodecache

Rule Path
Disallow /

checkmarknetwork/1.0 (+http://www.checkmarknetwork.com/spider.html)

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

ioncrawl

Rule Path
Disallow /

daum

Rule Path
Disallow /

femtosearchbot

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

senutobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dataprovider

Rule Path
Disallow /

neevabot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /
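
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT_EXCERPT` text is an assumption: it is reconstructed from a few of the groups in this report, and the live file may differ in casing, ordering, and comments. The helper names `build_parser` and `is_allowed`, and the agent `SomeOtherBot`, are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the groups listed in this report.
# Assumption: the real robots.txt may use different casing or layout.
ROBOTS_TXT_EXCERPT = """\
User-agent: twitterbot
Allow: *

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: ccbot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str,
               url: str = "https://abuse.shaunc.com/") -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT_EXCERPT)
    # "SomeOtherBot" is a hypothetical agent that has no group in the file.
    for agent in ("Twitterbot", "GPTBot", "ClaudeBot", "CCBot", "SomeOtherBot"):
        print(agent, is_allowed(parser, agent))
```

Agents whose group contains `Disallow /` are refused for every path, while an agent with no matching group is permitted by default, because the file defines no `*` group.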

Comments

  • Sitemap for incident content
  • Sitemap: https://abuse.shaunc.com/sitemap.xml