data-alliance.net
robots.txt

Robots Exclusion Standard data for data-alliance.net

Resource Scan

Scan Details

Site Domain data-alliance.net
Base Domain data-alliance.net
Scan Status Ok
Last Scan 2025-10-30T15:06:03+00:00
Next Scan 2025-11-29T15:06:03+00:00

Last Scan

Scanned 2025-10-30T15:06:03+00:00
URL https://data-alliance.net/robots.txt
Redirect https://www.data-alliance.net/robots.txt
Redirect Domain www.data-alliance.net
Redirect Base data-alliance.net
Domain IPs 104.26.10.114, 104.26.11.114, 172.67.75.165, 2606:4700:20::681a:a72, 2606:4700:20::681a:b72, 2606:4700:20::ac43:4ba5
Redirect IPs 104.26.10.114, 104.26.11.114, 172.67.75.165, 2606:4700:20::681a:a72, 2606:4700:20::681a:b72, 2606:4700:20::ac43:4ba5
Response IP 104.26.11.114
Found Yes
Hash 3ac11e8042be129e64002ac526c48451c9549c49a6d64bfcfcf13d42bf8d3625
SimHash 7b9e5911c3e6

Groups

*

Rule Path
Disallow /brands/*
Disallow /productupdates.php
Disallow /remote.php
Disallow /viewfile.php
Disallow /admin/
Disallow /*sort%3D
Disallow /__socialshop/
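
The wildcard rules in this group can be checked against candidate paths with a small matcher. The following is a minimal sketch, not the scanner's own logic: it assumes the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path, and it compares raw strings without percent-encoding normalization. The names `rule_to_regex`, `is_disallowed`, and `DISALLOW` are hypothetical.

```python
import re

# Disallow paths copied from the "*" group above.
DISALLOW = [
    "/brands/*",
    "/productupdates.php",
    "/remote.php",
    "/viewfile.php",
    "/admin/",
    "/*sort%3D",
    "/__socialshop/",
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Assumption: '*' is a wildcard and a trailing '$' means end-of-path.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape the literal pieces, then rejoin them with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))

def is_disallowed(path: str) -> bool:
    """Return True if any Disallow rule matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)
```

For example, `is_disallowed("/brands/acme")` and `is_disallowed("/category?x=1&sort%3Dprice")` are both true under these assumptions, while `is_disallowed("/products/antenna")` is false.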

etaospider

Rule Path
Disallow /

mozilla/5.0 (compatible; etaospider/1.0; http://open.etao.com/dev/etaospider)

Rule Path
Disallow /

mozilla/5.0 (compatible; fatbot 2.0; http://www.thefind.com/crawler)

Rule Path
Disallow /

mozilla/5.0 (compatible; easouspider; +http://www.easou.com/search/spider.html)

Rule Path
Disallow /

mozilla/5.0 (compatible; ahrefsbot/5.0; +http://ahrefs.com/robot/)

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider+

Rule Path
Disallow /

baiduspider+ (+; http://www.baidu.com/search/spider.htm; ubuntu; compatible; version; platform; .net clr 2.0.50727; version)

Rule Path
Disallow /

baiduspider+(+http://www.baidu.com/search/spider.htm)

Rule Path
Disallow /

baiduspider+(+http://www.baidu.com/search/spider_jp.html)

Rule Path
Disallow /

*

Rule Path
Disallow /*?_bc_fsnf=1*
Disallow /*%26_bc_fsnf%3D1*

ai2bot
ai2bot-dolma
amazonbot
applebot
applebot-extended
bytespider
ccbot
chatgpt-user
claude-web
claudebot
diffbot
facebookbot
friendlycrawler
gptbot
google-extended
googleother
googleother-image
googleother-video
icc-crawler
isscyberriskcrawler
imagesiftbot
kangaroo bot
meta-externalagent
meta-externalfetcher
oai-searchbot
perplexitybot
petalbot
scrapy
sidetrade indexer bot
timpibot
velenpublicwebcrawler
webzio-extended
youbot
anthropic-ai
cohere-ai
facebookexternalhit
iaskspider/2.0
img2dataset
omgili
omgilibot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
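
A crawler that chooses to honor the crawl-delay value above could pace its requests as sketched below. This is only an illustration: crawl-delay is a non-standard directive that not every crawler honors, the value is assumed to be in seconds, and the `Throttle` class with its `clock` and `sleep` parameters is hypothetical.

```python
import time

class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, delay_seconds=10.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self.clock = clock      # injectable so the pacing can be tested
        self.sleep = sleep
        self.last = None        # time of the previous request, if any

    def wait(self):
        """Block until at least `delay` seconds have passed since the last call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.delay - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

Calling `wait()` before each request to the site keeps requests at least 10 seconds apart; the first call returns immediately.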