links.giant.rocks
robots.txt

Robots Exclusion Standard data for links.giant.rocks

Resource Scan

Scan Details

Site Domain links.giant.rocks
Base Domain giant.rocks
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-12-29T23:35:20+00:00
Next Scan 2026-03-29T23:35:20+00:00

Last Successful Scan

Scanned 2025-08-25T01:14:25+00:00
URL https://links.giant.rocks/robots.txt
Domain IPs 24.144.120.18, 2604:a880:400:d0::205d:b001
Response IP 24.144.120.18
Found Yes
Hash b3a816074f9cd11edfa3681f5b434f9cd3695e05c92fdea9f819317b495a996f
SimHash 2453cb01c2c0

Groups

adsbot-google
adsbot-google-mobile
adsbot-google-mobile-apps
adidxbot
algolia crawler
applebot
applenewsbot
baiduspider
baiduspider-image
baiduspider-news
baiduspider-video
bingbot
bingpreview
bublupbot
ccbot
cliqzbot
coccoc
coccocbot-image
coccocbot-web
daumoa
dazoobot
deusu
duckduckbot
duckduckgo-favicons-bot
euripbot
exploratodo
facebookcatalog
facebookexternalhit
facebot
feedly
findxbot
gooblog
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
haosouspider
ichiro
istellabot
jikespider
lycos
mail.ru
mediapartners-google
microsoftpreview
mojeekbot
msnbot
msnbot-media
orangebot
pinterest
plukkie
qwantify
rambler
semanticscholarbot
seznambot
sosospider
slurp
sogou blog
sogou inst spider
sogou news spider
sogou orion spider
sogou spider2
sogou web spider
twitterbot
whatsapp
yacybot
yandex
yandexmobilebot
yepbot
yeti
yioopbot
yoozbot
youdaobot
*
addsearchbot
ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot-extended
awario
bedrockbot
bigsur.ai
brightbot 1.0
bytespider
chatgpt agent
chatgpt-user
claude-searchbot
claude-user
claude-web
claudebot
cloudvertexbot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
datenbank crawler
devin
diffbot
duckassistbot
echobot bot
echoboxbot
facebookbot
factset_spyderbot
firecrawlagent
friendlycrawler
gemini-deep-research
google-cloudvertexbot
google-extended
googleagent-mariner
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
linerbot
meta-externalagent
meta-externalagent
meta-externalfetcher
meta-externalfetcher
mistralai-user
mistralai-user/1.0
mycentralaiscraperbot
netestate imprint crawler
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
panscient
panscient.com
perplexity-user
perplexitybot
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
thinkbot
tiktokspider
timpibot
velenpublicwebcrawler
wardbot
webzio-extended
wpbot
yak
yandexadditional
yandexadditionalbot
youbot

Rule Path
Disallow (empty)
Disallow /
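
The two rules above behave very differently: an empty Disallow blocks nothing, while Disallow / blocks every path. The following is a minimal sketch of that difference using Python's standard urllib.robotparser. The raw robots.txt is not reproduced in this report, so SAMPLE_ROBOTS_TXT is a hypothetical excerpt, and the pairing of each rule with a group (search engines allowed, "*" and AI crawlers blocked) is an assumption drawn from the comments "crawling rule(s) for above bots" and "disallow all other bots". The helper name allowed is illustrative only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: the scan lists group names and rules, not the raw
# file, so this pairing of groups and rules is an assumption.
SAMPLE_ROBOTS_TXT = """\
# crawling rule(s) for above bots
User-agent: googlebot
User-agent: bingbot
Disallow:

# disallow all other bots
User-agent: *
User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network request is made.
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str = "https://links.giant.rocks/") -> bool:
    """Return True if the sample rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


for agent in ("Googlebot", "GPTBot", "SomeUnlistedBot"):
    print(agent, allowed(agent))
```

Under these assumptions, Googlebot is allowed because its group carries the empty Disallow, while GPTBot and any unlisted crawler fall under the "*" group and are refused. Other robots.txt parsers may match user-agent names differently, so treat this as an illustration of the rule semantics rather than of how any particular crawler behaves.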

Comments

  • robots.txt merged from multiple sources
  • Source 1: https://www.ditig.com/robots.txt
  • ROBOTS.TXT
  • Updates and information can be found at:
  • https://www.ditig.com/publications/robots-txt-template
  • This document is licensed with a CC BY-NC-SA 4.0 license.
  • Last update: 2025-03-04
  • so.com chinese search engine
  • google.com landing page quality checks
  • google.com app resource fetcher
  • bing ads bot
  • algolia.com search
  • apple.com search engine
  • baidu.com chinese search engine
  • bing.com international search engine
  • bublup.com suggestion/search engine
  • commoncrawl.org open repository of web crawl data
  • cliqz.com german in-product search engine
  • coccoc.com vietnamese search engine
  • daum.net korean search engine
  • dazoo.fr french search engine
  • deusu.de german search engine
  • duckduckgo.com international privacy search engine
  • eurip.com european search engine
  • exploratodo.com latin search engine
  • facebook.com social network
  • feedly.com feed fetcher
  • findx.com european search engine
  • goo.ne.jp japanese search engine
  • google.com international search engine
  • so.com chinese search engine
  • goo.ne.jp japanese search engine
  • istella.it italian search engine
  • jike.com / chinaso.com chinese search engine
  • lycos.com & hotbot.com international search engine
  • mail.ru russian search engine
  • google.com adsense bot
  • Preview bot for Microsoft products
  • mojeek.com search engine
  • bing.com international search engine
  • orange.com international search engine
  • pinterest.com social network
  • botje.nl dutch search engine
  • qwant.com french search engine
  • rambler.ru russian search engine
  • semanticscholar.org scientific search engine
  • seznam.cz czech search engine
  • soso.com chinese search engine
  • yahoo.com international search engine
  • sogou.com chinese search engine
  • twitter.com social media bot
  • whatsapp.com preview bot
  • yacy.net p2p search software
  • yandex.com russian search engine
  • yep.com search engine
  • search.naver.com south korean search engine
  • yioop.com international search engine
  • yooz.ir iranian search engine
  • youdao.com chinese search engine
  • crawling rule(s) for above bots
  • disallow all other bots
  • ----
  • Additional rules from: https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/refs/heads/main/robots.txt
  • ----

Warnings

  • 3 invalid lines.