steampunk-explorer.com
robots.txt

Robots Exclusion Standard data for steampunk-explorer.com

Resource Scan

Scan Details

Site Domain steampunk-explorer.com
Base Domain steampunk-explorer.com
Scan Status Ok
Last Scan 2025-12-02T10:54:37+00:00
Next Scan 2026-01-01T10:54:37+00:00

Last Scan

Scanned 2025-12-02T10:54:37+00:00
URL https://steampunk-explorer.com/robots.txt
Domain IPs 104.26.8.71, 104.26.9.71, 172.67.74.198, 2606:4700:20::681a:847, 2606:4700:20::681a:947, 2606:4700:20::ac43:4ac6
Response IP 104.26.9.71
Found Yes
Hash 53c937e84310cf2f26a1eb81de4d5fcf149a67714bd350613629362fcf74dd47
SimHash e43ac911c7d4
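The Hash value is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file body (an assumption — the report does not name the algorithm). A minimal Python sketch of how such a fingerprint could be recomputed:

```python
import hashlib

def robots_fingerprint(body: bytes) -> str:
    """Hex SHA-256 digest of a robots.txt body, assuming that is the
    algorithm behind the report's Hash field."""
    return hashlib.sha256(body).hexdigest()

# Stand-in body; comparing against the report's value would require
# fetching https://steampunk-explorer.com/robots.txt and hashing the
# exact bytes of the response.
sample = b"User-agent: *\nAllow: /\n"
print(robots_fingerprint(sample))  # 64 lowercase hex characters
```

A matching digest on a later fetch would indicate the file is unchanged since the last scan.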

Groups

*

Rule Path
Allow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

*

Rule Path
Disallow /core/
Disallow /profiles/
Disallow /README.md
Disallow /web.config
Disallow /admin
Disallow /comment/reply
Disallow /filter/tips
Disallow /node/add
Disallow /search
Disallow /user/register
Disallow /user/password
Disallow /user/login
Disallow /user/logout
Disallow /city-guides/listings
Disallow /events
Disallow /?q=admin
Disallow /?q=comment%2Freply
Disallow /?q=filter%2Ftips
Disallow /?q=node%2Fadd
Disallow /?q=search
Disallow /?q=user%2Fpassword
Disallow /?q=user%2Fregister
Disallow /?q=user%2Flogin
Disallow /?q=user%2Flogout

Other Records

Field Value
crawl-delay 30

ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot-extended
awario
bedrockbot
brightbot 1.0
bytespider
ccbot
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
datenbank crawler
devin
diffbot
echobot bot
echoboxbot
facebookbot
factset_spyderbot
firecrawlagent
friendlycrawler
gemini-deep-research
google-cloudvertexbot
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
mistralai-user
mistralai-user/1.0
mycentralaiscraperbot
netestate imprint crawler
novaact
omgili
omgilibot
operator
pangubot
panscient
panscient.com
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
summalybot
thinkbot
tiktokspider
timpibot
velenpublicwebcrawler
wardbot
webzio-extended
wpbot
yandexadditional
yandexadditionalbot
youbot

Rule Path
Disallow /
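The groups above can be exercised with Python's standard urllib.robotparser. This is a minimal sketch using a hand-written fragment that mirrors (not reproduces) the reported rules; note that urllib.robotparser honors only the first `User-agent: *` group, so the wildcard crawl-delay and path rules are merged into one group here:

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment mirroring the rules in the report above,
# not the literal contents of the live robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Crawl-delay: 30
Disallow: /admin
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("GPTBot", "/"))         # False: blanket AI-bot block
print(parser.can_fetch("MyCrawler", "/"))      # True: wildcard group permits it
print(parser.can_fetch("MyCrawler", "/admin")) # False: path is disallowed
print(parser.crawl_delay("MyCrawler"))         # 30
```

`MyCrawler` is a hypothetical user-agent name used only to exercise the wildcard group.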

Comments

  • As a condition of accessing this website, you agree to abide by the following content signals:
  • (a) If a content-signal = yes, you may collect content for the corresponding use.
  • (b) If a content-signal = no, you may not collect content for the corresponding use.
  • (c) If the website operator does not include a content signal for a corresponding use, the website operator neither grants nor restricts permission via content signal with respect to the corresponding use.
  • The content signals and their meanings are:
  • search: building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website's contents). Search does not include providing AI-generated search summaries.
  • ai-input: inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers).
  • ai-train: training or fine-tuning AI models.
  • ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
  • BEGIN Cloudflare Managed content
  • END Cloudflare Managed Content
  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "robots" where not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/robotstxt.html
  • For syntax checking, see: http://www.robotstxt.org/checker.html
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Block AI crawlers
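The content-signal comments above describe a non-standard convention (Cloudflare's Content Signals proposal), which is why the field is flagged in the Warnings. The exact lines of this file are not reproduced in the report, but as an illustration only, a Content-Signal record attached to a group takes a shape like:

```
User-agent: *
Content-Signal: search=yes, ai-input=no, ai-train=no
Allow: /
```

The signal values here are hypothetical, not the ones this site publishes.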

Warnings

  • `content-signal` is not a known field.