/.well-known/

Robots Exclusion Standard data for puha.no

Resource Scan

Scan Details

Site Domain	puha.no
Base Domain	puha.no
Scan Status	Ok
Last Scan	2026-01-08T21:55:33+00:00
Next Scan	2026-01-09T21:55:33+00:00

Last Scan

Scanned	2026-01-08T21:55:33+00:00
URL	https://puha.no/robots.txt
Domain IPs	2a02:2350:5:10c:8017:a682:ff35:8709, 46.30.213.162
Response IP	46.30.213.162
Found	Yes
Hash	f3dbc13005d884fcbbd3ab2b1517998298d7f5ea845a002b05af063e701d9a7a
SimHash	76c00b07c7c0

Groups

*

Rule	Path
Disallow	/wp-admin/
Disallow	/paasche/

gptbot
claudebot
claude-user
claude-searchbot
ccbot
google-extended
applebot-extended
facebookbot
meta-externalagent
meta-externalfetcher
diffbot
perplexitybot
perplexity‑user
omgili
omgilibot
webzio-extended
imagesiftbot
bytespider
tiktokspider
amazonbot
youbot
semrushbot-ocob
petalbot
velenpublicwebcrawler
turnitinbot
timpibot
oai-searchbot
icc-crawler
ai2bot
ai2bot-dolma
dataforseobot
awariobot
awariosmartbot
awariorssbot
google-cloudvertexbot
pangubot
kangaroo bot
sentibot
img2dataset
meltwater
seekr
peer39_crawler
cohere-ai
cohere-training-data-crawler
duckassistbot
scrapy
cotoyogi
aihitbot
factset_spyderbot
firecrawlagent

Rule	Path
Disallow	/

*

Rule	Path
Allow	/
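
The groups above can be checked with a minimal sketch using Python's standard urllib.robotparser. The robots.txt text below is a hypothetical reconstruction from this report, not the original file: it lists only three of the blocked agents, and it leaves out the second `*` group (Allow /) because Python's parser keeps only the first `*` group it sees. The helper name `allowed` and the agent name `somebot` are illustrative assumptions. `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of part of the scanned file (assumption:
# the real file lists many more agents and has extra experimental fields).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /paasche/

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /

Sitemap: https://puha.no/wp-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://puha.no" + path)


# A listed AI crawler is blocked everywhere.
print(allowed("gptbot", "/"))
# An unlisted crawler falls back to the `*` group.
print(allowed("somebot", "/"))
print(allowed("somebot", "/wp-admin/"))
# Sitemap record (Python 3.8+).
print(parser.site_maps())
```

Other parsers may resolve the two `*` groups differently (for example by merging them), so this sketch only illustrates the rules that are unambiguous in the report.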

Other Records

Field	Value
sitemap	https://puha.no/wp-sitemap.xml

Comments

Block all known AI crawlers and assistants
from using content for training AI models.
Source: https://robotstxt.com/ai
Block any non-specified AI crawlers (e.g., new
or unknown bots) from using content for training
AI models, while allowing the website to be
indexed and accessed by bots. These directives
are still experimental and may not be supported
by all AI crawlers.

Warnings

`content-usage` is not a known field.
`disallowaitraining` is not a known field.
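
Warnings like these can be reproduced with a small field check, sketched below. The set of known fields is an assumption (the scanner's actual list is not given in this report), and the sample values for `Content-Usage` and `DisallowAITraining` are hypothetical, since the report shows only the field names.

```python
# Assumed set of recognised fields; the scanner's real list is not stated.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(robots_txt):
    """Return unrecognised field names, lower-cased, in first-seen order."""
    seen = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical sample; the field values are illustrative only.
SAMPLE = """\
User-agent: *
Content-Usage: ai=n
DisallowAITraining: /
Allow: /
"""

warnings = ["`%s` is not a known field." % name for name in unknown_fields(SAMPLE)]
for message in warnings:
    print(message)
```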