sjl.fi
robots.txt

Robots Exclusion Standard data for sjl.fi

Resource Scan

Scan Details

Site Domain	sjl.fi
Base Domain	sjl.fi
Scan Status	Ok
Last Scan	2025-07-27T19:42:15+00:00
Next Scan	2025-08-03T19:42:15+00:00

Last Scan

Scanned	2025-07-27T19:42:15+00:00
URL	https://sjl.fi/robots.txt
Redirect	https://www.sjl.fi:443/robots.txt
Redirect Domain	www.sjl.fi
Redirect Base	sjl.fi
Domain IPs	54.246.245.212
Redirect IPs	18.172.242.126, 18.172.242.19, 18.172.242.69, 18.172.242.74
Response IP	18.165.72.129
Found	Yes
Hash	31cd8e79aae59001df5cb3baa00e3d5763aa99c2e20ec272b052a274c8941e2a
SimHash	7230f15d3514

Groups

amazonbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

imagesiftbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

oai-searchbot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

youbot

Rule	Path
Disallow	/
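
The nineteen groups above, amazonbot through youbot, each carry a single `Disallow: /` rule, which blocks the named crawler from the whole site. A minimal sketch of how a client might evaluate such groups with Python's standard `urllib.robotparser` follows. The `SAMPLE` text is a reconstruction from two rows of this report, not the verbatim file, and `SomeOtherBot` is a hypothetical agent name used only for contrast.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables above (assumption: not the verbatim file).
SAMPLE = """\
User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# A listed crawler is refused everything under "/".
gptbot_allowed = parser.can_fetch("gptbot", "https://www.sjl.fi/")

# An agent with no matching group (and no "*" group) falls through to "allowed".
other_allowed = parser.can_fetch("SomeOtherBot", "https://www.sjl.fi/")

print(gptbot_allowed, other_allowed)
```

Note that `urllib.robotparser` does plain prefix matching and does not interpret the `*` and `$` wildcards used in the googlebot group.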

Rule

Path

Disallow

googlebot

Rule	Path
Disallow	/kaupalliset/*.jpg$
Disallow	/kaupalliset/*.Jpg$
Disallow	/kaupalliset/*.jPg$
Disallow	/kaupalliset/*.jpG$
Disallow	/kaupalliset/*.jPG$
Disallow	/kaupalliset/*.JPg$
Disallow	/kaupalliset/*.JpG$
Disallow	/kaupalliset/*.JPG$
Disallow	/kaupalliset/*.png$
Disallow	/kaupalliset/*.Png$
Disallow	/kaupalliset/*.pNg$
Disallow	/kaupalliset/*.pnG$
Disallow	/kaupalliset/*.pNG$
Disallow	/kaupalliset/*.PNg$
Disallow	/kaupalliset/*.PnG$
Disallow	/kaupalliset/*.PNG$
Disallow	/kaupalliset/*.gif$
Disallow	/kaupalliset/*.Gif$
Disallow	/kaupalliset/*.gIf$
Disallow	/kaupalliset/*.giF$
Disallow	/kaupalliset/*.gIF$
Disallow	/kaupalliset/*.GIf$
Disallow	/kaupalliset/*.GiF$
Disallow	/kaupalliset/*.GIF$
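
The 24 googlebot rules spell out every upper/lower-case combination of three image extensions, presumably because robots.txt path matching is case-sensitive. The sketch below regenerates that rule set and checks paths against it by translating `*` (any run of characters) and a trailing `$` (end of path) into a regular expression. The helper names and the example paths are hypothetical, and the translation is a simplified illustration rather than Google's actual matcher.

```python
import re
from itertools import product


def case_variants(ext: str) -> list[str]:
    """Return every upper/lower-case spelling of ext (8 for a 3-letter extension)."""
    return ["".join(chars) for chars in product(*((c.lower(), c.upper()) for c in ext))]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces, then join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, patterns: list[str]) -> bool:
    """True if any Disallow pattern matches the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# Same shape as the table above: /kaupalliset/*.<ext>$ for each case variant.
RULES = [
    f"/kaupalliset/*.{variant}$"
    for ext in ("jpg", "png", "gif")
    for variant in case_variants(ext)
]

print(len(RULES))
print(is_blocked("/kaupalliset/banner.JpG", RULES))   # hypothetical path
print(is_blocked("/kaupalliset/banner.jpeg", RULES))  # hypothetical path
```

Because of the trailing `$`, only paths that end in one of the listed extensions are matched; other extensions under `/kaupalliset/` and images elsewhere on the site are left crawlable.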

Other Records

Field	Value
sitemap	https://www.sjl.fi/sitemap.xml

Comments

Scraping is not allowed for training AI language models, or selling to AI companies
Amazon: used to improve/enable Alexa to answer questions
Anthropic/Claude: provides no documentation whether these are effective
Anthropic/Claude
Anthropic/Claude
ByteDance LLMs, including Doubao
ChatGPT crawler
ChatGPT plugins
Cohere: associated with Cohere's chatbot
Common Crawl
Diffbot: collects data to train LLMs
Facebook: crawls to improve language models
Google: Bard and Vertex AI generative APIs
ImagesiftBot: associated with a company that produces models for image generation
Meta
Omgilibot/webz.io: sells data for training LLMs
OpenAI Search
Perplexity AI
SuSea
Disable indexing of native ad images
Sitemap
