nvp.se
robots.txt

Robots Exclusion Standard data for nvp.se

Resource Scan

Scan Details

Site Domain	nvp.se
Base Domain	nvp.se
Scan Status	Ok
Last Scan	2024-04-27T20:59:49+00:00
Next Scan	2024-05-04T20:59:49+00:00
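
The gap between the two timestamps above can be checked with a few lines of Python. This is only a sketch: the function name scan_interval is hypothetical, and the report itself does not state how the next scan time is derived.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table.
LAST_SCAN = "2024-04-27T20:59:49+00:00"
NEXT_SCAN = "2024-05-04T20:59:49+00:00"


def scan_interval(last: str, nxt: str) -> timedelta:
    """Return the time between two ISO 8601 timestamps (hypothetical helper)."""
    return datetime.fromisoformat(nxt) - datetime.fromisoformat(last)


if __name__ == "__main__":
    print(scan_interval(LAST_SCAN, NEXT_SCAN).days)  # 7
```

For these two values the interval is exactly seven days, which is consistent with a weekly rescan.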

Last Scan

Scanned	2024-04-27T20:59:49+00:00
URL	https://nvp.se/robots.txt
Redirect	https://www.nvp.se/robots.txt
Redirect Domain	www.nvp.se
Redirect Base	nvp.se
Domain IPs	34.149.169.35
Redirect IPs	146.75.117.91, 2a04:4e42:9::347
Response IP	151.101.37.91
Found	Yes
Hash	4d4913fd24187cb0600eaae967188c26a2c9a990249732a648bd8fdb8bfb18c5
SimHash	625ff1444d74
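
A content hash like the one above is typically used to tell whether the file changed between scans. The sketch below assumes the 64-character Hash value is a SHA-256 digest of the response body; the report does not name the algorithm, so that is a guess based on the length alone. The function names are hypothetical, and the SimHash value is not covered.

```python
import hashlib

# Hash value copied from the Last Scan table.
REPORTED_HASH = "4d4913fd24187cb0600eaae967188c26a2c9a990249732a648bd8fdb8bfb18c5"


def body_fingerprint(body: bytes) -> str:
    """Hex SHA-256 digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """True if the body hashes to the reported value under the SHA-256 assumption."""
    return body_fingerprint(body) == reported.lower()


if __name__ == "__main__":
    # An empty body does not match the reported hash.
    print(matches_reported(b""))
```

To compare against the live file, the bytes returned by https://www.nvp.se/robots.txt would be passed to matches_reported; a mismatch means either the file has changed since the scan or the algorithm assumption is wrong.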

Groups

*

Rule	Path
Disallow	/sok/
Disallow	/kop/
Disallow	/logga-in
Disallow	/bn/id/*
Disallow	/foljer
Disallow	/api/*

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/
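
The groups listed in this section can be evaluated with a small matcher. The following is a minimal sketch, not the logic any particular crawler uses: it assumes "*" in a path matches any run of characters, that a trailing "$" would anchor the end of the path, and that a crawler is matched to a group by an exact, case-insensitive name lookup with "*" as the fallback. The names GROUPS, pattern_to_regex and is_allowed are hypothetical.

```python
import re

# Disallow rules transcribed from the Groups tables in this report.
GROUPS = {
    "*": ["/sok/", "/kop/", "/logga-in", "/bn/id/*", "/foljer", "/api/*"],
    "ccbot": ["/"],
    "chatgpt-user": ["/"],
    "gptbot": ["/"],
    "google-extended": ["/"],
    "omgilibot": ["/"],
    "omgili": ["/"],
    "facebookbot": ["/"],
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent: str, path: str) -> bool:
    """True if no Disallow rule in the selected group matches the path."""
    # Simplified group selection: exact lower-cased name, else the "*" group.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    # The report lists only Disallow rules, so any match means "blocked".
    return not any(pattern_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    print(is_allowed("Googlebot", "/"))           # True
    print(is_allowed("Googlebot", "/sok/villa"))  # False
    print(is_allowed("GPTBot", "/"))              # False
```

Because every rule here is a Disallow, the sketch does not need the longest-match tie-breaking between Allow and Disallow that a full implementation would require.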

Comments

Common Crawl robot; the resulting dataset is a major training corpus for many LLMs.
ChatGPT robot, used to improve the ChatGPT LLM.
ChatGPT robot, may be used to improve the ChatGPT LLM.
Robot used to improve Bard and Vertex AI LLMs.
webz.io robot; the resulting dataset can be, and is, purchased to train LLMs.
webz.io robot; the resulting dataset can be, and is, purchased to train LLMs.
FacebookBot crawls public web pages to improve LLMs for Facebook's speech recognition technology.
