fbievan.live
robots.txt

Robots Exclusion Standard data for fbievan.live

Resource Scan

Scan Details

Site Domain	fbievan.live
Base Domain	fbievan.live
Scan Status	Ok
Last Scan	2025-08-28T18:57:45+00:00
Next Scan	2025-09-11T18:57:45+00:00

Last Scan

Scanned	2025-08-28T18:57:45+00:00
URL	https://fbievan.live/robots.txt
Domain IPs	143.198.225.189
Response IP	143.198.225.189
Found	Yes
Hash	09336821db4cf62725cd14eca73b59631cf03395c8f281c67ac13819c4d17395
SimHash	70341f43c475

Groups

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

img2dataset

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

applebot

Rule	Path
Disallow	/

youbot

Rule	Path
Disallow	/

friendlycrawler

Rule	Path
Disallow	/

Comments

from https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
from https://github.com/healsdata/ai-training-opt-out
may not work, needs more research (see https://github.com/rom1504/img2dataset/issues/48)
AhrefsBot crawls for data for an "SEO Dataset"; one of their "products" based on this dataset is "AI Writing Tools"
from https://www.cyberciti.biz/web-developer/block-openai-bard-bing-ai-crawler-bots-using-robots-txt-file/
from https://netfuture.ch/2023/07/blocking-ai-crawlers-robots-txt-chatgpt/
from https://claytonerrington.com/blog/robots-and-ai/
from https://darkvisitors.com/
from https://imho.alex-kunz.com/2024/01/25/an-update-on-friendly-crawler/
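
The groups listed above each carry a single "Disallow /" rule. As a rough illustration of what that means for a crawler, the sketch below rebuilds an approximate robots.txt from those group names and checks it with Python's standard urllib.robotparser. The helper names build_robots_txt and is_allowed, and the agent name SomeOtherBot, are hypothetical. The real file at https://fbievan.live/robots.txt may be laid out differently (the scan shows names in lowercase and does not show the original grouping), and actual crawlers may match user-agent tokens differently from this parser, which uses a case-insensitive substring match.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens copied from the Groups section of this scan.
BLOCKED_AGENTS = [
    "ccbot", "chatgpt-user", "gptbot", "google-extended", "anthropic-ai",
    "omgilibot", "omgili", "facebookbot", "bytespider", "img2dataset",
    "claude-web", "magpie-crawler", "ahrefsbot", "perplexitybot",
    "cohere-ai", "amazonbot", "applebot", "youbot", "friendlycrawler",
]


def build_robots_txt(agents):
    """Render one 'Disallow: /' group per user-agent token (assumed layout)."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(BLOCKED_AGENTS)

# A listed agent is blocked from the whole site.
print(is_allowed(robots_txt, "GPTBot", "https://fbievan.live/"))

# An unlisted agent is not restricted, because the scan shows no "*" group.
print(is_allowed(robots_txt, "SomeOtherBot", "https://fbievan.live/"))
```

Since the scan lists no wildcard group, any crawler not named here is unrestricted by this file. Compliance is also voluntary: robots.txt only states a request, which matches the caution in the img2dataset comment above.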
