firefly.world
robots.txt

Robots Exclusion Standard data for firefly.world

Resource Scan

Scan Details

Site Domain firefly.world
Base Domain firefly.world
Scan Status Ok
Last Scan 2025-08-26T03:14:58+00:00
Next Scan 2025-09-25T03:14:58+00:00

Last Scan

Scanned 2025-08-26T03:14:58+00:00
URL https://www.firefly.world/robots.txt
Domain IPs 9.141.186.126
Response IP 9.141.186.126
Found Yes
Hash 4ee0be00840a70a5b8d55ad018bc920425133d6b88846c16a7db2c89ed43f421
SimHash b8929d0bc174

Groups

feeddemon

Rule Path
Disallow /

bot/0.1 (bot for jce)

Rule Path
Disallow /

crawldaddy

Rule Path
Disallow /

java

Rule Path
Disallow /

jullo

Rule Path
Disallow /

feedly

Rule Path
Disallow /

universalfeedparser

Rule Path
Disallow /

apachebench

Rule Path
Disallow /

swiftbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

zmeu phpmyadmin

Rule Path
Disallow /

winhttp

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

httpclient

Rule Path
Disallow /

microsoft url control

Rule Path
Disallow /

yyspider

Rule Path
Disallow /

jaunty

Rule Path
Disallow /

obot

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

indy library

Rule Path
Disallow /

flightdeckreports bot

Rule Path
Disallow /

linguee bot

Rule Path
Disallow /

*

No rules defined. All paths allowed.

Other Records

Field Value
sitemap https://www.firefly.world/sitemap.xml
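The behavior recorded above can be checked with Python's standard-library robots.txt parser. This is a minimal sketch that reconstructs only two of the listed groups (feedly and YandexBot) plus the sitemap record; the actual file lists every agent shown above:

```python
from urllib.robotparser import RobotFileParser

# Minimal reconstruction of the scanned file: each listed group
# disallows its user agent from the whole site; "*" has no rules,
# so unlisted agents are allowed everywhere.
robots_txt = """\
User-agent: feedly
Disallow: /

User-agent: YandexBot
Disallow: /

Sitemap: https://www.firefly.world/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("feedly", "https://www.firefly.world/"))        # False
print(parser.can_fetch("SomeOtherBot", "https://www.firefly.world/"))  # True
print(parser.site_maps())
```

Because the `*` group defines no rules, any agent not explicitly named falls through to the default-allow behavior, matching the "All paths allowed" entry in the scan.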

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • Custom crawlers disable
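The root-placement rule in the comments above means the robots.txt URL is derived from a page's scheme and host alone, ignoring any path. A small sketch using only the standard library (the function name `robots_url` is illustrative, not from the source):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the root robots.txt URL for the host serving page_url.

    Crawlers only honor the file at the host root, so the path
    component of page_url is discarded entirely.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/deep/page.html"))
# -> http://example.com/robots.txt
```

This is why `http://example.com/site/robots.txt` in the comments is ignored: no page URL ever maps to it.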