starbeamrainbowlabs.com
robots.txt

Robots Exclusion Standard data for starbeamrainbowlabs.com

Resource Scan

Scan Details

Site Domain starbeamrainbowlabs.com
Base Domain starbeamrainbowlabs.com
Scan Status Ok
Last Scan 2024-09-24T06:45:35+00:00
Next Scan 2024-10-08T06:45:35+00:00

Last Scan

Scanned 2024-09-24T06:45:35+00:00
URL https://starbeamrainbowlabs.com/robots.txt
Domain IPs 2001:41d0:e:74b::1, 5.196.73.75
Response IP 5.196.73.75
Found Yes
Hash 2f32b623ecc451cd721983410f11a6b674d96f5920e5a7afa05a9c8a5efcd1e2
SimHash a21ed951ccd1
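
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the scan does not name the algorithm. Assuming it is SHA-256 over the raw response body, a check against a freshly downloaded copy could be sketched as follows. The function names are hypothetical, and the SimHash value is not covered here because its exact scheme is not stated.

```python
import hashlib

# Hash value copied from the scan details above.
REPORTED_HASH = "2f32b623ecc451cd721983410f11a6b674d96f5920e5a7afa05a9c8a5efcd1e2"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a robots.txt body against the hash listed in the scan.

    Assumption: the scanner hashed the unmodified response body with
    SHA-256. If it normalised line endings or used another algorithm,
    this comparison will simply return False.
    """
    return sha256_hex(body) == reported.lower()
```

A mismatch does not necessarily mean the scan is wrong: the file may have changed since the last scan date shown above.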

Groups

*

Rule Path
Disallow /flowcharts/
Disallow /original.7z
Disallow /gulpfile.js

Other Records

Field Value
crawl-delay 5

google

Rule Path
Disallow /stealthpackager.php

baiduspider
baiduspider-video
baiduspider-image
baiduspider-render

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
Disallow /

ccbot

Rule Path
Disallow /

googleother

Rule Path
Disallow /

Other Records

Field Value
sitemap https://starbeamrainbowlabs.com/sitemap.xml
sitemap https://starbeamrainbowlabs.com/blog/feed.php
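
To see how the groups above would be applied by a crawler, the sketch below feeds a reconstruction of the rules to Python's standard `urllib.robotparser`. The robots.txt text is rebuilt from the tables in this scan, so its layout and casing are assumptions, and `ExampleBot` is a hypothetical user agent used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups tables above. The real file's layout,
# casing, and comment placement are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flowcharts/
Disallow: /original.7z
Disallow: /gulpfile.js
Crawl-delay: 5

User-agent: google
Disallow: /stealthpackager.php

User-agent: baiduspider
User-agent: baiduspider-video
User-agent: baiduspider-image
User-agent: baiduspider-render
Disallow: /

User-agent: gptbot
Disallow: /
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: googleother
Disallow: /

Sitemap: https://starbeamrainbowlabs.com/sitemap.xml
Sitemap: https://starbeamrainbowlabs.com/blog/feed.php
"""

BASE = "https://starbeamrainbowlabs.com"


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on the scanned site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    parser = build_parser()
    checks = [
        ("ExampleBot", "/blog/"),
        ("ExampleBot", "/flowcharts/a.png"),
        ("Googlebot", "/stealthpackager.php"),
        ("GPTBot", "/"),
        ("Baiduspider-image", "/"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
    print("crawl-delay for ExampleBot:", parser.crawl_delay("ExampleBot"))
    print("sitemaps:", parser.site_maps())
```

One caveat: `urllib.robotparser` matches user agent tokens by substring and uses the first matching group, so an agent such as `GoogleOther` can be matched by the earlier `google` group rather than its own. Other parsers may pick the most specific group instead, so results for that agent can differ between implementations.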

Comments

  • starbeamrainbowlabs.com robot rules
  • If you are a robot, you MUST obey the rules in this file.
  • If you don't, you risk being blocked.
  • If you want to crawl in places that this file forbids, please contact
  • webmaster@starbeamrainbowlabs.com.
  • All Robots
  • Please don't kill the server :(
  • Contact webmaster@starbeamrainbowlabs.com if you want to crawl faster -
  • a specific time for your crawling can be arranged.
  • Sitemap location
  • Generated using a simple node script - not always completely up to date
  • You are best off recursing and crawling yourself - just don't confuse
  • yourself with stealthpackager.php
  • If you do want the sitemap updated, please just ask.
  • Suggestions of good sitemap generation tools are most welcome.
  • Don't crawl the random design phase files, please
  • Legitimate bot override rules
  • To make sure that legitimate bots don't go crawling all over
  • stealthpackager.php, disallow rules are added here. If you want your bot
  • added, contact me at webmaster@starbeamrainbowlabs.com.
  • Just for the record, stealthpackager.php is a honeypot (shh, don't go telling
  • anyone!).
  • Illegitimate bot override rules
  • If you have been added here, then it means that I have some kind of problem
  • with the way you are crawling. Usually I will leave a note next to each
  • blocked bot with details of the problem(s). If you want to be removed from
  • this list, please email bugs@starbeamrainbowlabs.com and I will consider it.
  • Usually you will be allowed a trial run with your new (updated) bot before
  • you are allowed to crawl properly.
  • IF YOU CONTINUE TO CRAWL WHILST BLOCKED, YOUR IP ADDRESS(ES) WILL BE
  • BANNED INDEFINITELY.
  • The baidu spider is broken and crawls way too much
  • We get odd referrers from Baidu
  • Baidu's IP Address blocks I've observed are now blocked in my firewall because
  • they haven't stopped. If you've sorted your crawler out to be more polite and
  • would like to be unblocked, please let me know :-)
  • Scraping my content and using it to train an AI is not permitted. Reason: My
  • content is CC-BY-SA, meaning you have to cite your sources. No AI models I
  • know do this.
  • ChatGPT and other large language models that do not have altruistic goals are
  • not permitted to crawl this site, see also GPTBot.
  • Google is not permitted to use content from this website for research and
  • development.
  • Ref https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers#google-extended
  • IP Address Blocklist
  • IP Addresses (or ranges) in this list have been blocked at IP level, meaning
  • that they are no longer allowed to make connections to the site.
  • Note that this list is not complete - additional bots and ranges are blocked
  • too, but are not specified here. If you are interested in the full list of
  • ranges that are blocked, please get in touch (and be prepared to verify your
  • identity).
  • Baidu Spider - Sorry, you weren't taking notice of the above.
  • Megaindex.ru - Suspicious bot, with no clear information about what it does

Warnings

  • `google-extended` is not a known field.