globalskywatch.com
robots.txt

Robots Exclusion Standard data for globalskywatch.com

Resource Scan

Scan Details

Site Domain globalskywatch.com
Base Domain globalskywatch.com
Scan Status Ok
Last Scan 2024-11-14T20:01:43+00:00
Next Scan 2024-11-21T20:01:43+00:00

Last Scan

Scanned 2024-11-14T20:01:43+00:00
URL http://globalskywatch.com/robots.txt
Domain IPs 38.154.153.250
Response IP 38.154.153.250
Found Yes
Hash 87896adea2d21e90738034f8e2544106a9f6f19f47482fedbb2416e5d3eb7602
SimHash 8c70ef314eeb
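
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint for change detection between scans could be computed as below. The function name robots_fingerprint and the sample bytes are illustrative only; they are not the scanner's implementation and not the actual file contents.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


# Illustrative bytes only, not the real globalskywatch.com robots.txt.
sample = b"User-agent: *\nDisallow: /\n"
digest = robots_fingerprint(sample)

# Any edit to the file, even inside a comment, yields a different digest.
changed = robots_fingerprint(sample + b"# edited\n") != digest

print(len(digest), changed)
```

An exact hash like this changes on any byte-level edit. The SimHash field presumably serves the complementary purpose of flagging near-duplicate content, but its computation is not described in the report.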

Groups

apple-pubsub
asterias
baiduspider
baiduspider-image
baiduspider-video
baiduspider-news
baiduspider-favo
baiduspider-cpro
baiduspider-ads
bingbot
buzzbot
dotbot
dotbot/1.1
duckduckbot
duckduckbot/1.0
duckduckbot/1.1
exabot
facebookexternalhit
facebot
gigabot
gigablastopensource
googlebot
googlebot-image
googlebot-mobile
googlebot-video
ia_archiver
irlbot
jobboersebot
kraken-crawler/*
megaindex.ru
megaindex.ru/*
mj12bot
mj12bot/*
msnbot
okhttp
psbot
robozilla
scoutjet
simplepie
sitelockspider
slurp
sogou pic spider
sogou head spider
sogou web spider
sogou orion spider
teoma
twiceler
twitterbot
yahoo-mmcrawler
yahoo-blogs/v3.9
yandex

Rule Path
Disallow /cgi-bin/

Other Records

Field Value
crawl-delay 30

*

Rule Path
Disallow /
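
The effect of the two groups above can be checked with Python's standard urllib.robotparser module. The following is a minimal sketch, not the scanner's own logic: it rebuilds an abbreviated robots.txt from the parsed result (only three of the listed agents are included), and both ExampleBot and the /cgi-bin/test path are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the parsed groups; the real file lists many more agents.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: yandex
Disallow: /cgi-bin/
Crawl-delay: 30

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "http://globalskywatch.com"

# A listed agent may fetch everything except /cgi-bin/.
listed_home = parser.can_fetch("googlebot", BASE + "/")
listed_cgi = parser.can_fetch("googlebot", BASE + "/cgi-bin/test")  # hypothetical path

# An unlisted agent falls through to the "*" group and is blocked entirely.
unlisted_home = parser.can_fetch("ExampleBot", BASE + "/")

# The crawl-delay record applies to the listed agents.
listed_delay = parser.crawl_delay("googlebot")

print(listed_home, listed_cgi, unlisted_home, listed_delay)
```

Under these assumptions the listed crawlers are allowed everywhere outside /cgi-bin/ with a 30 second delay, while every other crawler is excluded from the whole site.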

Comments

  • Disallow: /gallery/
  • Address the big bots specifically.
  • Crawl Delay generally not recommended because of robot efficiency.
  • User-agent: Googlebot
  • Disallow: /gallery/
  • User-agent: bingbot
  • Disallow: /gallery/
  • Crawl-Delay: 5
  • User-agent: MSNBot
  • Disallow: /gallery/
  • Crawl-Delay: 5
  • User-agent: Slurp
  • Disallow: /gallery/
  • Crawl-Delay: 5
  • This user agent supports a maximum crawl delay of 20 seconds.
  • User-agent: XoviBot
  • Disallow: /gallery/
  • Crawl-Delay: 10
  • Special Cases
  • Cuil gets stuck in Gallery's checkout, so keep it out of Gallery altogether.
  • User-agent: twiceler
  • Disallow: /gallery/
  • Slow Yandex way down. It's a worthless Russian search bot.
  • User-agent: Yandex
  • Crawl-Delay: 30
  • Slow MJ12bot down because it's worthless to us.
  • User-agent: MJ12bot
  • Crawl-Delay: 30
  • Magpie-Crawler is killing our server 8/1/2014.
  • User-agent: magpie-crawler
  • Disallow: /
  • Block a bunch of other bad bots.
  • User-agent: Rogerbot
  • Disallow: /
  • User-agent: Exabot
  • Disallow: /
  • User-agent: Dotbot
  • Disallow: /
  • User-agent: Gigabot
  • Disallow: /
  • User-agent: AhrefsBot
  • Disallow: /
  • User-agent: BlackWidow
  • Disallow: /
  • User-agent: Bot\ mailto:craftbot@yahoo.com
  • Disallow: /
  • User-agent: ChinaClaw
  • Disallow: /
  • User-agent: Custo
  • Disallow: /
  • User-agent: DISCo
  • Disallow: /
  • User-agent: Download\ Demon
  • Disallow: /
  • User-agent: eCatch
  • Disallow: /
  • User-agent: EirGrabber
  • Disallow: /
  • User-agent: EmailSiphon
  • Disallow: /
  • User-agent: EmailWolf
  • Disallow: /
  • User-agent: Express\ WebPictures
  • Disallow: /
  • User-agent: ExtractorPro
  • Disallow: /
  • User-agent: EyeNetIE
  • Disallow: /
  • User-agent: FlashGet
  • Disallow: /
  • User-agent: GetRight
  • Disallow: /
  • User-agent: GetWeb!
  • Disallow: /
  • User-agent: Go!Zilla
  • Disallow: /
  • User-agent: Go-Ahead-Got-It
  • Disallow: /
  • User-agent: GrabNet
  • Disallow: /
  • User-agent: Grafula
  • Disallow: /
  • User-agent: HMView
  • Disallow: /
  • User-agent: HTTrack
  • Disallow: /
  • User-agent: Image\ Stripper
  • Disallow: /
  • User-agent: Image\ Sucker
  • Disallow: /
  • Allow throttled Gallery for all other engines.
  • User-agent: *
  • Disallow: /gallery/
  • Crawl-delay: 20
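
Every directive in the list above was reported as a comment, which would explain why rules such as Disallow: /gallery/ do not appear in any parsed group. As a rough sketch of that distinction, the helper below separates active directives from comment text. The function name split_robots_lines and the layout of the excerpt are assumptions made for illustration; the original file is not reproduced in this report.

```python
def split_robots_lines(text: str) -> tuple[list[str], list[str]]:
    """Split robots.txt text into active directives and comment text."""
    active, comments = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Everything after the first "#" on a line is a comment.
        directive, _, comment = line.partition("#")
        if directive.strip():
            active.append(directive.strip())
        if comment.strip():
            comments.append(comment.strip())
    return active, comments


# Hypothetical excerpt: commented-out directives followed by a live group.
EXCERPT = """\
# Slow MJ12bot down because it's worthless to us.
# User-agent: MJ12bot
# Crawl-Delay: 30
User-agent: *
Disallow: /
"""

active, comments = split_robots_lines(EXCERPT)
print(active)
print(comments)
```

A crawler only ever sees the first list, so a commented-out Crawl-Delay or Disallow line has no effect until the leading "#" is removed.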