account.idahostatesman.com
robots.txt

Robots Exclusion Standard data for account.idahostatesman.com

Resource Scan

Scan Details

Site Domain account.idahostatesman.com
Base Domain idahostatesman.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Request timed out.
Last Scan 2024-06-15T02:29:20+00:00
Next Scan 2024-09-13T02:29:20+00:00

Last Successful Scan

Scanned 2021-10-19T12:26:15+00:00
URL https://account.idahostatesman.com/robots.txt
Found Yes
Hash 16be9cd74846544b23aff7df943f9f7f628d0b6eccafcabf91a05eecae8dad38
SimHash 6843f2af477c

Groups

mediapartners-google

Rule Path
Disallow

yandex

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

baiduspider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

genieo

Rule Path
Disallow /

ecoresearch

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

blp_bbot

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

a6-indexer

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

Comments

  • Global allow for Mediapartners - this is used by Google to place ads in content, not indexing purposes.
  • Temporary throttles to decrease load during launch
  • Temporary blocks of uninteresting bots
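
To illustrate how the groups above would behave for a crawler, the following is a minimal sketch that rebuilds a plausible robots.txt from the listed records and queries it with Python's standard urllib.robotparser module. The file text is a hypothetical reconstruction: the original file's exact wording, casing, ordering, and comment placement are not shown in this report, so the text will not match the recorded hash. The test URL path is also an assumption made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction based only on the groups listed in the report.
# The real file's exact text and ordering are unknown.
ROBOTS_TXT = """\
User-agent: mediapartners-google
Disallow:

User-agent: yandex
Crawl-delay: 3

User-agent: baiduspider
Crawl-delay: 3

User-agent: bingbot
Crawl-delay: 2

User-agent: slurp
Crawl-delay: 2

User-agent: genieo
Disallow: /

User-agent: ecoresearch
Disallow: /

User-agent: spinn3r
Disallow: /

User-agent: blp_bbot
Disallow: /

User-agent: r6_commentreader
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: easouspider
Disallow: /

User-agent: a6-indexer
Disallow: /

User-agent: magpie-crawler
Disallow: /
"""

# Example URL on the scanned host; the path is made up for this sketch.
TEST_URL = "https://account.idahostatesman.com/some/page"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# An empty Disallow value means nothing is disallowed for that agent.
print(parser.can_fetch("mediapartners-google", TEST_URL))  # True
# "Disallow: /" blocks every path for the named bot.
print(parser.can_fetch("genieo", TEST_URL))                # False
# Throttled agents have no path rules, only a crawl delay in seconds.
print(parser.can_fetch("bingbot", TEST_URL))               # True
print(parser.crawl_delay("yandex"))                        # 3
print(parser.crawl_delay("bingbot"))                       # 2
# Agents without a matching group are allowed, with no delay recorded.
print(parser.can_fetch("googlebot", TEST_URL))             # True
print(parser.crawl_delay("googlebot"))                     # None
```

Because the report lists no wildcard (`*`) group, any crawler not named in one of the groups falls through to the default of full access in this sketch.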