spanglefish.com
robots.txt

Robots Exclusion Standard data for spanglefish.com

Resource Scan

Scan Details

Site Domain spanglefish.com
Base Domain spanglefish.com
Scan Status Ok
Last Scan 2024-10-27T12:26:39+00:00
Next Scan 2024-11-03T12:26:39+00:00

Last Scan

Scanned 2024-10-27T12:26:39+00:00
URL https://spanglefish.com/robots.txt
Domain IPs 54.73.18.6
Response IP 54.73.18.6
Found Yes
Hash 1d153fb994ead94a82c8921f95164fabc760f8f38202ad40f649cb99c9a997e6
SimHash 4b16d81252a5

Groups

*

Rule Path
Disallow /*.pdf$
Disallow /*.doc$
Disallow /*.docx$
Disallow /*.xls$
Disallow /*.xlsx$
Disallow /*.ppt$
Disallow /*.pptx$
Disallow /*.rtf$
Disallow /*.mp3$
Disallow /*/documents/
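
The rules in this group use the `*` wildcard and the `$` end-of-path anchor, which Python's standard `urllib.robotparser` does not interpret. As a minimal sketch of how such patterns are commonly evaluated (assuming the widely used convention that `*` matches any run of characters and a trailing `$` anchors the end of the URL path), the paths above could be checked as follows. The names `rule_to_regex` and `is_disallowed` are hypothetical helpers, not part of any library.

```python
import re

# Disallow paths for the "*" group, copied from the listing above.
DISALLOW_PATTERNS = [
    "/*.pdf$", "/*.doc$", "/*.docx$", "/*.xls$", "/*.xlsx$",
    "/*.ppt$", "/*.pptx$", "/*.rtf$", "/*.mp3$", "/*/documents/",
]

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the match at the end of the path; otherwise the rule
    is a prefix match.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    # Escape literal pieces, then join them with '.*' where '*' stood.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_disallowed(path: str) -> bool:
    """Return True if any pattern in the '*' group matches the path."""
    return any(rule_to_regex(p).match(path) for p in DISALLOW_PATTERNS)

print(is_disallowed("/files/report.pdf"))       # True
print(is_disallowed("/files/report.pdf?x=1"))   # False
print(is_disallowed("/mysite/documents/a.txt")) # True
print(is_disallowed("/index.html"))             # False
```

Under this reading, `/*/documents/` only matches when at least one path segment precedes `documents/`, so a top-level `/documents/` path is not covered by the rule.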

googlebot-image

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

googlebot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

applebot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

geedoproductsearch

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

serpstatbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

coccocbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

sogou web spider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

blexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
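
Groups that carry only a crawl-delay record can be read with Python's standard `urllib.robotparser`. The sketch below is illustrative: the `ROBOTS_FRAGMENT` text is a reconstruction of a few groups from this report, not the verbatim file served at the URL above, and its exact layout is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of a few groups from the report; the real
# file's ordering and formatting may differ.
ROBOTS_FRAGMENT = """\
User-agent: googlebot
Crawl-delay: 1

User-agent: bingbot
Crawl-delay: 3

User-agent: blexbot
Crawl-delay: 20

User-agent: dotbot
Disallow: /
"""

rp = RobotFileParser()
# parse() takes an iterable of lines, so no network fetch is needed.
rp.parse(ROBOTS_FRAGMENT.splitlines())

for agent in ("googlebot", "bingbot", "blexbot", "dotbot"):
    delay = rp.crawl_delay(agent)        # None when no delay is set
    allowed = rp.can_fetch(agent, "/")   # False for fully disallowed agents
    print(f"{agent}: crawl-delay={delay}, may fetch '/': {allowed}")
```

A crawler honouring these records would wait the returned number of seconds between requests. Crawl-delay is not part of the core Robots Exclusion Protocol, so support varies between crawlers.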

Comments

  • robots.txt file to dissuade over-enthusiastic crawling.
  • copied to all sites 15/7/24
  • dotbot isn't following delay so moved to disallow
  • disallow
  • delay