Robots Exclusion Standard data for planet.com

Resource Scan

Scan Details

Site Domain planet.com
Base Domain planet.com
Scan Status Ok
Last Scan 2025-08-27T19:50:07+00:00
Next Scan 2025-09-26T19:50:07+00:00

Last Scan

Scanned 2025-08-27T19:50:07+00:00
URL https://planet.com/robots.txt
Redirect https://www.planet.com:443/robots.txt
Redirect Domain www.planet.com
Redirect Base planet.com
Domain IPs 34.120.196.216
Redirect IPs 34.120.196.216
Response IP 34.120.196.216
Found Yes
Hash c8ff4189a517f2a3038d09d63bc3c72e236a0eb6c1e9c71e69e290e08e67e2c9
SimHash 5c01917182a4

Groups

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

openai

Rule Path
Disallow /

perplexity-user

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twitterbot

Rule Path
Disallow /

yandexadditional

Rule Path
Disallow /

yandexadditionalbot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

*

Rule Path
Allow /
Disallow /search
Disallow /search/*
Disallow /marketplace/*
Disallow /ignite25
Disallow /ignite25/
Disallow /products-v2a/
Disallow /products-v2b/
Disallow /admin
Disallow /api/internal
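
The `*` group above mixes a blanket `Allow /` with narrower `Disallow` rules, so the outcome for a given path depends on rule precedence. The following is a minimal sketch of the longest-match evaluation described in RFC 9309, applied to the rules listed in this report. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and chosen for illustration; this is not the scanner's own logic, and individual crawlers may resolve precedence differently. It covers only the `*` group; the named groups earlier in this report (for example `gptbot`) disallow everything.

```python
import re

# Rules copied from the `*` group in this report (order as listed).
RULES = [
    ("allow", "/"),
    ("disallow", "/search"),
    ("disallow", "/search/*"),
    ("disallow", "/marketplace/*"),
    ("disallow", "/ignite25"),
    ("disallow", "/ignite25/"),
    ("disallow", "/products-v2a/"),
    ("disallow", "/products-v2b/"),
    ("disallow", "/admin"),
    ("disallow", "/api/internal"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    Without `$`, the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence.

    Assumption: the longest matching pattern wins, and an allow rule
    wins a tie. A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for verdict, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and verdict == "allow"):
                best_len, allowed = length, verdict == "allow"
    return allowed


if __name__ == "__main__":
    for path in ["/", "/search", "/marketplace", "/marketplace/item", "/admin/login"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, `/search` and `/admin/login` come out blocked because the longer `Disallow` patterns outrank `Allow /`, while `/marketplace` without a trailing segment stays allowed, since only `/marketplace/*` is listed.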

Other Records

Field Value
crawl-delay 1
sitemap https://www.planet.com/sitemap-index.xml

Comments

  • Bandwidth saving measures
  • Allow all other crawlers but with specific restrictions
  • Sitemap and host information

Warnings

  • `host` is not a known field.