Robots Exclusion Standard data for metarcentral.com

Resource Scan

Scan Details

Site Domain metarcentral.com
Base Domain metarcentral.com
Scan Status Ok
Last Scan 2026-03-04T09:25:18+00:00
Next Scan 2026-03-11T09:25:18+00:00

Last Scan

Scanned 2026-03-04T09:25:18+00:00
URL https://metarcentral.com/robots.txt
Domain IPs 116.203.221.102
Response IP 116.203.221.102
Found Yes
Hash 88e1d444d1315ba2ba0e5035fb3d43998dd5550a1cafff42638f7c7422aed850
SimHash 720b7a3162a8

Groups

*

Rule Path
Allow /
Allow /airport/
Allow /airports/
Allow /learn/
Allow /calculator/
Allow /region/
Allow /country/
Allow /about
Allow /privacy
Allow /disclaimer
Allow /fir/
Allow /aircraft/
Allow /sitemap*.xml
Disallow /api/
Disallow /admin/
Disallow /scripts/
Disallow /includes/
Disallow /setup/
Disallow /vendor/
Disallow /cache/
Disallow /public/
Disallow /docs/
Disallow /sql/
Disallow /tests/
Disallow /*.php$
Disallow /*.log$
Disallow /*.sql$
Disallow /display/
Disallow /display?
Disallow /airport/*?metar=
Disallow /*?metar=
Disallow /airport/*/historical?
Disallow /weather/*/historical?
Disallow /*?date=
Disallow /*?time=
Disallow /*?lang=
Disallow /*?*&*&*
Disallow /*?print=
Disallow /*?mobile=
Disallow /*?format=
Disallow /*?debug=
Disallow /*?preview=
Disallow /airport/*/taf?
Disallow /airport/*/charts?
Disallow /airport/*/notam?
Disallow /weather/metar/
Disallow /weather/taf/
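
The rules above mix plain prefixes with `*` wildcards and `$` end anchors, so it may help to see how a crawler could evaluate them. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names `pattern_to_regex`, `is_allowed`, and `RULES` are hypothetical, and `RULES` holds only a subset of the group listed above. Real crawlers may differ in details such as percent-encoding.

```python
import re

# A subset of the "*" group from this report, as (rule, path pattern) pairs.
RULES = [
    ("allow", "/"),
    ("allow", "/airport/"),
    ("allow", "/learn/"),
    ("disallow", "/api/"),
    ("disallow", "/*.php$"),
    ("disallow", "/airport/*?metar="),
    ("disallow", "/*?*&*&*"),
    ("disallow", "/airport/*/taf?"),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end
    of the URL, and everything else (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern decides; Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

print(is_allowed(RULES, "/airport/EDDF"))           # True
print(is_allowed(RULES, "/airport/EDDF?metar=1"))   # False
print(is_allowed(RULES, "/airport/EDDF/taf"))       # True (no "?" in the URL)
print(is_allowed(RULES, "/index.php"))              # False
```

Note that `Disallow /airport/*/taf?` only matches URLs that contain a literal `?` after `taf`, so under this reading the bare `/airport/EDDF/taf` page stays crawlable while its query-string variants do not.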

googlebot

Rule Path
Allow /airport/
Allow /learn/
Allow /calculator/

bingbot

Rule Path
Allow /airport/
Allow /learn/

Other Records

Field Value
crawl-delay 2

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

semrushbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

dotbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30
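
The crawl-delay records above are a non-standard extension (the field is not part of RFC 9309), and crawlers that honor it may interpret it differently. As a rough illustration only, the sketch below assumes the common reading "wait at least N seconds between successive requests" and computes when each request could start. `CRAWL_DELAY` and `fetch_offsets` are hypothetical names; the delay values are the ones listed in this report.

```python
# Crawl-delay values in seconds, as listed in this report.
CRAWL_DELAY = {
    "bingbot": 2,
    "yandexbot": 3,
    "ahrefsbot": 30,
    "semrushbot": 30,
    "dotbot": 30,
}

def fetch_offsets(agent: str, requests: int, default_delay: float = 0.0) -> list[float]:
    """Return the earliest start time (seconds from now) of each request.

    Assumes the delay is a minimum gap between successive requests.
    Agents without a crawl-delay record fall back to `default_delay`.
    """
    delay = CRAWL_DELAY.get(agent.lower(), default_delay)
    return [i * delay for i in range(requests)]

print(fetch_offsets("bingbot", 3))    # [0, 2, 4]
print(fetch_offsets("AhrefsBot", 3))  # [0, 30, 60]
```

Under this assumption, a crawler limited to one request every 30 seconds could fetch at most 2,880 URLs per day from the site.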

mj12bot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

gptbot

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

chatgpt-user

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

claude-web

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

anthropic-ai

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

perplexitybot

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

applebot-extended

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

cohere-ai

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/

google-extended

Rule Path
Allow /
Allow /llms.txt
Allow /llms-full.txt
Allow /.well-known/
Disallow /api/
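
Because several agents have their own group, it matters which group a given crawler follows. Under RFC 9309, a crawler obeys the group that matches its product token (case-insensitively) and falls back to `*` only when no such group exists; the groups are not merged. The sketch below illustrates that selection. `GROUPS` and `select_group` are hypothetical names, and the `*` entry is abbreviated from the listing above.

```python
# Groups as (rule, path) pairs, copied from this report ("*" is abbreviated).
GROUPS = {
    "*": [("allow", "/"), ("disallow", "/api/"), ("disallow", "/admin/")],
    "googlebot": [
        ("allow", "/airport/"),
        ("allow", "/learn/"),
        ("allow", "/calculator/"),
    ],
    "gptbot": [
        ("allow", "/"),
        ("allow", "/llms.txt"),
        ("allow", "/llms-full.txt"),
        ("allow", "/.well-known/"),
        ("disallow", "/api/"),
    ],
    "mj12bot": [("disallow", "/")],
}

def select_group(groups, product_token: str):
    """Return the rules a crawler with this product token would obey.

    A group named after the token takes precedence; "*" is used only
    as a fallback when no specific group exists.
    """
    return groups.get(product_token.lower(), groups.get("*", []))

googlebot_rules = select_group(GROUPS, "Googlebot")
print(any(kind == "disallow" for kind, _ in googlebot_rules))  # False
print(select_group(GROUPS, "SomeOtherBot") is GROUPS["*"])     # True
```

If the grouping in this report reflects the file accurately, the googlebot and bingbot groups contain only Allow rules, so under this reading the Disallow rules in the `*` group would not apply to those crawlers. The AI crawler groups avoid that gap by repeating `Disallow /api/` themselves.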

Other Records

Field Value
sitemap https://metarcentral.com/sitemap-index.xml

Comments

  • MetarCentral Aviation Weather - Robots.txt
  • Optimized for crawl budget and content quality
  • Last updated: 2026-02-16
  • === ALLOW: Quality Content ===
  • === DISALLOW: API & System Files ===
  • === DISALLOW: Display/Utility Pages ===
  • === DISALLOW: Query Parameters (Crawl Budget) ===
  • Nearby weather parameter creates duplicate content
  • Historical date parameters create infinite crawl paths
  • Language parameters (only English supported)
  • Multiple query parameters
  • Print/mobile/format variations
  • === DISALLOW: Low-Value Sub-Pages Without Weather ===
  • TAF/historical/NOTAM pages depend on weather data availability
  • Individual pages set noindex headers, but block crawling to save budget
  • === LEGACY URLs: Redirect to canonical ===
  • /weather/metar/ and /weather/taf/ redirect to /airport/ pages
  • === SEARCH ENGINE SPECIFIC RULES ===
  • Allow Googlebot reasonable crawl rate
  • === BLOCK: Aggressive SEO Crawlers ===
  • === AI CRAWLERS: Allow Content, Block API ===
  • Reference: https://metarcentral.com/llms.txt
  • === SITEMAP ===