flightaware.com
robots.txt

Robots Exclusion Standard data for flightaware.com

Resource Scan

Scan Details

Site Domain flightaware.com
Base Domain flightaware.com
Scan Status Ok
Last Scan 2024-10-28T23:52:30+00:00
Next Scan 2024-11-27T23:52:30+00:00
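
The gap between the two timestamps above can be checked with a little date arithmetic. This is a minimal sketch using only the values listed in the scan details; the variable names are illustrative, and the fixed rescan interval is inferred from this one pair of timestamps, not documented by the scanner.

```python
from datetime import datetime

# Timestamps copied from the scan details above (ISO 8601, UTC offset included).
last_scan = datetime.fromisoformat("2024-10-28T23:52:30+00:00")
next_scan = datetime.fromisoformat("2024-11-27T23:52:30+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)  # 30
```

For this record, the next scan is scheduled exactly 30 days after the last one.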

Last Scan

Scanned 2024-10-28T23:52:30+00:00
URL https://flightaware.com/robots.txt
Redirect https://www.flightaware.com/robots.txt
Redirect Domain www.flightaware.com
Redirect Base flightaware.com
Domain IPs 104.18.39.201, 172.64.148.55, 2606:4700:4400::6812:27c9, 2606:4700:4400::ac40:9437
Redirect IPs 104.18.39.201, 172.64.148.55, 2606:4700:4400::6812:27c9, 2606:4700:4400::ac40:9437
Response IP 104.18.39.201
Found Yes
Hash 68e760c86341551a9f93d65fb6cb9540d8a069416ed00ea4a2c6acbd371de925
SimHash 20d3595d4775

Groups

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

irvine

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

wget

Rule Path
Disallow /account/
Disallow /analysis/
Disallow /include/
Disallow /live/
Disallow /resources/
Disallow /map/
Disallow /me/
Disallow /photos/

*

Rule Path
Disallow /account/
Disallow /ajax/
Disallow /bait/
Disallow /errors/
Disallow /live/flight/id/
Disallow /mapi/
Disallow /me/
Disallow /photos/upload.rvt
Disallow /photos/upload
Disallow /mp/
Disallow /live/report.rvt
Disallow /adsb/register
Disallow /adsb/request
Disallow /adsb/flightfeeder/terms
Disallow /photos/crowdsource/
Disallow /photos/crowdsource.rvt
Disallow /commercial/flightxml/v3/

applebot

Rule Path
Disallow /ajax/

cliqzbot

Rule Path
Disallow /

twitterbot

Rule Path
Allow /news/
Allow /about/careers/

gptbot

Rule Path
Disallow /
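
As a rough illustration of how a client might evaluate the groups listed above, the sketch below feeds a small fragment to Python's standard `urllib.robotparser`. The fragment is reconstructed from this listing and is an assumption, not the verbatim file: the original group order and line layout are not shown here, and only a few groups and paths are included. The helper `allowed` and the agent name `SomeBot` are hypothetical. Other parsers may match agents and paths differently from Python's.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment (assumption): a subset of the groups listed above.
ROBOTS_FRAGMENT = """\
User-agent: gptbot
Disallow: /

User-agent: applebot
Disallow: /ajax/

User-agent: twitterbot
Allow: /news/
Allow: /about/careers/

User-agent: bingbot
Crawl-delay: 1

User-agent: *
Disallow: /account/
Disallow: /ajax/
Disallow: /live/flight/id/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: ask the parser whether `agent` may fetch `path`."""
    return parser.can_fetch(agent, "https://www.flightaware.com" + path)


# gptbot is blocked from the whole site.
print(allowed("GPTBot", "/live/"))              # False
# applebot has its own group, so only /ajax/ is blocked for it.
print(allowed("Applebot", "/ajax/map"))         # False
print(allowed("Applebot", "/account/"))         # True
# An agent with no group of its own falls back to the * group.
print(allowed("SomeBot", "/account/settings"))  # False
print(allowed("SomeBot", "/live/flight/UAL1"))  # True
# bingbot has no path rules, only a crawl delay.
print(parser.crawl_delay("bingbot"))            # 1
```

With this parser, an agent that has its own group is checked against that group alone. That is why `applebot` and `twitterbot` are not subject to the `*` rules in the sketch.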

Comments

  • robots.txt for www.flightaware.com hosted by yatyu.dal.flightaware.com
  • Specific unwanted clients
  • Command line recursive requests as well as automated fetching from the non-exportable data is not acceptable.
  • See:
  • https://www.flightaware.com/about/termsofuse
  • https://www.flightaware.com/commercial/flightxml/
  • General robot rules
  • Stop Applebot from beating the crap out of ajax endpoints (specifically the static flight map one)
  • Allow Twitter to grab article and careers blobs
  • Stop OpenAI/ChatGPT from training on our data

Warnings

  • 2 invalid lines.