postcrossing.com
robots.txt

Robots Exclusion Standard data for postcrossing.com

Resource Scan

Scan Details

Site Domain postcrossing.com
Base Domain postcrossing.com
Scan Status Ok
Last Scan 2024-09-21T08:38:16+00:00
Next Scan 2024-09-28T08:38:16+00:00

Last Scan

Scanned 2024-09-21T08:38:16+00:00
URL https://postcrossing.com/robots.txt
Redirect https://www.postcrossing.com/robots.txt
Redirect Domain www.postcrossing.com
Redirect Base postcrossing.com
Domain IPs 3.67.120.29
Redirect IPs 3.67.120.29
Response IP 3.67.120.29
Found Yes
Hash 0fd5ede64b25bde81eef3d05f9d4d3011cc26bac1efbf22fcfd67b7c79a71b6d
SimHash f61071d9a6f7
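
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a scanner could fingerprint each fetched robots.txt body and compare it with the previous scan to detect changes. The function name `content_fingerprint` and the two sample bodies are hypothetical, not taken from the scan.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body.

    Assumption: SHA-256, guessed only from the 64-character digest length.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two consecutive scans.
previous = content_fingerprint(b"User-agent: *\nDisallow: /send/pm/\n")
current = content_fingerprint(b"User-agent: *\nDisallow: /send/pm/\nAllow: /\n")

# Any byte-level difference yields a different digest.
changed = previous != current
```

An exact hash like this only reports that something changed; a similarity hash such as the SimHash field is usually kept alongside it to judge how much.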

Groups

*

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Allow /

googlebot-image

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/*
Disallow /user/*/gallery
Disallow /gallery
Disallow /country/*
Allow /

imagesiftbot

Rule Path
Disallow /

awariorssbot
awariosmartbot

Rule Path
Disallow /postcards/

mediapartners-google

Rule Path
Allow /

archive.org_bot

Rule Path
Disallow /user/*
Disallow /postcards/*
Disallow /gallery
Allow /

screaming frog seo spider

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

scrapybot

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

fast

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /postcards/
Disallow /user/
Disallow /gallery

applebot-extended

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

bytespider

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

ccbot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

claudebot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

diffbot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

facebookbot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

google-extended

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

gptbot

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

omgili

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

anthropic-ai

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

claude-web

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery

cohere-ai

Rule Path
Disallow /travelingpostcard/*
Disallow /user/*/traveling
Disallow /user/*/gallery/popular
Disallow /user/*/map
Disallow /send/pm/
Disallow /postcards/
Disallow /user/
Disallow /gallery
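
Many of the paths above use `*` as a wildcard (for example `/user/*/map`). The following is a minimal sketch of how such rules are commonly evaluated, assuming the longest-match convention described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The helper names `rule_to_regex` and `is_allowed` and the username `alice` are hypothetical, and individual crawlers may differ in edge cases.

```python
import re


def rule_to_regex(path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape each literal piece, then join the pieces with "any sequence".
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, url_path: str) -> bool:
    """Evaluate one group's (directive, path) pairs against a URL path."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, path in rules:
        if not path:
            continue  # an empty path matches nothing
        if rule_to_regex(path).match(url_path):
            allow = directive.lower() == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(path) > best_len or (len(path) == best_len and allow):
                best_len, allowed = len(path), allow
    return allowed


# The '*' group as listed in this report.
STAR_GROUP = [
    ("Disallow", "/travelingpostcard/*"),
    ("Disallow", "/user/*/traveling"),
    ("Disallow", "/user/*/gallery/popular"),
    ("Disallow", "/user/*/map"),
    ("Disallow", "/send/pm/"),
    ("Allow", "/"),
]

map_ok = is_allowed(STAR_GROUP, "/user/alice/map")          # blocked by /user/*/map
profile_ok = is_allowed(STAR_GROUP, "/user/alice")          # only "Allow /" matches
gallery_ok = is_allowed(STAR_GROUP, "/user/alice/gallery")  # only "Allow /" matches
```

Under this convention the trailing `Allow /` acts only as a fallback: it is the shortest pattern, so any matching Disallow rule overrides it.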

Comments

  • postcrossing.com robots.txt file
  • NOTE: Entries in robots.txt don't inherit from '*'. Or not all bots know how to anyway, hence the repetition
  • only the right user can open it, so stop doing 403's
  • Don't need the extra load
  • only the right user can open it, so stop doing 403's
  • extra
  • AdSense crawler
  • Wayback machine: don't overdue it
  • If you don't know how to behave, you are not welcome
  • Please respect our Terms of Service: spiders/scrappers are only allowed with explicit permission
  • below here is primarily from https://en.wikipedia.org/robots.txt
  • Some bots are known to be trouble, particularly those designed to copy
  • entire sites. Please obey robots.txt.
  • Misbehaving: requests much too fast:
  • Sorry, wget in its recursive mode is a frequent problem.
  • Please read the man page and use it properly; there is a
  • --wait option you can use to set the delay between hits,
  • for instance.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/
  • AI bots create needless extra load, so limiting to just basics
  • below is from https://darkvisitors.com/docs/robots-txt
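
The note above that entries "don't inherit from '*'" can be illustrated with Python's standard `urllib.robotparser`. This is a sketch only: the excerpt is trimmed to two groups from this report, the user agent `examplebot` is hypothetical, and `urllib.robotparser` uses plain prefix matching without `*` wildcards in paths, so only wildcard-free rules are included.

```python
from urllib.robotparser import RobotFileParser

# Trimmed excerpt: the '*' group (wildcard-free rules only) and one named group.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /send/pm/
Allow: /

User-agent: awariorssbot
Disallow: /postcards/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://www.postcrossing.com"

# A bot without its own group falls back to the '*' group.
generic_pm = parser.can_fetch("examplebot", BASE + "/send/pm/1")
generic_postcards = parser.can_fetch("examplebot", BASE + "/postcards/")

# awariorssbot has its own group, so none of the '*' rules apply to it.
awario_pm = parser.can_fetch("awariorssbot", BASE + "/send/pm/1")
awario_postcards = parser.can_fetch("awariorssbot", BASE + "/postcards/")
```

Here `awario_pm` is allowed even though `/send/pm/` is disallowed for `*`, which is why the file repeats the shared Disallow lines in every group that should keep them.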