koreanwar.org
robots.txt

Robots Exclusion Standard data for koreanwar.org

Resource Scan

Scan Details

Site Domain koreanwar.org
Base Domain koreanwar.org
Scan Status Ok
Last Scan 2024-09-03T20:56:09+00:00
Next Scan 2024-10-03T20:56:09+00:00

Last Scan

Scanned 2024-09-03T20:56:09+00:00
URL https://www.koreanwar.org/robots.txt
Domain IPs 97.77.214.34
Response IP 97.77.214.34
Found Yes
Hash c45e052120aefbc89bc1b415c4d67bf6a25b03d4ae38a78e56e0787daeed20a6
SimHash a05441b14587

Groups

bingbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /state-results/

amazonbot

Rule Path
Disallow /html/

amazonbot

Rule Path
Disallow /state/

amazonbot

Rule Path
Disallow /chart/

amazonbot

Rule Path
Disallow /kccf1/

amazonbot

Rule Path
Disallow /awards/

gptbot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

dataprovider

Rule Path
Disallow /

alphaseobot-sa

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

archive-org.com

Rule Path
Disallow /

barkrowler/0.9

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bluechipbacklinks

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

domainappender

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

linguee

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mojeekbot

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

vebidoobot

Rule Path
Disallow /

npbot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

wget

Rule Path
Disallow /

speedy

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

ezooms/1.0; ezooms.bot@gmail.com

Rule Path
Disallow /

red

Rule Path
Disallow /

spbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

hailoobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

discobot

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

daumoa

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

cuil.com

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

yandex

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

scrubby

Rule Path
Disallow /

robozilla

Rule Path
Disallow /

nutch

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-video

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

yeti

Rule Path
Disallow /

asterias

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

kalooga

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

teoma

Rule Path
Disallow /
Disallow /chart/
Disallow /google3/
Disallow /google_2009/
Disallow /GOOGLE_2009_utex/
Disallow /GOOGLE_2010/
Disallow /GOOGLE_2011/
Disallow /html/korean_war_project_remembrance_search_2.html?KCCF1.ST=*
Disallow /html/korean_war_project_remembrance.html?KCCF1__KEY=*
Disallow /html/korean_war_project_remembrance_search_2_2013.html
Disallow /html/korean_war_project_remembrance_search_2_2017.html
Disallow /html/korean_war_project_remembrance_search_6_2017.html?key=*
Disallow /html/korean_war_project_remembrance_search_6_2013.html
Disallow /html/korean_war_project_remembrance_search_6_2013.html?key=*
Disallow /html/korean_war_google_earth.html
Disallow /html/2011_2id_go_award_individual.html?key_2id=*
Disallow /html/2011_2id_go_award_by_unit.html?PageNum_Looking=*&unit1=*
Disallow /html/maps_marines.html
Disallow /html/finding_the_families.html?key=*
Disallow /html/2017_state_search_*
Disallow /ads.txt
Disallow /html/korean_war_databases.html
Disallow /html/history_and_reference.html?dept_key=*
Disallow /html/bookstore_book.html?bookstore_id=*
Disallow /html/usmc_korean_war_records_unit.html?pid=*
Disallow /html/korean_war_maps_results.html?id=*
Disallow /html/korean_war_maps_results_navy.html?id=*
Disallow /html/2011_2id_nara_records.html
Disallow /html/korean_war_project_remembrance_search_2_2013.html?KCCF1.ST=*
Disallow /html/korean_war_project_remembrance_search_2_2013.html?KCCF1.ST=*&alpha=*
Disallow /html/2017_state_search_*
Disallow /html/korean_war_veterans_memorial_s.html
Disallow /html/maps_army.html
Disallow /html/history_and_reference.html
Disallow /html/finding_the_families.html?KCCF1__Key=*
Disallow /html/2011_2id_go_award_by_unit.html?unit1=*
Disallow /html/korean_war_maps_results_l752.html?id=*
Disallow /html/usmc_korean_war_records.html
Disallow /html/heartbreak_ridge_-_text.html
Disallow /html/heartbreak_ridge.html
Disallow /html/chapter_one.html
Disallow /html/maps_utex.html
Disallow /html/panmunjom.html
Disallow /html/dmz_war.html
Disallow /html/korean_war_google_earth.html
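
The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser, not the scanner's own method. The EXCERPT text is a hand-written reconstruction of a single group from this listing (the original file's exact layout is not shown here), and "examplebot" is a hypothetical user agent that does not appear in the scan.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt reconstructed from the listing above (assumption:
# the real file uses the usual "User-agent:" / "Disallow:" line syntax).
EXCERPT = """\
User-agent: gptbot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(EXCERPT)

# gptbot is disallowed everywhere by "Disallow: /".
print(parser.can_fetch("gptbot", "https://www.koreanwar.org/"))  # False

# "examplebot" is hypothetical; with no matching group and no "*" group
# in the excerpt, the parser treats the URL as allowed.
print(parser.can_fetch("examplebot", "https://www.koreanwar.org/"))  # True
```

Note that urllib.robotparser answers from the first group whose name matches the user agent and does not merge repeated groups, so agents listed more than once above (such as amazonbot) may be evaluated differently by other parsers.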

Other Records

Field Value
sitemap https://www.koreanwar.org/sitemap.xml

Comments

  • robots.txt for https://www.koreanwar.org
  • robots.txt generated at http://www.mcanerin.com
  • http://www.mcanerin.com/EN/search-engine/robots-txt.asp
  • date: 09/02/2024 updated
  • add or remove both Bing and msn bots
  • Proprietary German backlinks service.
  • wget run recursively just wrecks server capacity
  • This spider's output isn't public.
  • Entireweb
  • Poorly behaved bot
  • Foreign-language bot
  • Russian image search engine
  • These bots are designed to duplicate entire sites.

Warnings

  • 2 invalid lines.