lvsun.com
robots.txt

Robots Exclusion Standard data for lvsun.com

Resource Scan

Scan Details

Site Domain lvsun.com
Base Domain lvsun.com
Scan Status Ok
Last Scan 2024-09-25T12:23:51+00:00
Next Scan 2024-10-02T12:23:51+00:00

Last Scan

Scanned 2024-09-25T12:23:51+00:00
URL https://lvsun.com/robots.txt
Redirect https://lasvegassun.com/robots.txt
Redirect Domain lasvegassun.com
Redirect Base lasvegassun.com
Domain IPs 104.18.16.197, 104.18.17.197, 2606:4700::6812:10c5, 2606:4700::6812:11c5
Redirect IPs 104.19.177.74, 104.19.178.74, 2606:4700::6813:b14a, 2606:4700::6813:b24a
Response IP 104.19.178.74
Found Yes
Hash 3303e6f3182fbb526b9205e4de0367b6998d893c688f7a4f1c4d8e3e2f41bf0f
SimHash c290d850e297

Groups

*

Rule Path
Disallow /restaurants/search/
Disallow *reminder/
Disallow *ufcsn
Disallow /%3A
Disallow /%3A/
Disallow /*rawhtml*
Disallow /702show*
Disallow /accounts*
Disallow /accounts/login*
Disallow /admin/
Disallow /blogs/robin-leachs-las-vegas-celebrity-watch*
Disallow /cgi-bin/
Disallow /comments*
Disallow /compare/
Disallow /compare/
Disallow /contact/
Disallow /content/
Disallow /dossier*
Disallow /events/search/?category=*
Disallow /events/search/
Disallow /feedback/
Disallow /fileadmin/
Disallow /flag/
Disallow /mailfriend*
Disallow /mailfriend/
Disallow /mma-sn/
Disallow /r/
Disallow /search/
Disallow /slideshow_xml/
Disallow /sun/dossier*
Disallow /sunbin*
Disallow /sunbin/
Disallow /ufc-sn/
Disallow /ufc-video-sn/
Disallow /users/
Disallow /wec-sn/
Disallow */cdn-cgi/l/email-protection*
Disallow */slideshow_xml/
Disallow */xml/
Disallow *inlines/
Disallow *cdn-cgi/
Disallow */drudged/
Disallow *charges-fly-among-las-vegas-nightclub-investors*
Disallow *scaled.0325_met_health03*
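
Several paths in the group above use the `*` wildcard. The following is a minimal sketch of how such patterns are commonly matched against a URL path, assuming the usual convention that `*` stands for any run of characters and a trailing `$` anchors the end of the path. The names `rule_matches`, `is_disallowed`, and `RULES` are hypothetical, and only four of the listed rules are copied in for illustration. Python's standard `urllib.robotparser` does not expand wildcards, which is why the sketch uses a small regex translation instead.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches from the start of path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces; each '*' becomes "any run of characters".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, like a prefix match.
    return re.match(regex, path) is not None

# A few rules copied from the '*' group above (illustrative subset only).
RULES = ["/search/", "/accounts*", "*reminder/", "/*rawhtml*"]

def is_disallowed(path: str) -> bool:
    """True if any Disallow pattern in RULES matches the given path."""
    return any(rule_matches(rule, path) for rule in RULES)

print(is_disallowed("/search/?q=x"))
print(is_disallowed("/events/reminder/"))
print(is_disallowed("/news/2024/"))
```

Because this group contains only Disallow rules, any single match is enough to block a path. A fuller matcher would also need percent-encoding normalization (relevant to entries such as `/%3A`) and longest-match precedence when Allow rules are present.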

directcrawler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

java/1.5.0_11

Rule Path
Disallow /

java/1.4.1_04

Rule Path
Disallow /

gsa-crawler

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

weborama-fetcher

Rule Path
Disallow /

krzana bot

Rule Path
Disallow /

clickagy intelligence bot v2

Rule Path
Disallow /

sputnikbot/2.3

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /*?
Disallow /*_t12*
Disallow /*_t18*
Disallow /*_t19*
Disallow /*_t20*
Disallow /*_t27*
Disallow /*_t30*
Disallow /*_t37*
Disallow /*_t60*
Disallow /*_t61*
Disallow /*_t65*
Disallow /*_t96*
Disallow /*_r45x*
Disallow /*_r50x*
Disallow /*_r90x*
Disallow /*_r60x*
Disallow /*_r100x*
Disallow /*_r104x*
Disallow /*_r180x*
Disallow /*_r340x*
Disallow /*_r415x*
Disallow /*_tx50*

twitterbot

Rule Path
Disallow
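
The twitterbot group above lists a Disallow rule with an empty path, which by convention blocks nothing, in contrast to the `Disallow /` groups that block everything. The sketch below checks that behavior with Python's standard `urllib.robotparser`. The `SAMPLE` text is an assumption: it is reconstructed from two of the groups in this report, not copied from the actual file, and the tested URL is a hypothetical page on the redirect domain.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from two groups in this report (assumed layout, not the real file).
SAMPLE = """\
User-agent: voilabot
Disallow: /

User-agent: twitterbot
Disallow:
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Hypothetical URL used only to exercise the rules.
url = "https://lasvegassun.com/news/"

twitter_ok = parser.can_fetch("twitterbot", url)  # empty Disallow: nothing blocked
voila_ok = parser.can_fetch("voilabot", url)      # "Disallow: /": everything blocked

print("twitterbot allowed:", twitter_ok)
print("voilabot allowed:", voila_ok)
```

This only exercises plain prefix rules; the wildcard patterns elsewhere in the report are outside what `urllib.robotparser` interprets.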

Other Records

Field Value
sitemap https://lasvegassun.com/sitemap.xml

Comments

  • Crawl-delay: 10
  • specific URLS to block
  • other user agents
  • User-agent: ia_archiver-web.archive.org
  • Disallow: /
  • images
  • added to try to force twitter to behave