nyp.org
robots.txt

Robots Exclusion Standard data for nyp.org

Resource Scan

Scan Details

Site Domain nyp.org
Base Domain nyp.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-05-17T13:37:55+00:00
Next Scan 2024-08-15T13:37:55+00:00

Last Successful Scan

Scanned 2023-04-01T01:15:01+00:00
URL https://www.nyp.org/robots.txt
Domain IPs 104.18.8.239, 104.18.9.239, 2606:4700::6812:8ef, 2606:4700::6812:9ef
Response IP 104.18.9.239
Found Yes
Hash 5d69b170dd9ee481ac1be852c7b320595f2b1137ecb77ca28d8f6f05a7bef957
SimHash 3116fd514264

Groups

*

Rule Path
Allow /core/*.css$
Allow /core/*.css?
Allow /core/*.js$
Allow /core/*.js?
Allow /core/*.gif
Allow /core/*.jpg
Allow /core/*.jpeg
Allow /core/*.png
Allow /core/*.svg
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /profiles/*.svg
Disallow /core/
Disallow /profiles/
Disallow /README.txt
Disallow /web.config
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /index.php/admin/
Disallow /index.php/comment/reply/
Disallow /index.php/filter/tips
Disallow /index.php/node/add/
Disallow /index.php/search/
Disallow /index.php/user/password/
Disallow /index.php/user/register/
Disallow /index.php/user/login/
Disallow /index.php/user/logout/
Disallow /node/*
Disallow /search-results
Disallow /publications/search
Disallow /healthlibrary/articles/document/document/*
Disallow /healthlibrary/diagnosis-short/document/document/*
Disallow /healthlibrary/diagnosis/document/document/*
Disallow /healthlibrary/multimedia/document/document/*
Disallow /healthlibrary/other-details/document/document/*
Disallow /healthlibrary/shorts/document/document/*
Disallow /healthlibrary/special/document/document/*
Disallow /healthlibrary/surgical-details/document/document/*
Disallow /healthlibrary/symptoms/document/document/*
Disallow /healthlibrary/tests/document/document/*
Disallow /healthlibraryajax/*
Disallow /taxonomy/*
Disallow /cs/Satellite
Disallow /library/*
Disallow /cancer/library/*
Disallow /digestive/library/*
Disallow /heart/library/*
Disallow /morganstanley/library/*
Disallow /neuroscience/library/*
Disallow /pediatrics/healthlibrary/*
Disallow /pediatrics/library/*
Disallow /rehabmed/library/*
Disallow /vascular/library/*
Disallow /womens/library/*
Disallow /rcollection
Disallow /healthlibrary/focus/electronic-medical-and-health-records
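
The Allow and Disallow rules above overlap (for example, Allow /core/*.css$ and Disallow /core/), so the outcome depends on how a crawler resolves conflicts. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309, with `*` matching any run of characters and a trailing `$` anchoring the end of the path. The helper names `pattern_to_regex` and `is_allowed` and the shortened `RULES` list are illustrative only; they are not part of the scanned file, and individual crawlers may resolve these rules differently.

```python
import re

def pattern_to_regex(pattern):
    # '*' matches any run of characters; a trailing '$' anchors the end of the path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(rules, path):
    # The longest matching pattern wins; on a tie, Allow is preferred.
    # A path that matches no rule is allowed.
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

# A small subset of the "*" group listed above (assumed to be representative).
RULES = [
    ("allow", "/core/*.css$"),
    ("allow", "/core/*.js?"),
    ("disallow", "/core/"),
    ("disallow", "/admin/"),
    ("disallow", "/node/*"),
]

print(is_allowed(RULES, "/core/misc/style.css"))  # True: the longer Allow beats Disallow /core/
print(is_allowed(RULES, "/core/install.php"))     # False: only Disallow /core/ matches
print(is_allowed(RULES, "/about"))                # True: no rule matches
```

Under this reading, stylesheets, scripts and images under /core/ and /profiles/ stay crawlable while everything else in those directories is blocked.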

Other Records

Field Value
crawl-delay 10

httpfetch spider

Rule Path
Disallow /

discobot

Rule Path
Disallow /

steeler

Rule Path
Disallow /

linguee

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

sosospider+

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

yandex

Rule Path
Disallow /

fess

Rule Path
Disallow /

exabot

Rule Path
Disallow /

skimbot

Rule Path
Disallow /

globalspec link checker

Rule Path
Disallow /

hiscan

Rule Path
Disallow /

daumoa

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

covariocse

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

zumbot

Rule Path
Disallow /

appcodescrawler

Rule Path
Disallow /

appcodes

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

screenerbot

Rule Path
Disallow /

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

msnbot-media

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

baiduspider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

heritrix

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

dotbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.nyp.org/sitemap.xml
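
The groups, crawl-delay values and sitemap record above can be checked with Python's standard `urllib.robotparser`. This is a sketch only: `ROBOTS_TXT` is a short reconstruction from three of the groups in this report, not the original file, and `examplebot` is a hypothetical crawler name. `urllib.robotparser` treats paths as plain prefixes and does not interpret the `*` and `$` wildcards, so only prefix rules are used here. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (an assumption about the original layout).
ROBOTS_TXT = """\
User-agent: mj12bot
Disallow: /

User-agent: bingbot
Crawl-delay: 10

User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /search/

Sitemap: https://www.nyp.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# mj12bot has its own group that disallows everything.
blocked = parser.can_fetch("mj12bot", "https://www.nyp.org/")
# bingbot has its own group with no path rules, so the "*" rules do not apply to it.
bing_ok = parser.can_fetch("bingbot", "https://www.nyp.org/admin/")
# A crawler without its own group falls back to the "*" group.
other_ok = parser.can_fetch("examplebot", "https://www.nyp.org/admin/")

delay = parser.crawl_delay("bingbot")
sitemaps = parser.site_maps()

print(blocked, bing_ok, other_ok, delay, sitemaps)
```

This matches the report's "No rules defined. All paths allowed." entries: a crawler named in its own group follows only that group, even when the group holds nothing but a crawl-delay.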

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • CSS, JS, Images
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • Resources

Warnings

  • 4 invalid lines.