novinky.cz
robots.txt

Robots Exclusion Standard data for novinky.cz

Resource Scan

Scan Details

Site Domain novinky.cz
Base Domain novinky.cz
Scan Status Ok
Last Scan 2024-11-16T16:24:23+00:00
Next Scan 2024-11-23T16:24:23+00:00

Last Scan

Scanned 2024-11-16T16:24:23+00:00
URL https://novinky.cz/robots.txt
Redirect https://www.novinky.cz/robots.txt
Redirect Domain www.novinky.cz
Redirect Base novinky.cz
Domain IPs 2a02:598:2::151, 2a02:598:a::78:151, 77.75.76.151, 77.75.78.151
Redirect IPs 2a02:598:2::151, 2a02:598:a::78:151, 77.75.76.151, 77.75.78.151
Response IP 77.75.76.151
Found Yes
Hash 742ae466ee494c94e07e11b9081eaf10397429352c2b07b7d276e9d5e8f3ea79
SimHash 480049374009

Groups

*

Rule Path
Disallow /clanek/*timeline--pageItem%3D
Disallow /*mol-gallery--expanded%3D
Disallow /*mol-gallery--selected%3D
Disallow /*fts--order%3Dalphabetical-desc
Disallow /*fts--order%3Daccept-time-desc
Disallow /*fts--order%3Daccept-time-asc
Disallow /*fts--order%3Dranking-asc
Disallow /*fts--order%3Dranking-desc
Disallow /*fts--order%3Drandom
Disallow /fts--search
Disallow /*expandRelatedDocuments
Disallow /*ribbon--menu%3D
Disallow /*ribbon--search%3D
Disallow /diskuse
Disallow /*menu--open
Disallow /*previewToken%3D
Disallow /*bankid%3D
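
The `*` in the paths above is a wildcard that matches any run of characters, and each rule is otherwise a prefix match against the URL path. A minimal sketch of that matching follows, assuming literal comparison with no percent-encoding normalization; the helper names and the sample paths are hypothetical and only a subset of the rules is listed. Because this group contains only Disallow rules, no Allow/Disallow precedence logic is needed here.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is treated literally (an assumption of this sketch).
    """
    anchored = path_pattern.endswith("$")
    if anchored:
        path_pattern = path_pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*'
    regex = ".*".join(re.escape(piece) for piece in path_pattern.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

# A subset of the Disallow rules listed above for the '*' group
RULES = [
    "/clanek/*timeline--pageItem%3D",
    "/*mol-gallery--expanded%3D",
    "/fts--search",
    "/diskuse",
    "/*previewToken%3D",
]

# Hypothetical paths, for illustration only
print(is_disallowed("/diskuse/12345", RULES))
print(is_disallowed("/clanek/some-article-123", RULES))
```

Under these assumptions, the first path is blocked by the `/diskuse` prefix rule and the second matches no rule.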

seznamsocialbot

Rule Path
Allow /
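
A crawler follows the group that names it and falls back to the `*` group otherwise, so seznamsocialbot is governed only by `Allow /` and not by the Disallow rules above. The selection step can be sketched as follows; the `GROUPS` structure, the `select_group` helper, and the user-agent strings are hypothetical, and the case-insensitive substring comparison is a simplification of product-token matching.

```python
# Hypothetical in-memory form of the two groups in this scan
# (only two of the '*' rules are shown)
GROUPS = {
    "*": [("disallow", "/diskuse"), ("disallow", "/fts--search")],
    "seznamsocialbot": [("allow", "/")],
}

def select_group(user_agent: str, groups: dict) -> list:
    """Return the rules of the first named group whose token appears in
    the user agent (case-insensitive), else the rules of the '*' group."""
    ua = user_agent.lower()
    for token, rules in groups.items():
        if token != "*" and token.lower() in ua:
            return rules
    return groups.get("*", [])

# Hypothetical user-agent strings, for illustration only
print(select_group("SeznamSocialBot/1.0", GROUPS))
print(select_group("ExampleBot/2.0", GROUPS))
```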

Other Records

Field Value
sitemap https://www.novinky.cz/sitemaps/sitemap_articles.xml
sitemap https://www.novinky.cz/sitemaps/sitemap_news.xml
sitemap https://www.novinky.cz/sitemaps/sitemap_sections.xml
sitemap https://www.novinky.cz/sitemaps/sitemap_tags.xml

Comments

  • don't crawl pagination on article pages
  • don't crawl the same page with opened gallery
  • from gallery in newsfeed
  • category of photo contests
  • legacy parameters
  • historical parameters
  • article preview
  • social profile preview tab