besttime2travel.com
robots.txt

Robots Exclusion Standard data for besttime2travel.com

Resource Scan

Scan Details

Site Domain besttime2travel.com
Base Domain besttime2travel.com
Scan Status Ok
Last Scan 2024-10-04T08:25:57+00:00
Next Scan 2024-10-11T08:25:57+00:00

Last Scan

Scanned 2024-10-04T08:25:57+00:00
URL https://besttime2travel.com/robots.txt
Redirect https://www.besttime2travel.com/robots.txt
Redirect Domain www.besttime2travel.com
Redirect Base besttime2travel.com
Domain IPs 2406:da18:9d0:143f:2124:4e9c:36a9:d9de, 52.221.42.138
Redirect IPs 104.21.40.220, 172.67.157.37, 2606:4700:3031::6815:28dc, 2606:4700:3033::ac43:9d25
Response IP 104.21.40.220
Found Yes
Hash 7ffbb35f5c83084e61f2588872c624c3899abe3ed78055d0e310850be9cc3104
SimHash b8149f4bcd40

Groups

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

proximic

Rule Path
Disallow /

nerdybot

Rule Path
Disallow /

ezooms robot

Rule Path
Disallow /

perl lwp

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

turnitin robot

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

pi-monster

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

eccp/1.0 (search@eniro.com)

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandex

Rule Path
Disallow /
Comment: blocks access to whole site

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

gsa-crawler (enterprise; t4-knhh62cdkc2w3; gsa_manage@nikon-sys.co.jp)

Rule Path
Disallow /
Disallow /*/details%3Bjsessionid%3D*
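The second rule in the gsa-crawler group uses percent-encoding: `%3B` is `;` and `%3D` is `=`, so the pattern targets session-ID URLs of the form `/*/details;jsessionid=*`. A quick way to confirm the decoding (using only the standard library; the pattern string is taken verbatim from the rule above):

```python
from urllib.parse import unquote

# Percent-encoded rule path from the gsa-crawler group.
pattern = "/*/details%3Bjsessionid%3D*"
decoded = unquote(pattern)
print(decoded)  # /*/details;jsessionid=*
```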

megaindex.ru/2.0

Rule Path
Disallow /

icc-crawler/2.0

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

*

Rule Path
Allow /core/*.css$
Allow /core/*.css?
Allow /core/*.js$
Allow /core/*.js?
Allow /core/*.gif
Allow /core/*.jpg
Allow /core/*.jpeg
Allow /core/*.png
Allow /core/*.svg
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /profiles/*.svg
Allow /country/
Allow /search/pt/
Allow /search/ie/
Allow /search/th/
Allow /search/be/
Allow /search/tt/
Allow /search/za/
Allow /search/au/
Allow /search/tw/
Allow /search/ch/
Allow /search/jp/
Allow /search/hk/
Allow /search/cl/
Allow /search/ar/
Allow /search/cr/
Allow /search/to/
Allow /search/si/
Allow /search/mx/
Allow /search/ec/
Allow /search/ie/
Allow /search/th/
Allow /search/za/
Allow /search/be/
Allow /search/cz/
Allow /country/
Allow /search/
Disallow /search/
Disallow /core/
Disallow /profiles/
Disallow /README.txt
Disallow /web.config
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /index.php/admin/
Disallow /index.php/comment/reply/
Disallow /index.php/filter/tips
Disallow /index.php/node/add/
Disallow /index.php/search/
Disallow /index.php/user/password/
Disallow /index.php/user/register/
Disallow /index.php/user/login/
Disallow /index.php/user/logout/
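The groups above can be checked programmatically with Python's standard `urllib.robotparser`. A minimal sketch, using a simplified excerpt of the rules (the bot name `SomeOtherBot` is a made-up stand-in for any agent not listed by name); note that the stdlib parser does not implement the `*` and `$` wildcard path patterns seen in the `/core/` and `/profiles/` rules, so those lines are omitted here:

```python
from urllib.robotparser import RobotFileParser

# Simplified excerpt of the rules listed above.
rules = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Allow: /country/
Disallow: /search/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# AhrefsBot is blocked site-wide, even from allowed paths.
print(rp.can_fetch("AhrefsBot", "https://www.besttime2travel.com/country/"))      # False
# An unlisted agent falls through to the * group.
print(rp.can_fetch("SomeOtherBot", "https://www.besttime2travel.com/country/"))   # True
print(rp.can_fetch("SomeOtherBot", "https://www.besttime2travel.com/admin/"))     # False
```

Because `Allow: /country/` precedes `Disallow: /search/` in the `*` group, country pages stay crawlable for general bots while search, core, and admin paths are excluded.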

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • User-agent: ChatGPT
  • Disallow: /
  • User-agent: OpenAI
  • Disallow: /
  • Block MJ12bot as it is just noise
  • Block Ahrefs
  • Block Sogou
  • Block SEOkicks
  • Block BlexBot
  • Block SISTRIX
  • Block Uptime robot
  • Block Ezooms Robot
  • Block Perl LWP
  • Block BlexBot
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • Block WiseGuys Robot
  • Block Turnitin Robot
  • Block pricepi
  • Block Searchmetrics Bot
  • Block Eniro
  • Block YandexBot
  • Block Baidu
  • User-agent: Baiduspider
  • User-agent: Baiduspider-video
  • User-agent: Baiduspider-image
  • Disallow: /
  • Block SoGou
  • Block Youdao
  • Block Nikon JP Crawler
  • Block PDP URL with JsessionID
  • Block MegaIndex.ru
  • CSS, JS, Images
  • Country Pages
  • Directories
  • Files
  • Paths (clean URLs)
  • Disallow: /node/add/
  • Paths (no clean URLs)

Warnings

  • 2 invalid lines.