cinema.com.my
robots.txt

Robots Exclusion Standard data for cinema.com.my

Resource Scan

Scan Details

Site Domain cinema.com.my
Base Domain cinema.com.my
Scan Status Ok
Last Scan 2024-10-30T02:12:49+00:00
Next Scan 2024-11-06T02:12:49+00:00

Last Scan

Scanned 2024-10-30T02:12:49+00:00
URL https://cinema.com.my/robots.txt
Domain IPs 103.197.57.4
Response IP 103.197.57.4
Found Yes
Hash c6f2c117beca7b938629e7ff1c57c221ef4c272cdb60cece66f7be5e6b7a083f
SimHash 290c811d48b1

Groups

*

Rule Path
Disallow /forum35/
Disallow /fox/
Disallow /mc2011/
Disallow /id/
Disallow /bm/
Disallow /inhome/
Disallow /videos/
Disallow /bios/bios_contents.aspx
Disallow /news/newslist.aspx
Disallow /showtimes/default.asp
Disallow /showtimes/cinemas_showtimes.aspx
Disallow /showtimes/movie_showtimes.aspx
Disallow /showtimes/showtimes.aspx
Disallow /movies/details.aspx?fb_comment_id*
Disallow /moviefreak/moviefreak_details.aspx?search=*.mf_*
Disallow /moviefreak/trivia_details.aspx?search=*.mf_*
Disallow /news/interviews.aspx?ribbitwinawardatniff
Disallow /articles/features_details.aspx/?search*
Disallow /articles/news_details.aspx/?search*
Disallow /articles/features_details.aspx/news_details.aspx*
Disallow /articles/features_details.aspx/gallery_details.aspx*
Disallow /articles/features_details.aspx/features_details.aspx*
Disallow /articles/features_details.aspx/cinnamon_diary_details.aspx*
Disallow /articles/features_details.aspx/events_details.aspx*
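
Several of the paths above use the `*` wildcard, which Python's standard `urllib.robotparser` does not interpret. The following is a minimal sketch of how such rules might be evaluated, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a literal prefix. The names `DISALLOW`, `rule_to_regex`, and `is_disallowed` are hypothetical and only a subset of the listed rules is copied in. Because this group has no Allow rules, the sketch skips longest-match precedence.

```python
import re

# A few Disallow paths copied from the "*" group above (not the full list).
DISALLOW = [
    "/forum35/",
    "/videos/",
    "/showtimes/showtimes.aspx",
    "/movies/details.aspx?fb_comment_id*",
    "/moviefreak/moviefreak_details.aspx?search=*.mf_*",
]


def rule_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed convention: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and all other characters are literal.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url_path, rules=DISALLOW):
    """Return True if any Disallow pattern matches from the start of url_path."""
    # re.match anchors at the beginning, which gives prefix semantics.
    return any(rule_to_regex(rule).match(url_path) for rule in rules)
```

For example, `is_disallowed("/movies/details.aspx?fb_comment_id=123")` matches the wildcard rule, while `is_disallowed("/movies/details.aspx?id=1")` does not, so the plain movie detail page stays crawlable under these rules.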

Comments

  • Prevent crawling DFP tags
  • Disallow: /55577451/
  • Directories
  • Files
  • Path matches
  • User-agent: Googlebot-News
  • Disallow: