sinarharian.com.my
robots.txt

Robots Exclusion Standard data for sinarharian.com.my

Resource Scan

Scan Details

Site Domain sinarharian.com.my
Base Domain sinarharian.com.my
Scan Status Ok
Last Scan 2024-11-18T06:04:51+00:00
Next Scan 2024-11-25T06:04:51+00:00

Last Scan

Scanned 2024-11-18T06:04:51+00:00
URL https://sinarharian.com.my/robots.txt
Redirect https://www.sinarharian.com.my/robots.txt
Redirect Domain www.sinarharian.com.my
Redirect Base sinarharian.com.my
Domain IPs 104.18.87.98, 104.18.88.98
Redirect IPs 104.18.87.98, 104.18.88.98
Response IP 104.18.88.98
Found Yes
Hash b284a2a13bc22f4945420651254fec7eaac0f16e2984a2141296522e8fd5270b
SimHash 91ed70476ffc
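
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The scan does not state the algorithm or exactly which bytes were hashed, so the following is only a sketch under the assumption that it is a SHA-256 of the fetched robots.txt body. The helper name `sha256_hex` and the sample body are hypothetical.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, NOT the real file served by the site.
sample = b"User-agent: *\nDisallow: /rss\n"
digest = sha256_hex(sample)

# A SHA-256 hex digest is always 64 characters long.
print(len(digest), digest)
```

Comparing two such digests from consecutive scans would show whether the file changed byte for byte, assuming the scanner hashes the raw body.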

Groups

*

Rule Path
Disallow /ajax/*
Disallow /print*
Disallow /getRelatedArticles*
Disallow /getMostReadArticles*
Disallow /article_count/*
Disallow /get-menu-header*
Disallow /search*
Disallow /morearticles/*
Disallow /article.php*
Disallow /login-mgt
Disallow /*.php
Disallow /archive/*
Disallow /rss
Disallow /rssFeed/*
Disallow /widget/*
Disallow */page/*
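
The rules in the "*" group use the `*` wildcard, which in the Robots Exclusion Protocol matches any run of characters, while a rule without a wildcard matches as a path prefix. A minimal sketch of how a crawler might test a path against these rules follows. The function names and the sample paths are hypothetical, and because this group has no Allow rules, the sketch treats any match as disallowed rather than implementing longest-match precedence.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/ajax/*", "/print*", "/getRelatedArticles*", "/getMostReadArticles*",
    "/article_count/*", "/get-menu-header*", "/search*", "/morearticles/*",
    "/article.php*", "/login-mgt", "/*.php", "/archive/*", "/rss",
    "/rssFeed/*", "/widget/*", "*/page/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# Hypothetical paths, chosen only to exercise the rules.
for path in ["/search?q=berita", "/berita/page/2", "/berita/nasional"]:
    print(path, is_disallowed(path))
```

Under these assumptions the first two sample paths match `/search*` and `*/page/*` respectively, and the third matches no rule.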

grapeshot

Rule Path
Disallow
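
An empty Disallow value means nothing is disallowed for that group, so the grapeshot crawler is permitted everywhere while other crawlers fall back to the "*" group. The sketch below illustrates this with Python's standard `urllib.robotparser`. The robots.txt text is a shortened reconstruction from the groups listed above, not the raw file, and it keeps only literal-prefix rules because `urllib.robotparser` does not expand `*` wildcards inside paths. The user agent name `examplebot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction of the scanned groups (an assumption, not the
# raw file). Only rules without wildcards are kept.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rss
Disallow: /login-mgt

User-agent: grapeshot
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://www.sinarharian.com.my/rss"

# grapeshot has its own group with an empty Disallow: everything is allowed.
print("grapeshot:", parser.can_fetch("grapeshot", url))

# Any other agent (hypothetical name) falls back to the "*" group.
print("examplebot:", parser.can_fetch("examplebot", url))
```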