cna.com.tw
robots.txt

Robots Exclusion Standard data for cna.com.tw

Resource Scan

Scan Details

Site Domain cna.com.tw
Base Domain cna.com.tw
Scan Status Ok
Last Scan 2024-09-20T07:25:05+00:00
Next Scan 2024-09-27T07:25:05+00:00

Last Scan

Scanned 2024-09-20T07:25:05+00:00
URL https://cna.com.tw/robots.txt
Redirect https://www.cna.com.tw/robots.txt
Redirect Domain www.cna.com.tw
Redirect Base cna.com.tw
Domain IPs 34.160.71.205
Redirect IPs 34.160.71.205
Response IP 34.160.71.205
Found Yes
Hash 73b1c90d0d4a321855167b3467e55d281b07e5e69ba759d5c7ed6274a572b00f
SimHash 6805d6591c58
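
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so the following is only a minimal sketch under that assumption; the function name `robots_digest` is hypothetical, and the exact bytes the scanner hashed (raw body, normalized text, and so on) are unknown.

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Comparing digests from two scans is a cheap way to detect any change.
previous = robots_digest(b"User-agent: *\nDisallow: /gallery/*\n")
current = robots_digest(b"User-agent: *\nDisallow: /gallery/*\n")
changed = previous != current
```

An exact digest only says whether the file changed at all; the SimHash field presumably serves the complementary purpose of estimating how similar two versions are.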

Groups

*

Rule Path
Allow /.well-known/amphtml/apikey.pub
Disallow /culture/search.aspx?q=*
Disallow /Postwrite/search?q=*
Disallow /video/search?q=*
Disallow /Proj_GoodBook/Search/*/*.aspx
Disallow /Proj_GoodBook/search.aspx?q=*
Disallow /culture/topicpreview/*
Disallow /culture/articlepreview/*
Disallow /gallery/*
Disallow /Proj_County/*
Disallow /proj_county/*
Disallow /search/hysearchws.aspx?q=*
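
The rules above use `*` wildcards, which Python's standard `urllib.robotparser` does not expand. The sketch below shows one way such rules could be evaluated, following the longest-match convention described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` and the example paths in the usage note are hypothetical; this is not necessarily how any particular crawler handles this site.

```python
import re

# Rules copied from the "*" group above, in listed order.
RULES = [
    ("allow", "/.well-known/amphtml/apikey.pub"),
    ("disallow", "/culture/search.aspx?q=*"),
    ("disallow", "/Postwrite/search?q=*"),
    ("disallow", "/video/search?q=*"),
    ("disallow", "/Proj_GoodBook/Search/*/*.aspx"),
    ("disallow", "/Proj_GoodBook/search.aspx?q=*"),
    ("disallow", "/culture/topicpreview/*"),
    ("disallow", "/culture/articlepreview/*"),
    ("disallow", "/gallery/*"),
    ("disallow", "/Proj_County/*"),
    ("disallow", "/proj_county/*"),
    ("disallow", "/search/hysearchws.aspx?q=*"),
]


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be fetched under the given rules."""
    best_len = -1
    allowed = True  # A path with no matching rule is allowed by default.
    for kind, pattern in rules:
        # re.match anchors at the start of the path; matching is case-sensitive.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/gallery/photo1")` is False while `is_allowed("/culture/search.aspx")` is True, because the search rule only matches when the `?q=` query string is present. Matching is case-sensitive, which would explain why the file lists both `/Proj_County/*` and `/proj_county/*`.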

Other Records

Field Value
sitemap https://www.cna.com.tw/atomfeed_cfp.xml
sitemap https://www.cna.com.tw/sitemap_fromRemote_cfp.xml
sitemap https://www.cna.com.tw/GoogleNewsSitemap_fromRemote_cfp.xml

Comments

  • User-agent: ia_archiver
  • Disallow: /MakerList/Index?*
  • Disallow: /MakerContent/Index?*
  • Disallow: /VideoList/Index?*
  • Disallow: /VideoContent/Index?*
  • Disallow: /postwrite2021/*
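
The entries under Comments look like directives that were commented out in the file, so they belong to no active group. As a rough illustration of how a scanner might separate groups, sitemap records, and comments, here is a minimal sketch. The function `parse_robots` and the `SAMPLE` text are hypothetical; the sample is assembled from values in this report and is not the actual file served by the site.

```python
def parse_robots(text):
    """Split robots.txt text into rule groups, sitemap URLs, and comments."""
    groups = {}          # user-agent -> list of (rule, path)
    sitemaps = []
    comments = []
    current_agents = []
    last_was_agent = False
    for raw in text.splitlines():
        # Everything after '#' is a comment and never takes effect.
        line, sep, comment = raw.partition("#")
        if sep and comment.strip():
            comments.append(comment.strip())
        line = line.strip()
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if field == "user-agent":
            # Consecutive user-agent lines share one group; otherwise start anew.
            if not last_was_agent:
                current_agents = []
            current_agents.append(value)
            groups.setdefault(value, [])
            last_was_agent = True
        elif field in ("allow", "disallow"):
            for agent in current_agents:
                groups[agent].append((field, value))
            last_was_agent = False
        elif field == "sitemap":
            # Sitemap records are global and do not belong to a group.
            sitemaps.append(value)
    return groups, sitemaps, comments


# Hypothetical sample, not the real file contents.
SAMPLE = """\
User-agent: *
Disallow: /gallery/*
#User-agent: ia_archiver
#Disallow: /postwrite2021/*
Sitemap: https://www.cna.com.tw/atomfeed_cfp.xml
"""

groups, sitemaps, comments = parse_robots(SAMPLE)
```

With this sample, the `ia_archiver` lines end up in `comments` rather than in `groups`, which mirrors how this report lists them separately from the `*` group.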