m.104.com.tw
robots.txt

Robots Exclusion Standard data for m.104.com.tw

Resource Scan

Scan Details

Site Domain m.104.com.tw
Base Domain 104.com.tw
Scan Status Ok
Last Scan 2024-06-06T20:37:58+00:00
Next Scan 2024-06-20T20:37:58+00:00

Last Scan

Scanned 2024-06-06T20:37:58+00:00
URL https://m.104.com.tw/robots.txt
Domain IPs 122.147.53.41
Response IP 122.147.53.41
Found Yes
Hash 1d50b32a90ecf0332ad9024849802d59ff2edcf47fea25f126d03b7ecea15308
SimHash d050d4f2a901

Groups

*

Rule Path
Allow /$
Allow /search/
Allow /job/
Allow /company/
Allow /custSearch/
Disallow *keyword%3D*xyz*
Disallow *keyword%3D*telegram*
Disallow *keyword%3D*tg%3A*
Disallow *keyword%3D*.*
Disallow *keyword%3D*%3A*
Disallow *keyword%3D*%3B*
Disallow *keyword%3D*%28*
Disallow *keyword%3D*%7C*
Disallow *keyword%3D*/*
Disallow *keyword%3D*~*
Disallow *keyword%3D*%40*
Disallow *excludeCompanyByCustno%3D*
Disallow *excludeIndustryCat%3D*
Disallow *expansionType%3D*
Disallow *langStatus%3D*
Disallow *remoteWork%3D*
Disallow *recommendJob%3D*
Disallow *langStatus%3D*
Disallow *langFlag%3D*
Disallow *hotJob%3D*
Disallow *kwop%3D*
Disallow *order%3D*
Disallow *page%3D*
Disallow *mode%3D*
Disallow *asc%3D*
Disallow *irsTag%3D*
Disallow *isnew%3D*
Disallow *ro%3D*
Disallow *utm_*
Disallow /
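
The way the rules in the * group interact can be sketched in a few lines of Python. This is a minimal illustration, not the site's or any crawler's actual implementation. It assumes the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie), treats a leading `*` as a valid wildcard, and compares strings literally with no percent-encoding normalization. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and only a subset of the rules is included.

```python
import re

# A subset of the "*" group rules listed above, as (directive, pattern) pairs.
RULES = [
    ("allow", "/$"),
    ("allow", "/search/"),
    ("allow", "/job/"),
    ("disallow", "*keyword%3D*telegram*"),
    ("disallow", "*page%3D*"),
    ("disallow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` is allowed under longest-match precedence."""
    best = None  # (pattern length, is_allow); tuple order makes Allow win ties
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    # Assumption: a path that matches no rule is allowed by default.
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed("/")` and `is_allowed("/job/123")` return True because an Allow pattern is longer than the catch-all `Disallow /`, while `is_allowed("/job/123?page%3D2")` returns False because `*page%3D*` is longer than `/job/`.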

gptbot

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 10

jobdiggerspider
cliqzbot
trovitbot

Rule Path
Disallow /

linkedinbot

Rule Path
Disallow /job/
Disallow /company/
Disallow /jobs/
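
The agent-specific groups above use plain path prefixes, so they can be checked with Python's standard `urllib.robotparser`. The sketch below is hedged: the robots.txt lines are reconstructed from the scan tables on this page rather than copied from the live file, and the example URLs are hypothetical. The * group is left out because `urllib.robotparser` does not interpret the `*` and `$` wildcards it relies on.

```python
from urllib.robotparser import RobotFileParser

# Assumption: these lines are rebuilt from the scan tables above,
# not taken verbatim from https://m.104.com.tw/robots.txt.
ROBOTS_LINES = [
    "User-agent: jobdiggerspider",
    "User-agent: cliqzbot",
    "User-agent: trovitbot",
    "Disallow: /",
    "",
    "User-agent: linkedinbot",
    "Disallow: /job/",
    "Disallow: /company/",
    "Disallow: /jobs/",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# linkedinbot is blocked only under the three listed prefixes.
linkedin_job = parser.can_fetch("linkedinbot", "https://m.104.com.tw/job/123")
linkedin_search = parser.can_fetch("linkedinbot", "https://m.104.com.tw/search/")

# trovitbot shares a group that disallows the whole site.
trovit_root = parser.can_fetch("trovitbot", "https://m.104.com.tw/")
```

With these inputs, `linkedin_job` is False, `linkedin_search` is True, and `trovit_root` is False.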

Warnings

  • 1 invalid line.