Robots Exclusion Standard data for 104.com.tw

Resource Scan

Scan Details

Site Domain 104.com.tw
Base Domain 104.com.tw
Scan Status Ok
Last Scan 2024-06-15T20:02:08+00:00
Next Scan 2024-06-29T20:02:08+00:00

Last Scan

Scanned 2024-06-15T20:02:08+00:00
URL https://104.com.tw/robots.txt
Redirect https://www.104.com.tw/robots.txt
Redirect Domain www.104.com.tw
Redirect Base 104.com.tw
Domain IPs 104.18.9.226
Redirect IPs 104.18.8.226, 104.18.9.226
Response IP 104.18.8.226
Found Yes
Hash 741321c55a38dc3c3df400d88117b146403959524ba831cefe757860f062b24b
SimHash d0541432e901

Groups

*

Rule Path
Allow /$
Allow /job/
Allow /company/
Allow /jobs/
Allow /expats/
Allow /mobile/
Allow /faq/
Allow /feedback/
Allow /csr/
Allow /favicon.ico
Disallow *keyword%3D*xyz*
Disallow *keyword%3D*telegram*
Disallow *keyword%3D*tg%3A*
Disallow *keyword%3D*.*
Disallow *keyword%3D*%3A*
Disallow *keyword%3D*%3B*
Disallow *keyword%3D*%28*
Disallow *keyword%3D*%7C*
Disallow *keyword%3D*/*
Disallow *keyword%3D*~*
Disallow *keyword%3D*%40*
Disallow *excludeCompanyByCustno%3D*
Disallow *excludeIndustryCat%3D*
Disallow *expansionType%3D*
Disallow *langStatus%3D*
Disallow *remoteWork%3D*
Disallow *recommendJob%3D*
Disallow *langStatus%3D*
Disallow *langFlag%3D*
Disallow *hotJob%3D*
Disallow *kwop%3D*
Disallow *order%3D*
Disallow *page%3D*
Disallow *mode%3D*
Disallow *asc%3D*
Disallow *irsTag%3D*
Disallow *isnew%3D*
Disallow *ro%3D*
Disallow *utm_*
Disallow *%26/students/%3D*null%26*
Disallow /
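
The rules in this group only make sense together with a precedence rule. Under RFC 9309, the matching rule with the longest pattern wins, and Allow wins a tie. The sketch below illustrates that evaluation in Python on a small subset of the rules above. It assumes paths are compared in the same percent-encoded form the scan reports (for example `%3D`), and the helper names `pattern_to_regex` and `is_allowed` are hypothetical, not part of any library.

```python
import re

# Subset of the "*" group above, in the form shown by the scan.
RULES = [
    ("Allow", "/$"),
    ("Allow", "/jobs/"),
    ("Disallow", "*page%3D*"),
    ("Disallow", "/"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    Longest matching pattern wins; on equal length Allow beats Disallow.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

for path in ["/", "/jobs/123", "/jobs/search?page%3D2", "/about"]:
    print(path, is_allowed(RULES, path))
```

In this subset, `Allow /jobs/` outranks the catch-all `Disallow /` because its pattern is longer, while `Disallow *page%3D*` outranks `Allow /jobs/` for the same reason. Whether a given crawler percent-encodes `=` before matching is not something the scan data settles.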

gptbot

Rule Path
Allow /$
Allow /jobs/
Allow /faq/
Allow /expats/
Allow /feedback/
Allow /csr/
Disallow /

Other Records

Field Value
crawl-delay 10

jobdiggerspider
cliqzbot
trovitbot

Rule Path
Disallow /

linkedinbot

Rule Path
Disallow /job/
Disallow /company/
Disallow /jobs/
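
Each crawler obeys only one of the groups listed above. A minimal sketch of that selection in Python follows, assuming a simplified case-insensitive substring match of the group's user-agent token against the crawler's User-Agent string, with `*` as the fallback. The `GROUPS` structure and `select_group` are hypothetical names for illustration, and the `*` group is abbreviated to two of its rules.

```python
# Groups from the scan: (user-agent tokens, rules).
GROUPS = [
    (["*"], [("Allow", "/$"), ("Disallow", "/")]),  # abbreviated
    (["gptbot"], [
        ("Allow", "/$"), ("Allow", "/jobs/"), ("Allow", "/faq/"),
        ("Allow", "/expats/"), ("Allow", "/feedback/"), ("Allow", "/csr/"),
        ("Disallow", "/"),
    ]),
    (["jobdiggerspider", "cliqzbot", "trovitbot"], [("Disallow", "/")]),
    (["linkedinbot"], [
        ("Disallow", "/job/"), ("Disallow", "/company/"), ("Disallow", "/jobs/"),
    ]),
]

def select_group(groups, user_agent):
    """Return the (tokens, rules) pair that applies to `user_agent`.

    A named token that appears in the User-Agent string (ignoring case)
    takes priority; otherwise the '*' group is used. Returns None when
    there is neither a named match nor a '*' group.
    """
    ua = user_agent.lower()
    fallback = None
    for tokens, rules in groups:
        for token in tokens:
            if token == "*":
                fallback = (tokens, rules)
            elif token.lower() in ua:
                return tokens, rules
    return fallback

tokens, rules = select_group(GROUPS, "LinkedInBot/1.0 (compatible)")
print(tokens, rules)
```

Because a named group replaces the `*` group rather than adding to it, a crawler matched as `linkedinbot` is bound only by the three Disallow rules in its own group.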

Warnings

  • 1 invalid line.