marieclaire.com.tw
robots.txt

Robots Exclusion Standard data for marieclaire.com.tw

Resource Scan

Scan Details

Site Domain	marieclaire.com.tw
Base Domain	marieclaire.com.tw
Scan Status	Ok
Last Scan	2024-11-13T17:14:42+00:00
Next Scan	2024-11-20T17:14:42+00:00

Last Scan

Scanned	2024-11-13T17:14:42+00:00
URL	https://marieclaire.com.tw/robots.txt
Redirect	https://www.marieclaire.com.tw/robots.txt
Redirect Domain	www.marieclaire.com.tw
Redirect Base	marieclaire.com.tw
Domain IPs	35.241.47.28
Redirect IPs	35.241.47.28
Response IP	35.241.47.28
Found	Yes
Hash	c9b51bda55f1c7c238449606a77ae8514fefa176d1d6978b2fce19724845b964
SimHash	2a96cc30c793

Groups

*

Rule	Path
Disallow	/preview/
Disallow	/admin/
Disallow	/ap/
Disallow	/etc/
Disallow	/tmp/
Disallow	/ADdemo/
Disallow	/channel/demo_view/
Disallow	/mobile/
Disallow	/talk/view/
Disallow	/slide_content/
Disallow	/slide/
Disallow	/slice/
Disallow	/share/
Disallow	/share/fb/
Disallow	/insight/
Disallow	/ajax/
Disallow	/api/

gptbot

Rule	Path
Disallow	/

peer39_crawler

Rule	Path
Disallow	/

velenpublicwebcrawler

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

Other Records

Field	Value
crawl-delay	10

Other Records

Field	Value
sitemap	https://www.marieclaire.com.tw/sitemap.xml
sitemap	https://www.marieclaire.com.tw/google-news.xml
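
The groups and records listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser, not the site's actual file: the robots.txt text is reconstructed from the scan tables, the order of the groups is taken from the report, and placing crawl-delay inside the * group is an assumption, since the scan lists it only as a standalone record. The user agent ExampleBot and the paths /admin/login and /fashion/ are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Paths disallowed for every crawler ("*" group), as listed in the scan.
WILDCARD_DISALLOWS = [
    "/preview/", "/admin/", "/ap/", "/etc/", "/tmp/", "/ADdemo/",
    "/channel/demo_view/", "/mobile/", "/talk/view/", "/slide_content/",
    "/slide/", "/slice/", "/share/", "/share/fb/", "/insight/",
    "/ajax/", "/api/",
]

# Crawlers that the scan reports as blocked from the whole site.
BLOCKED_AGENTS = [
    "gptbot", "peer39_crawler", "velenpublicwebcrawler",
    "blexbot", "petalbot", "trendictionbot",
]

SITEMAPS = [
    "https://www.marieclaire.com.tw/sitemap.xml",
    "https://www.marieclaire.com.tw/google-news.xml",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the scanned records."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in WILDCARD_DISALLOWS]
    # Assumption: crawl-delay belongs to the "*" group; the scan does not say.
    lines += ["Crawl-delay: 10", ""]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += [f"Sitemap: {url}" for url in SITEMAPS]
    return "\n".join(lines) + "\n"


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

BASE = "https://www.marieclaire.com.tw"

# A named, fully blocked crawler may not fetch anything.
gptbot_home = parser.can_fetch("GPTBot", BASE + "/")

# A crawler without its own group falls back to the "*" rules.
generic_admin = parser.can_fetch("ExampleBot", BASE + "/admin/login")
generic_article = parser.can_fetch("ExampleBot", BASE + "/fashion/")

generic_delay = parser.crawl_delay("ExampleBot")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(gptbot_home, generic_admin, generic_article, generic_delay, sitemaps)
```

Note that /share/fb/ is already covered by /share/ under prefix matching, so the second rule does not change the outcome for any URL.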
