combine.tw
robots.txt

Robots Exclusion Standard data for combine.tw

Resource Scan

Scan Details

Site Domain	combine.tw
Base Domain	combine.tw
Scan Status	Ok
Last Scan	2025-09-09T22:51:21+00:00
Next Scan	2025-09-16T22:51:21+00:00

Last Scan

Scanned	2025-09-09T22:51:21+00:00
URL	https://combine.tw/robots.txt
Domain IPs	104.21.13.214, 172.67.133.42, 2606:4700:3031::6815:dd6, 2606:4700:3031::ac43:852a
Response IP	172.67.133.42
Found	Yes
Hash	71431d1a6ee786275ee578685f104ccfb5d2146ff02ee175898a0c0dd96396f8
SimHash	36757be32512

Groups

googlebot

Rule	Path
Allow	*.js
Allow	*.css
Allow	/wp-admin/admin-ajax.php
Disallow	/?s=
Disallow	/search/
Disallow	/wp-admin
Disallow	/*/feed/
Disallow	/wp-login.php
Disallow	/wp-register.php
Disallow	/trackback/
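
The googlebot rules above use `*` wildcards, so plain prefix matching is not enough to evaluate them. The following is a minimal sketch of how such rules are commonly resolved, assuming the longest-match convention described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names `GOOGLEBOT_RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical helpers for illustration, not part of the scan report or of any crawler's actual implementation.

```python
import re

# Rules copied from the googlebot group above, in the order listed.
GOOGLEBOT_RULES = [
    ("allow", "*.js"),
    ("allow", "*.css"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/?s="),
    ("disallow", "/search/"),
    ("disallow", "/wp-admin"),
    ("disallow", "/*/feed/"),
    ("disallow", "/wp-login.php"),
    ("disallow", "/wp-register.php"),
    ("disallow", "/trackback/"),
]


def pattern_to_regex(pattern):
    """Translate a rule path: '*' matches any run of characters,
    and a trailing '$' anchors the match at the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=GOOGLEBOT_RULES):
    """Assumed semantics: longest matching pattern wins, Allow wins a tie,
    and a path that matches no rule is allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumed semantics, `is_allowed("/wp-admin/admin-ajax.php")` is true while `is_allowed("/wp-admin/options.php")` is false. A script such as `/wp-admin/js/common.js` would also be disallowed, because `/wp-admin` is a longer pattern than `*.js`.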

*

Rule	Path
Allow	/wp-admin/admin-ajax.php
Disallow	/wp-admin
Disallow	/wp-login.php
Disallow	/trackback/
Disallow	/wp-register.php
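
The `*` group contains no wildcards, so it can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds the group as robots.txt text from the table above; this is a reconstruction, not the verbatim file served by the site. The agent name `examplebot` and the helpers `build_parser` and `can_crawl` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group table above; not the verbatim file.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin
Disallow: /wp-login.php
Disallow: /trackback/
Disallow: /wp-register.php
"""


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text held in memory, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def can_crawl(path, agent="examplebot"):
    """Return True if the hypothetical agent may fetch the given path."""
    return build_parser().can_fetch(agent, "https://combine.tw" + path)
```

Note that `urllib.robotparser` applies rules in file order and uses the first one that matches, so the Allow line for `admin-ajax.php` takes effect here only because it is listed before `Disallow: /wp-admin`.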

Other Records

Field	Value
sitemap	https://combine.tw/sitemap_index.xml
sitemap	https://combine.tw/post-sitemap.xml
sitemap	https://combine.tw/sitemap.xml
sitemap	https://combine.tw/news-sitemap.xml
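
The sitemap records above can be pulled out of raw robots.txt text with a few lines of string handling. This is a small sketch assuming the file uses the usual `Sitemap: <url>` line form; the sample text is reconstructed from the table, and `sitemap_urls` is a hypothetical helper name.

```python
# Reconstructed from the "Other Records" table above; not the verbatim file.
SAMPLE = """\
# Add all sitemaps
Sitemap: https://combine.tw/sitemap_index.xml
Sitemap: https://combine.tw/post-sitemap.xml
Sitemap: https://combine.tw/sitemap.xml
Sitemap: https://combine.tw/news-sitemap.xml
"""


def sitemap_urls(robots_txt):
    """Collect the values of all 'sitemap' fields, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split on the first colon only, so the URL's own colons survive.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls
```

Calling `sitemap_urls(SAMPLE)` returns the four URLs in the order they are listed.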

Comments

This virtual robots.txt file was created by the Virtual Robots.txt WordPress plugin: https://www.wordpress.org/plugins/pc-robotstxt/
WordPress Robots.txt Boilerplate v1.1 by Daniel Cuttridge
UTF-8 BOM Tested & Approved
Allow files critical for rendering
Allow AJAX - Do Not Remove
Prevent crawl-budget waste on search pages
Prevent private admin areas from being crawled
Prevent duplicate /feed/ pages from being crawled
Prevent login page crawls etc
Prevent register page crawls etc
Prevent Trackback Neg SEO
Allow AJAX - Do Not Remove
Add all sitemaps
