websitehcm.com
robots.txt

Robots Exclusion Standard data for websitehcm.com

Resource Scan

Scan Details

Site Domain	websitehcm.com
Base Domain	websitehcm.com
Scan Status	Ok
Last Scan	2025-10-12T10:23:41+00:00
Next Scan	2025-11-11T10:23:41+00:00

Last Scan

Scanned	2025-10-12T10:23:41+00:00
URL	https://websitehcm.com/robots.txt
Domain IPs	103.75.186.15
Response IP	103.75.186.15
Found	Yes
Hash	96021a7bda030646b2e264015f0d9bbfd4007ec0ea3563de8a4bfe7f59e0dedb
SimHash	a152d9210c5b

Groups

*

Rule	Path
Disallow	/search/
Disallow	/?s=
Disallow	*/1000
Disallow	*/1000/
Disallow	*//1000
Disallow	*//1000/
Disallow	*?amp

*

Rule	Path
Allow	/

Other Records

Field	Value
sitemap	https://websitehcm.com/sitemap_index.xml

Comments

Block every URL that contains /1000 or ends with /1000
Block every URL that contains //1000 (two slashes)
Block every URL that contains ?amp
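
The comments above describe what the wildcard rules are meant to block. As a rough illustration of how a crawler might evaluate them, the sketch below translates each pattern into a regular expression and picks the longest matching rule. It assumes the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the URL, both `*` groups are merged, and an Allow wins a tie in length. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and exist only for this example; individual crawlers may behave differently.

```python
import re

# Rules copied from the two "*" groups in the scan above.
RULES = [
    ("disallow", "/search/"),
    ("disallow", "/?s="),
    ("disallow", "*/1000"),
    ("disallow", "*/1000/"),
    ("disallow", "*//1000"),
    ("disallow", "*//1000/"),
    ("disallow", "*?amp"),
    ("allow", "/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; every other character is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path (including any query string) may be crawled.

    Assumption: the longest matching pattern wins, and an Allow rule
    wins when an Allow and a Disallow pattern have the same length.
    """
    best = None  # (kind, pattern) of the most specific match so far
    for kind, pattern in rules:
        if not pattern_to_regex(pattern).match(path):
            continue
        longer = best is None or len(pattern) > len(best[1])
        tie_for_allow = (
            best is not None
            and len(pattern) == len(best[1])
            and kind == "allow"
        )
        if longer or tie_for_allow:
            best = (kind, pattern)
    # No matching rule at all means the path is crawlable.
    return best is None or best[0] == "allow"


if __name__ == "__main__":
    for candidate in ["/", "/blog/post", "/search/foo", "/?s=abc",
                      "/blog/post/1000", "/blog/10001", "/blog/post?amp"]:
        print(candidate, is_allowed(candidate))
```

Because the patterns are not end-anchored, `*/1000` also matches a path such as `/blog/10001` under this interpretation, which is consistent with the "contains /1000" wording in the comments. `Allow: /` is the shortest pattern, so it only decides the outcome when no Disallow rule matches.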
