mia.org.my
robots.txt

Robots Exclusion Standard data for mia.org.my

Resource Scan

Scan Details

Site Domain	mia.org.my
Base Domain	mia.org.my
Scan Status	Ok
Last Scan	2026-01-22T06:16:46+00:00
Next Scan	2026-02-21T06:16:46+00:00
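The two timestamps above suggest a fixed rescan interval. A minimal sketch of the arithmetic follows, assuming the scanner simply adds a constant offset to the last scan time; the variable names are illustrative and not part of the scan record.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2026-01-22T06:16:46+00:00")
next_scan = datetime.fromisoformat("2026-02-21T06:16:46+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)
```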

Last Scan

Scanned	2026-01-22T06:16:46+00:00
URL	https://mia.org.my/robots.txt
Domain IPs	104.26.12.139, 104.26.13.139, 172.67.74.84, 2606:4700:20::681a:c8b, 2606:4700:20::681a:d8b, 2606:4700:20::ac43:4a54
Response IP	172.67.74.84
Found	Yes
Hash	a9ca965126620991bac6fba8445a21a6b5f3837765179f945b275158b20ff791
SimHash	44354913cd54

Groups

*

Rule	Path
Allow	/

amazonbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/*blackhole
Disallow	/?blackhole

*

Rule	Path
Disallow
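
The groups listed above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch, not the site's actual file: `build_robots_txt` is a hypothetical helper that rebuilds a robots.txt from the parsed groups, so the ordering, spacing, and comments of the real file are assumptions. Python's parser keeps only the first `*` group and does not appear to support `*` wildcards in paths, so the `blackhole` rules are included for completeness but not exercised here.

```python
from urllib.robotparser import RobotFileParser

# User agents that the scan reports as fully disallowed.
BLOCKED_AGENTS = [
    "amazonbot", "applebot-extended", "bytespider", "ccbot",
    "claudebot", "google-extended", "gptbot", "meta-externalagent",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the scanned groups (hypothetical helper)."""
    lines = ["User-agent: *", "Allow: /", ""]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Disallow: /*blackhole", "Disallow: /?blackhole", ""]
    # An empty Disallow value places no restriction on the group.
    lines += ["User-agent: *", "Disallow:", ""]
    lines += ["Sitemap: https://mia.org.my/sitemap_index.xml"]
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

# "ExampleBot" is a made-up agent standing in for any crawler not listed above.
for agent in ["GPTBot", "ClaudeBot", "ExampleBot"]:
    print(agent, parser.can_fetch(agent, "https://mia.org.my/"))
```

User-agent matching in `urllib.robotparser` is case-insensitive, so `GPTBot` matches the `gptbot` group. Other parsers may merge the repeated `*` groups differently.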

Other Records

Field	Value
sitemap	https://mia.org.my/sitemap_index.xml

Comments

As a condition of accessing this website, you agree to abide by the following
content signals:
(a) If a Content-Signal = yes, you may collect content for the corresponding
use.
(b) If a Content-Signal = no, you may not collect content for the
corresponding use.
(c) If the website operator does not include a Content-Signal for a
corresponding use, the website operator neither grants nor restricts
permission via Content-Signal with respect to the corresponding use.
The content signals and their meanings are:
search: building a search index and providing search results (e.g., returning
hyperlinks and short excerpts from your website's contents). Search does not
include providing AI-generated search summaries.
ai-input: inputting content into one or more AI models (e.g., retrieval
augmented generation, grounding, or other real-time taking of content for
generative AI search answers).
ai-train: training or fine-tuning AI models.
ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
BEGIN Cloudflare Managed content
END Cloudflare Managed Content
START YOAST BLOCK
---------------------------
---------------------------
END YOAST BLOCK
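
The yes/no/unspecified semantics described in the comments above can be sketched in a few lines of Python. The scan does not show the site's actual `Content-Signal` line, so the example value below is hypothetical, as are the function names `parse_content_signal` and `permission`; the comma-separated `name=yes|no` syntax is an assumption based on the wording of the comments.

```python
def parse_content_signal(value: str) -> dict:
    """Parse a value such as 'search=yes, ai-train=no' into {name: bool}."""
    signals = {}
    for part in value.split(","):
        name, sep, flag = part.strip().partition("=")
        name, flag = name.strip().lower(), flag.strip().lower()
        # Keep only well-formed entries with an explicit yes or no.
        if sep and name and flag in ("yes", "no"):
            signals[name] = flag == "yes"
    return signals


def permission(signals: dict, use: str) -> str:
    """Map a use to the three outcomes (a), (b), and (c) from the comments."""
    if use not in signals:
        return "unspecified"  # (c): neither granted nor restricted
    return "allowed" if signals[use] else "not allowed"  # (a) / (b)


# Hypothetical value, not taken from mia.org.my.
example = parse_content_signal("search=yes, ai-train=no")
for use in ("search", "ai-input", "ai-train"):
    print(use, permission(example, use))
```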

Warnings

`content-signal` is not a known field.
