datajournalism.ntu.edu.tw
robots.txt

Robots Exclusion Standard data for datajournalism.ntu.edu.tw

Resource Scan

Scan Details

Site Domain	datajournalism.ntu.edu.tw
Base Domain	ntu.edu.tw
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2025-07-02T17:04:34+00:00
Next Scan	2025-09-30T17:04:34+00:00
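
The gap between the Last Scan and Next Scan timestamps can be checked with a minimal sketch like the one below. It only compares the two values shown above. The variable names are illustrative, and whether the scanner uses a fixed rescan schedule is an assumption the report does not confirm.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2025-07-02T17:04:34+00:00")
next_scan = datetime.fromisoformat("2025-09-30T17:04:34+00:00")

# Difference between the two timezone-aware timestamps.
interval = next_scan - last_scan
print(interval.days)  # 90
```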

Last Successful Scan

Scanned	2024-08-28T18:14:23+00:00
URL	https://datajournalism.ntu.edu.tw/robots.txt
Domain IPs	74.114.154.18, 74.114.154.22
Response IP	74.114.154.22
Found	Yes
Hash	5d63c3bc30c3a524191e61f41583f085015e6ca6ee262dfbdebc0a8c5da1bc1c
SimHash	eb94d8438486

Groups

*

Rule	Path
Disallow	/random
Disallow	/day
Disallow	/sticky-ad-iframe.html
Disallow	/privacy/consent

Other Records

Field	Value
crawl-delay	1

ccbot

Rule	Path
Disallow	/

sentibot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

imagesiftbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://datajournalism.ntu.edu.tw/sitemap.xml

Comments

Common Crawl's crawler
SentiBot's crawler
Google Bard's crawler
Facebook's crawler
webz.io's crawler
webz.io's crawler
Amazon's crawler
ClaudeBot's crawler
anthropic-ai's crawler
ImageSift's AI crawler
Apple's AI crawler
TurnitinBot crawler
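
To see how the groups recorded in this scan apply to a given crawler, the rules can be fed to Python's standard `urllib.robotparser`. This is a minimal sketch: `ROBOTS_TXT` is a reconstruction from the tables above (the original file's exact layout, capitalization, and comment placement are assumptions), and `ExampleBot` is a hypothetical user agent used only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's Groups and Other Records tables.
# The real file's layout and comments are not reproduced here (assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /random
Disallow: /day
Disallow: /sticky-ad-iframe.html
Disallow: /privacy/consent
Crawl-delay: 1

User-agent: ccbot
Disallow: /

User-agent: sentibot
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: facebookbot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: amazonbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: imagesiftbot
Disallow: /

User-agent: applebot-extended
Disallow: /

User-agent: turnitinbot
Disallow: /

Sitemap: https://datajournalism.ntu.edu.tw/sitemap.xml
"""

BASE = "https://datajournalism.ntu.edu.tw"


def build_parser(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is hypothetical; it matches no named group and falls back to "*".
print(parser.can_fetch("ExampleBot", BASE + "/"))        # True
print(parser.can_fetch("ExampleBot", BASE + "/random"))  # False
print(parser.can_fetch("ccbot", BASE + "/"))             # False
print(parser.crawl_delay("ExampleBot"))                  # 1
print(parser.site_maps())
```

Matching in `urllib.robotparser` is by path prefix, so a rule such as `Disallow: /day` also covers any path that begins with `/day`. Other parsers may resolve groups and precedence differently.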
