headline.com
robots.txt

Robots Exclusion Standard data for headline.com

Scan Details

Site Domain	headline.com
Base Domain	headline.com
Scan Status	Ok
Last Scan	2026-02-18T16:54:39+00:00
Next Scan	2026-03-20T16:54:39+00:00

Last Scan

Scanned	2026-02-18T16:54:39+00:00
URL	https://headline.com/robots.txt
Domain IPs	76.76.21.21
Response IP	76.76.21.21
Found	Yes
Hash	b7bbf8d997209bd6f096cbac58da0a0ec96aa883d30da1097496075816da0e8f
SimHash	785c4868e792

Groups

User-agents	googlebot, bingbot, uptimebot, better-stack, betteruptimebot
Rule	Path
Disallow	(empty)

User-agents	ahrefsbot, semrushbot, mj12bot, majesticseo, dotbot, baiduspider, yandexbot, semrushbot, blexbot, dataforseobot, petalbot, blexbot, bytespider, seznambot, duckduckbot, facebookexternalhit, facebookbot, claudebot, claude-web, gptbot, chatgpt-user, ccbot, cohere-ai, diffbot, anthropic-ai
Rule	Path
Disallow	/

User-agents	*
Rule	Path
Disallow	/api/
Disallow	/asia/*/search
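
To make the effect of the three groups concrete, here is a minimal sketch that feeds a reconstructed robots.txt to Python's standard `urllib.robotparser` and asks which paths each crawler may fetch. The file text is an assumption rebuilt from the groups listed in this scan (the blocked group is abbreviated to three of its agents), and `allowed` and `examplebot` are hypothetical names used only for illustration. As far as I know, the standard-library parser matches paths by plain prefix and does not expand `*` inside a path, so the `/asia/*/search` rule is not exercised here.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rebuilt from the scan's group listing. The real file's
# ordering, comments, and whitespace may differ, and the blocked group
# is shortened to three of its agents.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: uptimebot
User-agent: better-stack
User-agent: betteruptimebot
Disallow:

User-agent: ahrefsbot
User-agent: gptbot
User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /api/
Disallow: /asia/*/search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Hypothetical helper: may `agent` fetch `path` on headline.com?"""
    return parser.can_fetch(agent, "https://headline.com" + path)


# An empty Disallow blocks nothing, so the first group may fetch anything.
print(allowed("googlebot", "/api/"))
# The second group is blocked from the whole site.
print(allowed("gptbot", "/"))
# Any other agent falls through to the `*` group.
print(allowed("examplebot", "/api/users"))
print(allowed("examplebot", "/news"))
```

A crawler named in a specific group follows only that group, which is why `googlebot` is not bound by the `/api/` rule in the `*` group.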

Other Records

Field	Value
sitemap	https://headline.com/sitemap.xml

Comments

Host
✅ ALLOW: Good search engines and monitoring
❌ BLOCK: Aggressive SEO scrapers and bandwidth hogs
✅ ALLOW: Everything else (standard browsers, legitimate crawlers)
Sitemap (for Google & Bing SEO)

Warnings

`host` is not a known field.
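
A warning like this can be reproduced with a small check that compares each field name in a robots.txt against a list of recognised fields. The sketch below is only an illustration: the scanner's own list of known fields is not shown in this report, so `KNOWN_FIELDS` is an assumption, and `unknown_fields`, the sample text, and the `Host` value `example.invalid` are hypothetical.

```python
# Assumption: a plausible set of recognised fields, not the scanner's own list.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt):
    """Return unrecognised field names, lower-cased, in first-seen order."""
    found = []
    for raw in robots_txt.splitlines():
        # Drop comments, then skip lines that are not `field: value` pairs.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical sample; the real Host value is not given in this report.
sample = "User-agent: *\nDisallow: /api/\nHost: example.invalid\n"
print(unknown_fields(sample))
```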
