headline.com
robots.txt
Robots Exclusion Standard data for headline.com
Resource Scan
Scan Details
| Field | Value |
|---|---|
| Site Domain | headline.com |
| Base Domain | headline.com |
| Scan Status | Ok |
| Last Scan | 2026-02-18T16:54:39+00:00 |
| Next Scan | 2026-03-20T16:54:39+00:00 |
Last Scan
| Field | Value |
|---|---|
| Scanned | 2026-02-18T16:54:39+00:00 |
| URL | https://headline.com/robots.txt |
| Domain IPs | 76.76.21.21 |
| Response IP | 76.76.21.21 |
| Found | Yes |
| Hash | b7bbf8d997209bd6f096cbac58da0a0ec96aa883d30da1097496075816da0e8f |
| SimHash | 785c4868e792 |
Groups
ahrefsbot
semrushbot
mj12bot
majesticseo
dotbot
baiduspider
yandexbot
semrushbot
blexbot
dataforseobot
petalbot
blexbot
bytespider
seznambot
duckduckbot
facebookexternalhit
facebookbot
claudebot
claude-web
gptbot
chatgpt-user
ccbot
cohere-ai
diffbot
anthropic-ai
| Rule | Path |
|---|---|
| Disallow | / |
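
The effect of the group above can be checked with a short sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a hypothetical partial reconstruction, not the scanned file itself: it includes only three of the listed user agents, plus the `/api/` rule from the `*` group for contrast. `SomeOtherBot` is a made-up agent name standing in for any crawler that is not listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical partial reconstruction of the scanned file (assumption):
# three of the listed agents share one group that disallows everything.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /api/
"""


def can_fetch(agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# A listed agent is blocked from the whole site.
listed_root = can_fetch("GPTBot", "https://headline.com/")
# An unlisted agent falls back to the `*` group.
unlisted_root = can_fetch("SomeOtherBot", "https://headline.com/")
unlisted_api = can_fetch("SomeOtherBot", "https://headline.com/api/items")
```

Agent names are matched case-insensitively, so `GPTBot` falls into the `gptbot` group. Note that `urllib.robotparser` compares paths by plain prefix, so it is not suitable for checking rules that contain `*` wildcards.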
*
| Rule | Path |
|---|---|
| Disallow | /api/ |
| Disallow | /asia/*/search |
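
The second rule uses a `*` wildcard. A minimal sketch of how such a pattern might be matched, assuming RFC 9309 semantics (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). The function name `rule_matches` and the sample paths are illustrative, not taken from the scan.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches `path`.

    Assumes RFC 9309-style matching: `*` is a wildcard, a trailing `$`
    anchors the end of the path, otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


hit = rule_matches("/asia/*/search", "/asia/japan/search")
hit_with_query = rule_matches("/asia/*/search", "/asia/japan/search?q=news")
miss_no_segment = rule_matches("/asia/*/search", "/asia/search")
miss_other_region = rule_matches("/asia/*/search", "/europe/france/search")
```

Under these assumptions, `/asia/search` is not matched, because the pattern requires a `/` on both sides of the wildcard segment.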
Other Records
| Field | Value |
|---|---|
| sitemap | https://headline.com/sitemap.xml |
Warnings
- `host` is not a known field.
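
A warning like this could be produced by comparing each field name against a list of recognised fields. The sketch below is an assumption about how such a check might work, not the scanner's actual logic: the `KNOWN_FIELDS` set and the `SAMPLE` text (including the value of the `Host` line) are hypothetical.

```python
# Assumed set of recognised fields; the scanner's real list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}

# Hypothetical robots.txt excerpt; the Host value is invented for illustration.
SAMPLE = """\
User-agent: *
Disallow: /api/
Host: headline.com
Sitemap: https://headline.com/sitemap.xml
"""


def unknown_fields(robots_txt: str) -> list[str]:
    """Return field names not in KNOWN_FIELDS, lowercased, in first-seen order."""
    found: list[str] = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


warnings = [f"`{name}` is not a known field." for name in unknown_fields(SAMPLE)]
```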
Comments