miramarflnews.com
robots.txt

Robots Exclusion Standard data for miramarflnews.com

Resource Scan

Scan Details

Site Domain miramarflnews.com
Base Domain miramarflnews.com
Scan Status Ok
Last Scan 2026-04-06T11:49:51+00:00
Next Scan 2026-04-13T11:49:51+00:00
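
The two timestamps above imply the rescan interval. A minimal sketch of that arithmetic, assuming the values are ISO 8601 strings exactly as shown (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2026-04-06T11:49:51+00:00")
next_scan = datetime.fromisoformat("2026-04-13T11:49:51+00:00")

# Subtracting two timezone-aware datetimes yields a timedelta.
interval = next_scan - last_scan
print(interval.days)  # 7
```

For these two values the difference is exactly one week. Whether the scanner always uses a weekly schedule is not stated in the report.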

Last Scan

Scanned 2026-04-06T11:49:51+00:00
URL https://miramarflnews.com/robots.txt
Redirect https://www.miramarflnews.com/robots.txt
Redirect Domain www.miramarflnews.com
Redirect Base miramarflnews.com
Domain IPs 34.85.147.204
Redirect IPs 23.58.144.153, 23.58.144.155, 2600:1413:5000:34::173d:ca65, 2600:1413:5000:34::173d:ca6e
Response IP 23.215.7.4
Found Yes
Hash d21d9bb829d3aee5563f1fa01a54c9c5b6aaf8b01179d1270f6a96b1d217953d
SimHash 6f5b925b4ff4
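
The Hash value is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the report does not name the algorithm. A minimal sketch of fingerprinting a robots.txt body to detect changes between scans, under that assumption (robots_digest and the sample body are hypothetical, not taken from the scanner):

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    SHA-256 is an assumption; the report only shows a 64-character hex value.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative body, not the real file contents.
sample = b"User-agent: *\nDisallow: /search\n"
digest = robots_digest(sample)

# Any byte-level change produces a different digest.
changed = robots_digest(sample + b"Disallow: /web\n")
print(digest != changed)  # True
```

An exact digest only signals that something changed. A similarity hash such as the SimHash field is typically used to judge how much changed, and it is not reproduced here.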

Groups

mediapartners-google

Rule Path
Disallow (empty path; no paths are blocked)

magnetbot

Rule Path
Disallow (empty path; no paths are blocked)

yandex

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

baiduspider

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 2

facebookbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3
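
The throttled groups above carry only a crawl-delay record and no path rules. A minimal sketch of reading such groups with Python's standard urllib.robotparser follows. The raw lines are a reconstruction from the parsed records shown here, not the original file text, and only two of the five groups are included:

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report (assumed syntax, not the original file).
ROBOTS_LINES = """\
User-agent: yandex
Crawl-delay: 3

User-agent: bingbot
Crawl-delay: 2
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# crawl_delay() returns the delay in seconds for the matching group.
bing_delay = parser.crawl_delay("bingbot")
yandex_delay = parser.crawl_delay("yandex")

# A group with no Allow/Disallow rules leaves every path fetchable.
bing_can_fetch = parser.can_fetch("bingbot", "/news/local/")

print(bing_delay, yandex_delay, bing_can_fetch)  # 2 3 True
```

Crawl-delay is not part of RFC 9309, so support varies by crawler. The value is a request from the site rather than something every bot honors.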

anthropic-ai

Rule Path
Disallow /*.html

archive.org_bot

Rule Path
Disallow /*.html

chatgpt-user

Rule Path
Disallow /*.html

claude-searchbot

Rule Path
Disallow /*.html

claude-web

Rule Path
Disallow /*.html

claudebot

Rule Path
Disallow /*.html

google-extended

Rule Path
Disallow /*.html

gptbot

Rule Path
Disallow /*.html

ia_archiver

Rule Path
Disallow /*.html

ia_archiver-web.archive.org

Rule Path
Disallow /*.html

oai-searchbot

Rule Path
Disallow /*.html

perplexity-ai

Rule Path
Disallow /*.html

perplexity-user

Rule Path
Disallow /*.html

perplexitybot

Rule Path
Disallow /*.html

a6-indexer

Rule Path
Disallow /

addsearchbot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

andibot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

bigsur.ai

Rule Path
Disallow /

blp_bbot

Rule Path
Disallow /

brightbot 1.0

Rule Path
Disallow /

buck

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatglm-spider

Rule Path
Disallow /

coccocbot-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

cohere-training-data-crawler

Rule Path
Disallow /

cotoyogi

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

crawlspace

Rule Path
Disallow /

datenbank crawler

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

echobot bot

Rule Path
Disallow /

echoboxbot

Rule Path
Disallow /

ecoresearch

Rule Path
Disallow /

firecrawlagent

Rule Path
Disallow /

grok

Rule Path
Disallow /

grokbot

Rule Path
Disallow /

grokbot/1.0

Rule Path
Disallow /

grok-deepsearch/1.0

Rule Path
Disallow /

genieo

Rule Path
Disallow /

https://hada.news

Rule Path
Disallow /

iaskspider/2.0

Rule Path
Disallow /

icc-crawler

Rule Path
Disallow /

img2dataset

Rule Path
Disallow /

jenkersbot

Rule Path
Disallow /

jetslide

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

kunatocrawler

Rule Path
Disallow /

livelapbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mazbot

Rule Path
Disallow /

mistralai-user

Rule Path
Disallow /

mistralai-user/1.0

Rule Path
Disallow /

mojeek

Rule Path
Disallow /

mojeekbot

Rule Path
Disallow /

mycentralaiscraperbot

Rule Path
Disallow /

news-please

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

novaact

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

pangubot

Rule Path
Disallow /

poseidon research crawler

Rule Path
Disallow /

primalbot

Rule Path
Disallow /

quillbot

Rule Path
Disallow /

quillbot.com

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

sbintuitionsbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

seekr

Rule Path
Disallow /

seekrbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

seznamhomepagecrawler

Rule Path
Disallow /

sidetrade indexer bot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

taragroup intelligent bot

Rule Path
Disallow /

thinkbot

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

verity

Rule Path
Disallow /

verity/1.1

Rule Path
Disallow /

viennatinybot

Rule Path
Disallow /

webvac

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wrtnbot

Rule Path
Disallow /

xai

Rule Path
Disallow /

xai-grok/1.0

Rule Path
Disallow /

yacy

Rule Path
Disallow /

yacybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
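
A crawler that matches one of the named groups above follows that group only. The * groups apply to crawlers that match no named group, and multiple groups for the same user agent are combined. A minimal sketch of that selection step, assuming a simplified exact, case-insensitive match on the product token (GROUPS is a small hypothetical subset of this report, and select_group is an illustrative helper):

```python
# Subset of the groups in this report; the two "*" groups are merged into one.
GROUPS = {
    "gptbot": [("disallow", "/*.html")],
    "ccbot": [("disallow", "/")],
    "*": [("disallow", "/search"), ("disallow", "/sports")],
}


def select_group(product_token: str, groups: dict) -> list:
    """Pick the rules for a crawler: its own group if present, else '*'."""
    key = product_token.lower()  # user-agent matching is case-insensitive
    if key in groups:
        return groups[key]
    return groups.get("*", [])


print(select_group("GPTBot", GROUPS))        # [('disallow', '/*.html')]
print(select_group("SomeOtherBot", GROUPS))  # falls back to the "*" rules
```

Real crawlers may match user-agent tokens more loosely, for example by substring, so this is a simplified model of the selection rule.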

*

Rule Path
Disallow /search
Disallow /web
Disallow /mrcontent
Disallow /webapi-public
Disallow /adperfect
Disallow /*amp.html?*
Disallow *.ece*
Allow *.ece*.pdf
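
Several rules above use the * wildcard, and the Allow *.ece*.pdf rule carves an exception out of Disallow *.ece*. A minimal sketch of evaluating such rules follows. It assumes the longest-match precedence described in RFC 9309, with Allow winning ties, and it treats patterns without a leading slash as written. Individual crawlers may behave differently. The helpers pattern_to_regex and is_allowed are illustrative, and RULES is a subset of the group:

```python
import re

# (allow, pattern) pairs taken from the "*" group above.
RULES = [
    (False, "/search"),
    (False, "/*amp.html?*"),
    (False, "*.ece*"),
    (True, "*.ece*.pdf"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules) -> bool:
    """Apply the longest matching pattern; Allow wins a tie; no match means allowed."""
    best = None  # (pattern length, allow flag)
    for allow, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), allow)
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


print(is_allowed("/news/article123.ece", RULES))             # False
print(is_allowed("/news/article123.ece/report.pdf", RULES))  # True
print(is_allowed("/story/amp.html?x=1", RULES))              # False
print(is_allowed("/news/local/story.html", RULES))           # True
```

Python's urllib.robotparser does not interpret * wildcards in paths, which is why this sketch uses its own matcher.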

*

Rule Path
Disallow /news/mh
Disallow /news/national
Disallow /site-services
Disallow /site-services/newsletters
Disallow /sports
Disallow /sports/high-school
Disallow /video
Disallow /weather

Other Records

Field Value
sitemap https://www.miramarflnews.com/sitemap/story/update.xml
sitemap https://www.miramarflnews.com/sitemap/story/archive.xml
sitemap https://www.miramarflnews.com/sitemap/story/googlenews.xml
sitemap https://www.miramarflnews.com/sitemap/section/update.xml
sitemap https://www.miramarflnews.com/sitemap/section/archive.xml
sitemap https://www.miramarflnews.com/sitemap/topic/archive.xml
sitemap https://www.miramarflnews.com/sitemap/topic/update.xml
sitemap https://www.miramarflnews.com/sitemap/video/archive.xml
sitemap https://www.miramarflnews.com/sitemap/video/update.xml
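
Sitemap records are independent of any user-agent group. A minimal sketch of collecting them with the standard library, assuming Python 3.8 or later for site_maps() (the raw lines are reconstructed from this report and include only two of the nine sitemap entries):

```python
from urllib.robotparser import RobotFileParser

# Reconstructed lines (assumed syntax, not the original file).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /search",
    "",
    "Sitemap: https://www.miramarflnews.com/sitemap/story/update.xml",
    "Sitemap: https://www.miramarflnews.com/sitemap/story/googlenews.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```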

Comments

  • Miramar News content is made available for your personal, non-commercial use subject to our Terms of Service here: https://www.miramarflnews.com/customer-service/terms-of-service/
  • Use of any device, tool, or process designed to data mine or scrape the content using automated means is prohibited without prior written permission from Miramar News. Prohibited uses include but are not limited to:
  • (1) text and data mining activities
  • (2) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
  • (3) creating or providing archived or cached data sets containing our content to others; and/or
  • (4) any commercial purposes.
  • Global allow for Mediapartners - this is used by Google to place ads in content, not indexing purposes.
  • Global allow for the Klangoo bot
  • Throttles to decrease load
  • Story blocks for selected bots
  • Complete blocks of uninteresting bots
  • Prevent bots from wasting crawl budget on template/non-content pages
  • Block indexing of specific sections/content at the market's request
  • Sitemap files