wxii12.com
robots.txt

Robots Exclusion Standard data for wxii12.com

Resource Scan

Scan Details

Site Domain wxii12.com
Base Domain wxii12.com
Scan Status Ok
Last Scan 2024-11-13T17:33:48+00:00
Next Scan 2024-11-20T17:33:48+00:00

Last Scan

Scanned 2024-11-13T17:33:48+00:00
URL https://wxii12.com/robots.txt
Redirect https://www.wxii12.com/robots.txt
Redirect Domain www.wxii12.com
Redirect Base wxii12.com
Domain IPs 151.101.1.55, 151.101.129.55, 151.101.193.55, 151.101.65.55
Redirect IPs 151.101.1.55, 151.101.129.55, 151.101.193.55, 151.101.65.55
Response IP 199.232.45.55
Found Yes
Hash 943b380eca519a724db20d467839dcb9236fc228946e0d2188c3ae9890e6eb5a
SimHash dc52916107b0

Groups

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow

facebot

Rule Path
Disallow

twitterbot

Rule Path
Disallow

trueanthem

Rule Path
Disallow

*

Rule Path
Disallow /ajax/
Disallow /dk/
Disallow /en/
Disallow /es/
Disallow /in/
Disallow /it/
Disallow /jp/
Disallow /ng/
Disallow /nl/
Disallow /no/
Disallow /se/
Disallow /tw/
Disallow /ua/
Disallow /uk/
Disallow /api/
Disallow /landing-feed/
Disallow /oauth/
Disallow /preview/
Disallow /search-fetch/
Disallow /search/
Disallow /transporter/
Disallow /app/

ahrefsbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

awariorssbot
awariosmartbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

amazon-qbusiness

Rule Path
Disallow /
Allow /ajax/content-product/
Allow /en/sitemaps/
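
The groups above can be checked programmatically. What follows is a minimal sketch, assuming Python's standard urllib.robotparser module and a hand-reconstructed excerpt of three of the listed groups (gptbot, twitterbot, and *). The excerpt text, the allowed() helper, the ExampleBot agent name, and the sample paths are illustrative assumptions and are not copied from the live file. Note that an empty Disallow value, as in the twitterbot group, places no restriction on that agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt rebuilt from three of the groups listed above;
# it is NOT the full robots.txt served by the site.
ROBOTS_EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: twitterbot
Disallow:

User-agent: *
Disallow: /api/
Disallow: /search/
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network request is made.
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the excerpt's rules."""
    return parser.can_fetch(agent, "https://www.wxii12.com" + path)


if __name__ == "__main__":
    # Sample agent/path pairs chosen for illustration only.
    checks = [
        ("GPTBot", "/"),
        ("Twitterbot", "/api/status"),
        ("ExampleBot", "/api/status"),
        ("ExampleBot", "/news/"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
```

Agents without a group of their own fall through to the * group, so only the listed path prefixes are closed to them. The sketch deliberately leaves out the amazon-qbusiness group: urllib.robotparser applies rules in file order rather than by longest match, so a group that mixes Disallow / with narrower Allow rules may be evaluated differently by other parsers.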

Other Records

Field Value
crawl-delay 10
sitemap https://www.wxii12.com/sitemap_index.xml
sitemap https://www.wxii12.com/sitemap_google_news.xml

Comments

  • Hearst Television Inc. content is made available for your personal, non-commercial use subject to our Terms of Service here: https://www.hearst.com/-/tv-terms-of-use
  • You may not either directly or through the use of a device or other means copy, download, stream, reproduce, duplicate, archive, distribute, upload, publish, modify, translate, broadcast, perform, display, sell, transmit or retransmit the website or content without prior written permission from Hearst Television Inc.
  • You may also not use any software robots, spider, crawlers, or other data gathering or extraction tools, whether automated or manual, to access, acquire, copy, monitor, scrape or aggregate the content or website or any portion thereof.
  • You may not knowingly or intentionally take any action that may impose an unreasonable burden or load on the website or its servers and infrastructures.
  • Prohibited uses include but are not limited to:
    (1) text and data mining activities under Art. 4 of the EU Directive on Copyright in the Digital Single Market;
    (2) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
    (3) creating or providing archived or cached data sets containing our content to others; and/or
    (4) any commercial purposes.
  • Contact htvdigitalopsleads at hearst dot com for assistance.