government.economictimes.indiatimes.com
robots.txt

Robots Exclusion Standard data for government.economictimes.indiatimes.com

Resource Scan

Scan Details

Site Domain government.economictimes.indiatimes.com
Base Domain indiatimes.com
Scan Status Ok
Last Scan 2025-03-01T09:00:42+00:00
Next Scan 2025-03-15T09:00:42+00:00

Last Scan

Scanned 2025-03-01T09:00:42+00:00
URL https://government.economictimes.indiatimes.com/robots.txt
Domain IPs 23.45.207.69, 23.45.207.84, 2600:1413:b000:1c::17d1:2edb, 2600:1413:b000:1c::17d1:2ede
Response IP 184.50.85.132
Found Yes
Hash 6ad88720e33628e1ebabc064f21763428a5cd7f8c6d3cf7f40853e75159110a9
SimHash c234c04b8394
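
The Hash value is 64 hexadecimal digits, which is the length of a SHA-256 digest. The report does not name the algorithm or say exactly which bytes are hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The function name content_hash is hypothetical.

```python
import hashlib

def content_hash(body: bytes) -> str:
    """Return the SHA-256 digest of a robots.txt body as lowercase hex.

    Assumption: the scan's Hash field is SHA-256 of the raw response bytes.
    """
    return hashlib.sha256(body).hexdigest()

# Comparing digests between two scans is a cheap way to detect any change.
previous = content_hash(b"User-agent: *\nDisallow: /opt/\n")
current = content_hash(b"User-agent: *\nDisallow: /opt/\nDisallow: /search/\n")
changed = previous != current
```

An exact hash changes on any byte difference, which is presumably why the report also lists a SimHash for detecting near-duplicates.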

Groups

*

Rule Path
Allow /
Disallow /web/Themes/Release/templates/
Disallow /l.php*
Disallow /pl.php*
Disallow /7176/
Disallow /web/Themes/Beta/
Disallow /opt/
Disallow *//
Disallow /whatsapp%3A//*
Disallow /*unsubscribelink
Disallow /search/*
Disallow /news/topics/*
Disallow /news/topic/*
Disallow /sport-brand-stories
Disallow /ajax_files/*
Disallow /tag/*/photos
Disallow /tag/*/videos
Disallow /tag/*/blogs
Disallow /tag/*/news
Disallow /subscription_strip.php
Disallow /event-schema.php*
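
The * group combines a blanket Allow / with wildcard Disallow paths, so the result for a given URL depends on how a crawler resolves overlapping rules. The following is a minimal sketch, assuming RFC 9309 semantics: the longest matching pattern wins, Allow wins ties, * matches any run of characters, and a trailing $ anchors the end of the path. Individual crawlers may behave differently. The helper names pattern_to_regex and is_allowed are hypothetical, pattern length is counted in characters rather than percent-encoded octets, and only a subset of the rules above is included.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with '.*' for each '*' wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """rules is a list of (kind, pattern); kind is 'allow' or 'disallow'."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            n = len(pattern)
            if n > best_len or (n == best_len and kind == "allow"):
                best_len, allowed = n, (kind == "allow")
    return allowed

# Subset of the * group listed above.
RULES = [
    ("allow", "/"),
    ("disallow", "/search/*"),
    ("disallow", "/tag/*/photos"),
    ("disallow", "/*unsubscribelink"),
    ("disallow", "*//"),
]

print(is_allowed(RULES, "/news/governance/article"))  # True: only "/" matches
print(is_allowed(RULES, "/search/budget"))            # False: "/search/*" is longer than "/"
print(is_allowed(RULES, "/tag/gst/photos"))           # False: wildcard in the middle matches
```

Under these assumptions, Allow / never overrides a more specific Disallow, because every Disallow pattern in the group is longer than one character.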

googlebot-news

Rule Path
Disallow /jobs/
Disallow /widget/
Disallow /etanalytics/
Disallow /event-schema/

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

twitterbot

Rule Path
Disallow /
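
Each group above applies only to the crawler it names; a crawler that has its own group does not also inherit the * rules. The sketch below illustrates this with Python's standard urllib.robotparser on a small reconstruction of three of the groups. The robots.txt text is an assumption rebuilt from the tables above, not the fetched file, and wildcard suffixes are dropped because the standard-library parser does plain prefix matching. SomeOtherBot is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of three groups from the scan above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /opt/

User-agent: googlebot-news
Disallow: /jobs/
Disallow: /widget/

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://government.economictimes.indiatimes.com"

# gptbot has its own group with "Disallow: /", so nothing is fetchable.
gptbot_news = parser.can_fetch("GPTBot", BASE + "/news/some-article")

# googlebot-news uses only its own group, so the * rule for /search/ does not apply.
gnews_search = parser.can_fetch("Googlebot-News", BASE + "/search/budget")
gnews_jobs = parser.can_fetch("Googlebot-News", BASE + "/jobs/listing")

# A crawler without a dedicated group falls back to the * group.
other_search = parser.can_fetch("SomeOtherBot", BASE + "/search/budget")
```

In this sketch gptbot_news and gnews_jobs are False, gnews_search is True, and other_search is False.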

Other Records

Field Value
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_index.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_news.xml
sitemap https://government.economictimes.indiatimes.com/jcms-sitemaps/government/monthly/sitemap-index.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_videos.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_microsite_events.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_categories.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_webcast.xml
sitemap https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_contest.xml
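
Sitemap records are independent of the user-agent groups, and a sitemap line that is commented out in the file is not reported as a record. A minimal sketch of extracting them with Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later); the robots.txt text below is a hypothetical excerpt assembled from this report, not the fetched file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: two live sitemap records and one commented-out line.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_index.xml
Sitemap: https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_news.xml
# Sitemap: https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_weekly.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the Sitemap URLs in file order, or None if there are none.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```

Only the two uncommented URLs are returned; the weekly sitemap is skipped because everything after # is treated as a comment.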

Comments

  • Apple Bot - collects website data for its Siri and Spotlight services.
  • Claude Bot run by Anthropic
  • Cohere AI Bot - unconfirmed bot believed to be associated with Cohere's chatbot.
  • Diffbot - somewhat dishonest scraping bot used to collect data to train LLMs.
  • ImagesiftBot is billed as a reverse image search tool, but it's associated with The Hive, a company that produces models for image generation.
  • KUKA's youBot
  • Perplexity AI
  • Twitter's bot used to index the content of any given URL
  • Sitemap: https://government.economictimes.indiatimes.com/files/sitemaps/government_sitemap_weekly.xml