newkerala.com
robots.txt

Robots Exclusion Standard data for newkerala.com

Resource Scan

Scan Details

Site Domain newkerala.com
Base Domain newkerala.com
Scan Status Ok
Last Scan 2026-01-30T17:27:20+00:00
Next Scan 2026-01-31T17:27:20+00:00

Last Scan

Scanned 2026-01-30T17:27:20+00:00
URL https://www.newkerala.com/robots.txt
Domain IPs 51.222.31.128
Response IP 51.222.31.128
Found Yes
Hash 4397fd4acb952468d5fe207d545d08ed92b920346397b065dc1948d7e2ac3c72
SimHash 281e5e1166f0
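
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not state which algorithm or input is used, so the sketch below simply assumes SHA-256 over the raw response body. It illustrates how a scanner might detect that a robots.txt file changed between two scans. The function names `content_hash` and `has_changed` and the sample bodies are hypothetical.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return content_hash(body) != previous_hash


# Illustrative bodies only; these are not the real newkerala.com file.
old_body = b"User-agent: *\nDisallow: /admin/\n"
new_body = b"User-agent: *\nDisallow: /admin/\nDisallow: /api/\n"

stored = content_hash(old_body)
print(len(stored))                    # 64 hex characters
print(has_changed(stored, old_body))  # False
print(has_changed(stored, new_body))  # True
```

An exact digest only answers "identical or not". A similarity fingerprint such as the SimHash field is typically used when near-duplicate detection is wanted instead.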

Groups

googlebot
googlebot-image
googlebot-news
googlebot-video
mediapartners-google
adsbot-google

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

bingbot
bingpreview
msnbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

duckduckbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 3

applebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 3

facebookexternalhit
twitterbot
linkedinbot
pinterestbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 4

chatgptbot
claudebot
perplexitybot

Rule Path
Allow /
Disallow /cgi-bin/
Disallow /tmp/
Disallow /junk/
Disallow /classic/
Disallow /support/
Disallow /cron/
Disallow /admin/
Disallow /includes/
Disallow /logs/
Disallow /api/
Disallow /stats/
Disallow /tracking/
Disallow /private/
Disallow /devo/nm/

Other Records

Field Value
crawl-delay 10
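
This group combines Allow / with a list of Disallow paths, so the outcome for a given URL depends on how the rules are ranked. A minimal sketch follows, assuming the longest-match evaluation described in RFC 9309 (the most specific matching path wins, Allow wins a tie, and no match means allowed). It uses plain prefix matching with no wildcard support, and `is_allowed` is a hypothetical helper, not part of any library.

```python
from typing import List, Tuple

# Rules transcribed from the chatgptbot / claudebot / perplexitybot group.
AI_GROUP_RULES: List[Tuple[str, str]] = [
    ("allow", "/"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/tmp/"),
    ("disallow", "/junk/"),
    ("disallow", "/classic/"),
    ("disallow", "/support/"),
    ("disallow", "/cron/"),
    ("disallow", "/admin/"),
    ("disallow", "/includes/"),
    ("disallow", "/logs/"),
    ("disallow", "/api/"),
    ("disallow", "/stats/"),
    ("disallow", "/tracking/"),
    ("disallow", "/private/"),
    ("disallow", "/devo/nm/"),
]


def is_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """Longest-match evaluation: Allow wins a tie, no match means allowed."""
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


print(is_allowed("/news/item.html", AI_GROUP_RULES))  # True
print(is_allowed("/admin/login", AI_GROUP_RULES))     # False
print(is_allowed("/devo/page", AI_GROUP_RULES))       # True
```

Parsers that instead apply the first matching rule in file order can give a different answer when Allow / is listed before the Disallow lines, so the result depends on the crawler's implementation.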

ahrefsbot
semrushbot
mj12bot
dotbot
blexbot
screamingfrog
ccbot
exabot
httrack
python-requests
wget
rogerbot
any other known scrapers

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /tmp/
Disallow /junk/
Disallow /classic/
Disallow /support/
Disallow /cron/
Disallow /admin/
Disallow /includes/
Disallow /logs/
Disallow /api/
Disallow /stats/
Disallow /tracking/
Disallow /private/
Disallow /devo/nm/
Disallow /news/o/

Other Records

Field Value
crawl-delay 10
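
The per-group crawl-delay values can be read programmatically with Python's standard-library `urllib.robotparser`. The robots.txt text in this sketch is a partial reconstruction from the groups listed in this report, not the original file; its exact layout and ordering are assumptions.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the groups listed in this report;
# the real file's exact layout and ordering are not known.
ROBOTS_TXT = """\
User-agent: googlebot
Allow: /
Crawl-delay: 2

User-agent: duckduckbot
Allow: /
Crawl-delay: 3

User-agent: *
Disallow: /admin/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler with no dedicated group falls back to the "*" group.
for agent in ("Googlebot", "DuckDuckBot", "SomeUnknownBot"):
    print(agent, parser.crawl_delay(agent))
```

Crawl-delay is a non-standard extension to the Robots Exclusion Protocol, so whether a given crawler honors the value varies.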

Other Records

Field Value
sitemap https://www.newkerala.com/sitemap.xml
sitemap https://www.newkerala.com/world-time-now/sitemap.xml
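
The sitemap records apply to the whole file rather than to a single user-agent group. A minimal sketch of extracting them with `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later) is shown here. The surrounding robots.txt text is an assumed, shortened reconstruction; only the two sitemap URLs come from this report.

```python
from urllib.robotparser import RobotFileParser

# Shortened, assumed layout; only the Sitemap URLs are taken from the report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.newkerala.com/sitemap.xml
Sitemap: https://www.newkerala.com/world-time-now/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap lines are present.
for url in parser.site_maps() or []:
    print(url)
```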

Comments

  • Robots.txt file for newkerala.com
  • Last updated: 2025-10-01
  • Sitemap for all crawlers
  • -----------------------------------------
  • SEARCH ENGINE CRAWLERS - FULL ACCESS
  • -----------------------------------------
  • -----------------------------------------
  • SOCIAL MEDIA CRAWLERS - ALLOW FOR SHARING
  • -----------------------------------------
  • -----------------------------------------
  • TRUSTED AI AND RESEARCH CRAWLERS
  • -----------------------------------------
  • Whitelisted to safely reference content without overloading server
  • -----------------------------------------
  • BLOCK ABUSIVE AI AND SCRAPING BOTS
  • -----------------------------------------
  • -----------------------------------------
  • CRAWL DELAY FOR UNKNOWN BOTS
  • -----------------------------------------