Robots Exclusion Standard data for smartkarma.com

Resource Scan

Scan Details

Site Domain smartkarma.com
Base Domain smartkarma.com
Scan Status Ok
Last Scan 2025-04-04T12:58:40+00:00
Next Scan 2025-04-18T12:58:40+00:00

Last Scan

Scanned 2025-04-04T12:58:40+00:00
URL https://smartkarma.com/robots.txt
Redirect https://www.smartkarma.com/robots.txt
Redirect Domain www.smartkarma.com
Redirect Base smartkarma.com
Domain IPs 104.26.10.188, 104.26.11.188, 172.67.70.150, 2606:4700:20::681a:abc, 2606:4700:20::681a:bbc, 2606:4700:20::ac43:4696
Redirect IPs 104.26.10.188, 104.26.11.188, 172.67.70.150, 2606:4700:20::681a:abc, 2606:4700:20::681a:bbc, 2606:4700:20::ac43:4696
Response IP 104.26.11.188
Found Yes
Hash e605bb95e4d8ba2213dbc3a09ba13ee2ec17234dfaf6b2af8918f2ed1c7b11d9
SimHash a0149d841694
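The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched robots.txt body; assuming that is what the scanner records, a change in the file between scans can be detected by recomputing the digest. This is a hypothetical sketch (`body_hash` is an illustrative name, not part of any scanner):

```python
import hashlib

def body_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Comparing this value against the digest stored at the previous
    scan tells you whether the file changed between scans.
    """
    return hashlib.sha256(body).hexdigest()

# Any byte-for-byte change to the body yields a different digest.
sample = b"User-agent: *\nDisallow: /directories\n"
digest = body_hash(sample)  # 64 hex characters
```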

Groups

*

Rule Path
Disallow /directories
Disallow /entities/*/attachments
Disallow /entities/*/exchange-announcements
Disallow /entities/*/images
Disallow /entities/*/press-releases
Disallow /insight-providers
Disallow /insight-provider-directory
Disallow /insight-provider-directory/*
Disallow /lab
Disallow /settings/mute
Disallow /start/analytics
Disallow /start/insights
Disallow /statistics
Disallow /watchlist
Allow /tools/holdcos$
Allow /tools/ipos
Allow /tools/mna$
Allow /tools/smart-score-screener
Disallow /activities
Disallow /analytics
Disallow /compose
Disallow /entities/*/discussion
Disallow /entities/*/locker
Disallow /embed
Disallow /locker
Disallow /logout
Disallow /insights/*/discussion
Disallow /me
Disallow /messages
Disallow /messifications
Disallow /profiles/*/discussion
Disallow /notifications
Disallow /onboarding
Disallow /premium-services
Disallow /reset-password
Disallow /settings
Disallow /start
Disallow /tools/*
Disallow /watchlists
Disallow /api/*
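The group above mixes broad Disallow patterns (e.g. `/tools/*`) with narrower Allow exceptions (`/tools/holdcos$`, `/tools/ipos`), so evaluation order matters. Under the widely used longest-match semantics (the most specific matching rule wins, and Allow wins ties), checking a path against a subset of these rules can be sketched as below; `rule_to_regex` and `is_allowed` are illustrative names, and Python's stdlib `urllib.robotparser` is not used here because it does not implement `*`/`$` wildcard matching:

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern to a regex.

    '*' matches any character sequence; a trailing '$' anchors the
    pattern to the end of the path (as in 'Allow: /tools/holdcos$').
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest-match evaluation: the most specific matching rule
    decides; on a length tie, Allow wins; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if rule_to_regex(pattern).match(path):
            spec, allow = len(pattern), (kind == "Allow")
            if best is None or spec > best[0] or (spec == best[0] and allow):
                best = (spec, allow)
    return True if best is None else best[1]

# A subset of the group above, enough to show the interaction.
rules = [
    ("Disallow", "/tools/*"),
    ("Allow", "/tools/holdcos$"),
    ("Allow", "/tools/ipos"),
]
```

With these rules, `/tools/holdcos` is allowed (the 15-character Allow beats the 8-character `Disallow /tools/*`), while an arbitrary path such as `/tools/screener2` matches only the wildcard Disallow and is blocked.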

Other Records

Field Value
sitemap https://www.smartkarma.com/sitemap.xml

Comments

  • This will be renamed to robots.txt when deployed to production environment.
  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *
  • Disallow: /
  • prevent deprecated routes from being crawled
  • allow the holdco, ipo and m&a list
  • prevent private url from being crawled