insight.rpxcorp.com
robots.txt

Robots Exclusion Standard data for insight.rpxcorp.com

Resource Scan

Scan Details

Site Domain insight.rpxcorp.com
Base Domain rpxcorp.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a server error.
Last Scan 2025-05-05T06:53:11+00:00
Next Scan 2025-08-03T06:53:11+00:00

Last Successful Scan

Scanned 2024-03-19T06:50:40+00:00
URL https://insight.rpxcorp.com/robots.txt
Domain IPs 35.186.221.148
Response IP 35.186.221.148
Found Yes
Hash c28092e7f87df08a4b558aea7af91867e9184fd1f26d30e70dc9852b2312e7db
SimHash 1a0d0c8dc2c0

Groups

laboraybot
yandexbot
semrushbot
barkrowler
petalbot
mj12bot
dotbot
turnitinbot

Rule Path
Disallow /

*

Rule Path
Disallow /admin/
Disallow /users/
Disallow /public_login
Disallow /public_signup
Disallow /broker_login
Disallow /portal_login
Disallow /dashboard/
Disallow /analytics/
Disallow /community/
Disallow /visual_analytics
Disallow /analytics/
Disallow /autocomplete/
Disallow /payments/
Disallow /insurance/
Disallow /alerts/
Disallow /rpx_reports/
Disallow /lca_reports/
Disallow /personalizations/
Disallow /rails/
Disallow /lawfirm/
Disallow /advanced_search/search_litigations*?*
Disallow /advanced_search/search_patents*?*
Disallow /advanced_search/search_entities*?*
Disallow /advanced_search/search_judges*?*
Disallow /advanced_search/search_lawfirms*?*
Disallow /advanced_search/search_venues*?*
Disallow /advanced_search/search_all*?*
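
The two groups above can be checked programmatically. The following is a minimal sketch, not the site's or the scanner's own tooling. It assumes the raw file looks like the reconstruction in ROBOTS_TXT (the report lists parsed groups, not the original text, and only a few of the "*" group's rules are copied here). "ExampleBot" and the rule_matches helper are hypothetical names introduced for illustration. Python's urllib.robotparser does plain prefix matching, so the wildcard rules under /advanced_search/ are checked with a small separate matcher that treats "*" as "any run of characters".

```python
import re
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the file from the groups listed in the report
# (abbreviated: only a few of the "*" group's rules are included).
ROBOTS_TXT = """\
User-agent: laboraybot
User-agent: yandexbot
User-agent: semrushbot
User-agent: barkrowler
User-agent: petalbot
User-agent: mj12bot
User-agent: dotbot
User-agent: turnitinbot
Disallow: /

User-agent: *
Disallow: /admin/
Disallow: /users/
Disallow: /dashboard/
Disallow: /advanced_search/search_patents*?*
"""

BASE = "https://insight.rpxcorp.com"

# Parse from memory instead of fetching, since the last scan reported a server error.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: match a rule path where '*' stands for any run of
    characters and a trailing '$' anchors the end of the URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives prefix semantics.
    return re.match(regex, path) is not None


# Named crawlers are blocked everywhere; other agents only on the listed paths.
print(parser.can_fetch("mj12bot", BASE + "/"))
print(parser.can_fetch("ExampleBot", BASE + "/"))
print(parser.can_fetch("ExampleBot", BASE + "/admin/settings"))

# Wildcard rule: blocked only when the search URL carries a query string.
rule = "/advanced_search/search_patents*?*"
print(rule_matches(rule, "/advanced_search/search_patents?q=widget"))
print(rule_matches(rule, "/advanced_search/search_patents"))
```

Under these assumptions, the named crawlers are denied every path, other agents are denied only the listed prefixes, and the advanced-search rules apply only to URLs that include a "?" query string.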

Other Records

Field Value
sitemap https://insight.rpxcorp.com/sitemap.xml.gz

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines: