Why some data companies struggle to sell to the financial markets
The data deluge: Navigating the challenges of selling valuable datasets to the financial industry
Welcome to the Data Score newsletter, your go-to source for insights into the world of data-driven decision-making. Whether you're an insight seeker, a unique data company, a software-as-a-service provider, or an investor, this newsletter is for you. I'm Jason DeRise, a seasoned expert in the field of alternative data insights. As one of the first 10 members of UBS Evidence Lab, I was at the forefront of pioneering new ways to generate actionable insights from data. Before that, I successfully built a sell-side equity research franchise based on proprietary data and non-consensus insights. Through my extensive experience as a purchaser and creator of data, I have gained a unique perspective that allows me to collaborate with end-users to generate meaningful insights.
The volume of data available for analysis is still growing rapidly. For example, Amass Insights has curated a list of more than 21,000 potential data providers, and the list is still growing: https://amassinsights.com/#/unproviders. Also consider Matt Turck’s 2023 vs 2013 comparison of the whole ecosystem: https://www.linkedin.com/posts/turck_ml-ai-data-2012-vs-2023-activity-7035302655446933505-qS6u?utm_source=share&utm_medium=member_desktop
Despite the prevalence of data, and the broad agreement on the importance of having access to it, selling data is not easy. Sure, the largest asset managers have invested heavily in data acquisition, data engineering, and data science. For valuable datasets, there is definitely a worthy end market willing to pay for data, software and services. However, it’s not appropriate to extrapolate success in selling to the biggest funds to the rest of the asset management industry.
There’s a steep decay curve between the best funds and the rest of the market in the ability to ingest data and extract insight. While some of the biggest and best asset managers are comfortable working with data products that are hard to use (and in some cases they prefer that the data be hard to use, because it creates barriers to others accessing the dataset’s alpha[1] generation potential), the rest of the market for data has not yet invested the time or resources to dig into datasets that have low “off-the-shelf” product-market fit.
One of the main barriers for data providers to sell their proprietary data and tools is that data companies do not understand the outcomes needed by financial market users. The misunderstanding of the jobs to be done by financial market clients leads to misaligned products that require the client to do more wrangling, cleansing and enrichment on top of the data to extract value. This means hard-to-use datasets increase the likelihood that potential clients give up on them, even if there is alpha generation potential.
In a world where there’s more than 21,000 potential data sources to leverage and very specific demands on asset managers to generate alpha, on top of limited resources to process data, potential customers will simply move on to the next dataset if it’s just too hard to find value.
Some examples of valuable table stakes features that are unfortunately missed too often include:
The dataset needs to be easily joinable with traditional financial market data and other alternative datasets
Use standardized entity symbology
Use standardized date conventions that are easily used across applications
Use standardized geographic naming conventions
Easy to understand data schemas with detailed support documentation
Be direct with naming conventions for products, metrics, schemas, API endpoints, etc.
Easy to understand field names instead of database-friendly names
Avoid jargon where possible[2]
Distribution flexibility
Not all clients have the same ability to access data and generate insights
Some want Excel, some want interactive dashboards, some want programmatic access, some want marketplace access, some just want the insights in a report
From a data provider’s point of view, offering all of these distribution options improves reach across all personas
Manage revisions and data gaps transparently
Timestamp each data point’s release so clients can see the initial value and any later revisions to the history
Manage versioning to support clients through code breaking changes
Provide clear identification of data gaps or interpolated data
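As a rough illustration of the features listed above, here is a minimal Python sketch of a dataset record that uses standardized join keys, timestamps every release, and flags interpolated values. All names (`Observation`, `as_of`, `join_with_prices`), the ticker "ACME", and every number are hypothetical and chosen only for illustration. It is one possible way to model this, not a standard schema.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class Observation:
    ticker: str             # standardized entity symbol (hypothetical ticker)
    period_end: date        # ISO 8601 date the metric refers to
    country: str            # ISO 3166-1 alpha-2 country code
    value: float
    published_at: datetime  # when this version of the data point was released (UTC)
    is_interpolated: bool = False  # True if the provider filled a data gap


def as_of(observations, cutoff):
    """Return the latest version of each data point that was known at `cutoff`."""
    latest = {}
    for obs in sorted(observations, key=lambda o: o.published_at):
        if obs.published_at <= cutoff:
            # Later releases overwrite earlier ones for the same key.
            latest[(obs.ticker, obs.period_end)] = obs
    return latest


def join_with_prices(view, prices):
    """Inner-join a point-in-time view to price data on (ticker, date)."""
    return {key: (obs.value, prices[key]) for key, obs in view.items() if key in prices}


utc = timezone.utc
history = [
    # Initial release of the March data point.
    Observation("ACME", date(2023, 3, 31), "US", 100.0, datetime(2023, 4, 5, tzinfo=utc)),
    # Revision of the same data point, published later.
    Observation("ACME", date(2023, 3, 31), "US", 104.0, datetime(2023, 4, 20, tzinfo=utc)),
    # April value that the provider interpolated to fill a gap.
    Observation("ACME", date(2023, 4, 30), "US", 108.0, datetime(2023, 5, 4, tzinfo=utc), True),
]
prices = {("ACME", date(2023, 3, 31)): 25.10, ("ACME", date(2023, 4, 30)): 26.40}

early = as_of(history, datetime(2023, 4, 10, tzinfo=utc))
late = as_of(history, datetime(2023, 5, 10, tzinfo=utc))
print(early[("ACME", date(2023, 3, 31))].value)  # value as first released
print(late[("ACME", date(2023, 3, 31))].value)   # value after the revision
print(join_with_prices(late, prices))
```

Because every release carries its own timestamp, a client can rebuild exactly what was known on any past date, which is what a backtest needs. The shared ticker and date keys keep the join to price data to a single line.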
Hey, fellow data insight seekers, feel free to comment with more table stakes, must-have features that you have seen missing from datasets in the past. The newsletter is just the start of the conversation.
It takes more than table stakes to succeed: data companies need to understand the financial market outcomes their data supports
However, even if the data company gets these basics right, there are still too many disconnects between the needs of financial market end users and the products and services that data companies provide. There is a fundamental misunderstanding of how financial markets synthesize new data points into the share price.
For example, experienced market professionals can easily answer these questions, but data professionals are typically baffled by them:
Why does the share price sometimes fall even after earnings results show a beat vs consensus[3] estimates?
Why do low quality, poorly run businesses sometimes outperform the market for periods of time?
Why do financial professionals care about specific data points intensely today, but then later say it’s no longer that important to them?
Often, an inappropriate heuristic about what the “average” investment professional needs from data is used to create data products. There are big differences in the needs of long/short equity hedge funds[4], systematic[5], long only[6], macro[7], and alternative investment funds[8], to name a few. Furthermore, each asset manager within these cohorts likely has a different investment style and performance benchmark, with different time frames for achieving alpha and managing risk. And within each organization there are different data team structures, technologies and processes for generating accurate, non-consensus views. The needs of a fund with a centralized data scouting role but decentralized analytics differ from those of a fund with a fully centralized data practice, and differ again from those of a fund with fully decentralized sourcing of data. This leads to different data purchaser personas, but most data providers incorrectly create products for “the average user”.
Selling data to asset managers with a decentralized structure for generating data insights requires a different approach than selling to those with a centralized data team.
An appropriate strategy for selling to financial markets requires the data company to work backward from the outcomes its clients need, and to let those outcomes shape its data products and go-to-market strategy.
In future Data Score newsletters, we’ll explore the process of working backward from financial market outcomes. We will dig into specific examples of financial market needs and work backward to specific datasets and how they are used.
(Hey, fellow data creators and SaaS creators, feel free to suggest specific financial market investment debates you’d like demystified as an example of how data is used to achieve investing outcomes. Keep suggesting dataset types to explore; it will help me prioritize the highest impact areas of focus. The newsletter is just the start of the conversation.)
- Jason DeRise, CFA
[1] Alpha: A simple way to think about alpha is that it’s a measure of the outperformance of a portfolio compared to a pre-defined benchmark for performance. Investopedia has a lot more detail: https://www.investopedia.com/terms/a/alpha.asp
[2] How am I doing on limiting the jargon use? After years of talking in financial, data and tech jargon, I could use some help seeing it in real, regular language!
[3] Consensus: “The Consensus” is the average view of the sell-side for a specific financial measure. Typically it refers to Revenue or Earnings Per Share (EPS), but it can be any financial measure. It is used as a benchmark for what is currently factored into the share price and for assessing if new results/news are better or worse than expected. However, it is important to know that sometimes there’s an unstated buyside consensus that is the better benchmark for expectations.
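To make the consensus definition concrete, here is a small Python sketch of how the same reported number can be a beat against the sell-side consensus and a miss against an unstated buyside expectation. The estimates, the buyside figure, and the `surprise_pct` helper are all hypothetical assumptions for illustration, not real data or a standard formula from any vendor.

```python
from statistics import mean


def surprise_pct(actual, expected):
    """Percentage by which a reported figure beats (+) or misses (-) expectations."""
    return (actual - expected) / abs(expected) * 100


# Hypothetical numbers, for illustration only.
sell_side_eps_estimates = [1.00, 1.05, 0.95, 1.00]
sell_side_consensus = mean(sell_side_eps_estimates)  # average sell-side view
buy_side_expectation = 1.10  # assumed unstated buyside view
reported_eps = 1.04

vs_sell_side = surprise_pct(reported_eps, sell_side_consensus)
vs_buy_side = surprise_pct(reported_eps, buy_side_expectation)

print(f"vs sell-side consensus:  {vs_sell_side:+.1f}%")
print(f"vs buyside expectation: {vs_buy_side:+.1f}%")
```

In this made-up case the result is positive against the published consensus and negative against the buyside expectation, which is one way a share price can fall after an apparent beat.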
[4] Long/Short Equity Hedge Fund: Long/Short Equity funds buy positions (long) in stocks they believe will go up in value and sell short stocks (short) that they believe will go down in value. Typically, there is a risk management overlay that pairs the long and short positions to be “market neutral”, meaning it doesn’t matter if the market goes up or down; what matters is that the long position outperforms the short position. Short selling, as a simplistic definition, is when an investor borrows stock from an investor who owns it, and then sells the stock. The short seller will eventually need to buy back the stock at a later date to return it to the owner of the stock (and will profit if they buy back the stock at a lower price than they sold it for).
[5] Systematic Fund: Systematic refers to a quantitative (quant) approach to portfolio allocation based on advanced statistical models and machine learning (with varying degrees of human involvement “in the loop” or “on the loop” managing the programmatic decision making).
[6] Long Only Fund: These are funds that only buy investment positions and do not take short positions (see long/short above).
[7] Macro Fund: As a simplified summary, macro funds follow an investing strategy based on global macroeconomic views, typically executed in a portfolio by investing across entire asset classes like fixed income, currencies, derivatives and equities. They are not focused on bottom-up, company-specific investing choices.
[8] Alternative Investment Fund: A broad classification for institutional investors who are focused on private equity and venture capital investments, but it can include other types of non-traditional equity and fixed income market investments too.
One of the interesting questions to grapple with for both providers and fundamental investors is the progress towards nowcasting / more granularity than the regular 10-K/10-Q. A big uncertainty is what the research stack / process looks like now, and how it needs to evolve in a world where business performance is just a constant known. Most investors aren't trained and equipped for this type of environment and it's likely to be disruptive.
How might the research process look in 5 years, and what skills will make the difference?