The Paradox of Finding Surprising Insights in Alternative Data
It’s the most exciting moment in alternative data—finding something very surprising that’s not priced into consensus! But then the fear kicks in: “What if it’s not accurate?”
Welcome to the Data Score newsletter, your go-to source for insights into the world of data-driven decision-making. Whether you're an insight seeker, a unique data company, a software-as-a-service provider, or an investor, this newsletter is for you. I'm Jason DeRise, a seasoned expert in the field of alternative data insights. As one of the first 10 members of UBS Evidence Lab, I was at the forefront of pioneering new ways to generate actionable insights from data. Before that, I successfully built a sellside equity research franchise based on proprietary data and non-consensus insights. Through my extensive experience as a purchaser and creator of data, I have gained a unique perspective that allows me to collaborate with end-users to generate meaningful insights.
Institutional investors are paid to find surprises, and also to be skeptical of them
The goal of leveraging alternative data is to uncover the surprising and often misunderstood aspects of the real world. This is accomplished by analyzing detailed and frequent data points that act as indicators for key performance indicators (KPIs) of companies. The billions of data points harvested1, cleansed, enriched, and aggregated into well-crafted, fundamentally driven metrics are built to be aligned with the uncertainties and debates in the market. Data-driven insights turn “known-unknowns” into “known-knowns”. It’s why the largest asset managers spend tens of millions of dollars (or more) on alternative data.
But good institutional investors2 are paid to be skeptical. They don’t trust management, they don’t trust pundits, and they don’t even trust their own models.
Good institutional investors know that “All models are wrong, but some are useful,” to paraphrase the quote attributed to George Box.
They start from the assumption that something too good to be true is probably not true. They know that the second they think they fully understand the financial markets, that’s when they will be most surprised by what happens next in the market. They are paid to make predictions, but they know they will often be wrong. The best at predicting share price movements are slightly more accurate than a coin flip. They know they can’t be overconfident with any discovery because they are competing with every investor in the market. It’s very hard to beat the wisdom of the crowd when the crowd is independently arriving at conclusions about their investments while avoiding groupthink/crowded3 trades.
When an investor’s alternative data tells them there’s a big surprise coming for the market, they dig in deeper to make sure they are not missing something obvious and that no error exists.
Big surprises
To be clear, there’s a range of reasonable results from data and real life, and I’m not suggesting that every beat or miss needs the level of rigor outlined below. When I say big surprise, I mean a materially different outcome from consensus4. It’s the kind of beat that could move market prices significantly if it's the right insight to the right question… and not a data anomaly5. We’re talking about a very surprising change in trend that only recently inflected upward. Moderate differences vs. consensus are also important, but they are not going to receive the same level of scrutiny that a truly surprising insight would.
Signal or Anomaly?
In the early stages of an investor’s use of alternative data, it may take longer to work through this approach of verifying a true surprise in the data. As trained skeptics of all information and first-time users of alternative data, they are really going to want to dig in and be sure before depending on the data's insight. Effectively, the investor’s integrity is put on the line with each big non-consensus investment they make.
As more experience is gained, the process becomes easier to get through. It’s also reasonable to build steps into the workflow on an ongoing basis, including automated reporting and alerting, to make verifying big surprises as painless as possible. Validating surprising insights should be a scalable undertaking, requiring both data companies and centralized data teams within asset managers to implement the process effectively.
In this newsletter entry
I’ll run through my recommended approach for making sure the big surprise answers the right question, confirming the insight is different from what’s priced in, and double-checking the data’s integrity, so we can have the highest possible confidence in very surprising data-driven insights.
Make sure we are answering the right question
Understand the consensus answer and what’s priced in
Interrogate the data
1. Make sure we are answering the right question
“The most serious mistakes are not being made as a result of wrong answers. The truly dangerous thing is asking the wrong question.” ― Peter Drucker
I can promise you there’s nothing more painful than answering the wrong questions thoroughly, accurately, and publicly. I once very accurately answered the wrong questions while upgrading a stock to buy, only to find later that the buy rating was very wrong once the answers to the market’s actual questions were revealed to support a sell rating. But the details of that story are for another time (and I’m happy to share it because it was a great learning experience).
How can investors figure out if they are answering the right question?
Perhaps I’m biased as a former sellside6 analyst, but leveraging sellside research and sellside analysts is a great way to identify the critical questions to be answered. The best sellside analysts are valued by investors more for helping them think through investment debates than for the specific answers they provide. If the analyst is listening to their clients, they will know the consensus questions to be answered.
Natural Language Processing (NLP) provides the ability to extract topics and sentiment at scale from press releases, financial statements, earnings call transcripts, research reports, and social media. The quantitative trends in topics and associated sentiment can reveal where the debates are shifting. An important comparison to consider is the difference in topics and sentiment between analysts and management in the Q&A sections of transcribed public company calls.
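To make that comparison concrete, here is a minimal sketch, assuming a transcript that has already been split into speaker turns. The topic keywords and the tiny sentiment word lists are hypothetical placeholders; a real workflow would rely on proper NLP models rather than keyword counting.

```python
from collections import Counter

# Purely illustrative: a production workflow would use NLP models for topic and
# sentiment extraction; the keyword lists below are hypothetical placeholders.
TOPIC_KEYWORDS = {"pricing": ["price", "prices", "pricing"],
                  "volumes": ["volume", "volumes", "units"]}
POSITIVE = {"strong", "growth", "ahead", "better"}
NEGATIVE = {"weak", "decline", "pressure", "worse"}

def score_turns(turns):
    """turns: list of (speaker_role, text) tuples, with roles 'analyst' or 'management'."""
    topic_counts = {role: Counter() for role in ("analyst", "management")}
    sentiment = {role: 0 for role in ("analyst", "management")}
    for role, text in turns:
        words = text.lower().replace("?", "").replace(".", "").split()
        for topic, keywords in TOPIC_KEYWORDS.items():
            topic_counts[role][topic] += sum(words.count(k) for k in keywords)
        sentiment[role] += sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return topic_counts, sentiment

# Compare what analysts keep asking about vs. what management emphasizes.
turns = [
    ("analyst", "How much pricing pressure are you seeing on volumes?"),
    ("management", "Volume growth remains strong and pricing is ahead of plan."),
]
print(score_turns(turns))
```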
Expert networks are also useful for sanity-checking what the debates and uncertainties are within the industry, which may be guiding corporate decisions and could be different than what the financial markets perceive.
If it is confirmed that the question the data is answering is indeed the right question, keep moving on through the steps. If it's not the right question to be answered, the information is still valuable but would have a much lower weight in the mosaic of questions and answers.
Speaking of answering the right questions, let me know if there are any investment debates you’d like me to address with a proposed alternative data framework in a future newsletter entry, like my prior entries on NVIDIA and Tesla.
2. Understand the consensus answer and what’s priced in
Once you're double-sure that it's the right question to be answered, it's important to make sure that the insight is different from the consensus view.
How do institutional investors figure out the market’s consensus view?
Sellside consensus estimates are the typical benchmark for the market, but there are times when consensus estimates are not reflective of the buyside’s unstated consensus view. Investors often ask sellside analysts what they perceive the “smart money” consensus to be, which I always felt was a recursive question because the investor asking likely thought they were in the “smart money” classification. “Smart money” is basically a term for the investors who have put in the most work to get the view right; because they were already invested in the position, they were comfortable sharing their view openly, since they wanted to sanity-check that it was actually different.
Most of the time, active sellside estimates are not far from the buyside consensus estimate. Active sellside estimates are the ones that are regularly updated by the analyst. I would add that it makes sense to filter out the estimates of analysts who are historically inaccurate. As a made-up example, consider the consensus earnings per share (EPS) estimate of 40 analysts covering a stock, which averages to $0.20 EPS. But when you filter to the most recently updated estimates from the analysts with the best track records, the estimate is $0.21. Sure enough, when the company reports $0.21 EPS (a seemingly 5% beat vs. consensus), the share price doesn’t move because the buyside and active sellside were aligned.
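A minimal sketch of that filtering logic, using made-up analyst names, estimates, and thresholds:

```python
import pandas as pd

# Hypothetical estimate snapshot; column names and thresholds are illustrative.
estimates = pd.DataFrame({
    "analyst": ["A", "B", "C", "D"],
    "eps_estimate": [0.18, 0.21, 0.22, 0.19],
    "days_since_update": [120, 10, 15, 95],       # staleness of each estimate
    "hist_abs_error": [0.04, 0.01, 0.015, 0.05],  # analyst's historical accuracy
})

headline_consensus = estimates["eps_estimate"].mean()

# "Active" consensus: recently updated estimates from historically accurate analysts.
active = estimates[(estimates["days_since_update"] <= 30) &
                   (estimates["hist_abs_error"] <= 0.02)]
active_consensus = active["eps_estimate"].mean()

print(f"Headline consensus: {headline_consensus:.3f}")
print(f"Active consensus:   {active_consensus:.3f}")
```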
It’s important to understand the distribution of active consensus estimates in addition to the single point estimate. Furthermore, it's important to assess the trade flows data to determine if trades are crowded or not.
The wider the dispersion of active consensus and the less concentrated the buyside, the more opportunity there is to be right with views outside of consensus. However, the more concentrated the estimate view, the higher the potential impact of having a different view, which is also harder to prove but would be rewarding if right. The concentration of the buyside in crowded long7 or crowded short positions acts as a mean-reverting factor, in that it would take an exceptionally different result in the direction of the consensus crowding to move the stock further in that direction. The market would likely be at a balanced point when the consensus range is narrow and there isn’t buyside crowding. Effectively, the bulls and bears8 have cancelled each other out because the consensus estimate reflects a view that isn’t debated.
After assessing the consensus view of the KPI, it's also important to consider the interrelationship between the KPI and all of the financial line items. For example, let's assume the alternative data signal says organic unit volume growth is materially better than consensus expects. Historically, in those situations:
Do revenues also perform better than expected, or does volume outperformance come at the expense of a lower price per unit?
Does higher revenue for the company also lead to higher operating margins as operating leverage flows through, or does the company normally reinvest upside back into the business?
How has the balance sheet typically changed as revenues grow faster? Does working capital performance get worse?
These relationships will help ensure the full picture of consensus estimates is considered, not just one line item. Many times, beats in one part of the financial statement are offset elsewhere, so it's important to know the trade-offs.
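As a rough sketch of how those historical relationships could be checked, here is an illustrative example with hypothetical quarterly data; in practice, the inputs would come from reported financials and company disclosures.

```python
import pandas as pd

# Hypothetical quarterly history; in practice, these columns would come from
# reported financials and company disclosures.
history = pd.DataFrame({
    "organic_volume_growth": [0.02, 0.05, 0.01, 0.06, 0.04],
    "revenue_growth":        [0.03, 0.05, 0.02, 0.05, 0.045],
    "op_margin_change_bps":  [10, 40, -5, 15, 30],
})

# Does volume upside historically flow through to revenue and operating margins,
# or does it tend to be offset by price/mix and reinvestment?
print(history.corr().round(2))
```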
Once we have a better picture of what’s priced in, we can assess if this new data point is really different or not. If the signal is materially different, we now need to be sure of the data's integrity.
3. Interrogate the data
OK! We’ve made sure we’re answering the right question, and we are confident what’s priced in is not what the data is showing us. Now, we need to dig into the data to be sure we’re not getting a false signal.
Again, we’re talking about a very surprising change in trend, like a hockey stick-looking line graph that only recently inflected upward. Moderate differences vs. consensus are also important, but they are not going to receive the same level of scrutiny that a truly surprising insight would.
Methodology documentation is critically important for investors, particularly in the realm of big data. Given the vastness of the data, it becomes impractical for humans to manually verify every single data point. When trying to deliver the data as close to real time as possible, the automated process needs to be robust and highly trusted by the users of the data. Therefore, investors need to vet the methodology and documentation to make sure they understand how data issues are handled at scale.
When the stakes are high, like a big surprise signal that could lead to a large investment gain, it's only logical that investors are going to dig into the data in more detail. This is when data companies also need to be aware of the surprising signal and be there to support customers by answering questions. It’s a great chance to build credibility on data integrity, or to earn customer-support credit by helping the client understand how to interpret the data if there are nuances that would mitigate the signal seen in the data.
Sanity-check the outcome with independent data: What do other data sources show for related trends? It would be surprising if revenue grew significantly while other measures of consumer activity around the company and brand showed no change. Obviously, there will be differences in trend because the data is measuring different aspects of the business, but there should be some relationship. Seeing conflicting trends raises more questions that need to be answered to understand the relationship between the metrics and the underlying data sources. Until those questions are answered, the level of confidence in the signal's accuracy will be reduced in the mosaic approach.
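As an illustrative sketch of that cross-check, here are two hypothetical monthly indices for the same brand from independent sources; the numbers are made up to show the kind of comparison, not any real dataset.

```python
import pandas as pd

# Hypothetical monthly indices for the same brand from two independent sources.
card_spend = pd.Series([100, 102, 105, 112, 125, 140])   # transaction panel index
web_traffic = pd.Series([100, 101, 103, 104, 105, 106])  # site-visit index

spend_growth = card_spend.pct_change().dropna()
traffic_growth = web_traffic.pct_change().dropna()

# A sharp acceleration in spend alongside flat traffic doesn't invalidate the
# signal, but it does raise questions that need answering.
print("Correlation of month-over-month growth:",
      round(spend_growth.corr(traffic_growth), 2))
print("Latest growth - spend:", round(spend_growth.iloc[-1], 3),
      "traffic:", round(traffic_growth.iloc[-1], 3))
```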
Understand the margin of error9: Most investors are comfortable taking two time series and regressing them in Excel to get an R-squared10. However, investors need to revisit this work and look at the Mean Absolute Percentage Error (MAPE)11 and other measures of the margin of error. Is the margin of error wider than the difference between the new data point and the consensus expectation? If so, the apparent signal may just be noise in the dataset. This is no different from running a survey and seeing that 40% of consumers shopped at a retailer last year and 41% shopped at the retailer this year, with a margin of error of 3%: the two numbers are statistically indistinguishable. In fact, as a back-of-the-envelope calculation, there would need to be about a 6-percentage-point difference between the two data points to be confident they are statistically different. Sampled transactions, clickstream data, web-mined prices, etc. are still samples and have a margin of error.
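Here is a back-of-the-envelope sketch, with hypothetical numbers, comparing the size of the implied surprise to the signal's historical MAPE:

```python
import numpy as np

# Hypothetical history of the alt-data-implied KPI vs. the reported KPI
# (e.g., year-over-year revenue growth in %).
implied = np.array([4.0, 5.5, 3.0, 6.0, 4.5])
reported = np.array([4.4, 5.0, 3.3, 5.5, 4.9])

mape = np.mean(np.abs((implied - reported) / reported))  # historical error of the signal

current_implied = 9.0   # the "big surprise" the data implies this quarter
consensus = 5.0         # what the market currently expects

surprise = abs(current_implied - consensus) / consensus
print(f"Historical MAPE: {mape:.1%}; implied surprise vs. consensus: {surprise:.1%}")
# If the surprise is not comfortably larger than the historical error of the
# signal, the apparent beat may just be noise.
```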
Review the exception reports: Data vendors should be providing transparency on data points redacted, restated, or otherwise touched by humans in the process of automated data pipelines12. A data pipeline to process the raw data into insight-ready data should include:
Unit Tests13: Tests that validate specific sections of the pipeline code, such as functions or methods, in isolation. These verify the completion of each process step and alert when any step is incomplete.
Data Range Checks: Alerts should be generated for data points that fall outside the permitted ranges.
Anomaly Detection: Outliers should be identified and flagged for review in both raw data and aggregated metrics based on historical context. It’s important to run the anomaly detection on both the levels of the data and its period-over-period changes (the derivatives of the data).
Tagging and Classifications: Confirm that these have been performed correctly, with alerts generated for any errors.
Final Metrics Accuracy: Ensure the final metrics have been calculated correctly.
Review each of the exceptions triggered by the pipelines to see if they could be affecting the output metrics. Note that some vendors interpolate14 the data when there is a data gap, while others leave the gap visible. It’s important to understand the treatment of errors.
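To illustrate what a couple of these exception-based checks might look like, here is a minimal sketch with hypothetical data and thresholds; it is not a description of any particular vendor's pipeline.

```python
import pandas as pd

def range_exceptions(series, low, high):
    """Flag data points that fall outside the permitted range."""
    return series[(series < low) | (series > high)]

def robust_outliers(series, threshold=5.0):
    """Flag outliers using a median/MAD-based z-score, which stays reliable
    even when the outliers themselves distort the mean and standard deviation."""
    med = series.median()
    mad = (series - med).abs().median()
    robust_z = 0.6745 * (series - med) / mad
    return series[robust_z.abs() > threshold]

# Hypothetical daily average-price metric with one bad harvest.
prices = pd.Series([19.9, 20.1, 20.0, 19.8, 20.2, 250.0], name="avg_price")

print("Range exceptions:\n", range_exceptions(prices, low=5, high=100))
print("Level anomalies:\n", robust_outliers(prices))
# Run the same checks on the period-over-period changes, not just the levels.
print("Change anomalies:\n", robust_outliers(prices.diff().dropna()))
```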
Review the unmodeled data: Some data providers model the output data. Even simple aggregations like averages and sums could have underlying trends distorting them. This is when reviewing the less processed, granular data is useful and worth digging into deeper. The significance of this becomes amplified when the output data is derived from machine learning approaches, as the explainability15 of machine learning models remains an ongoing concern. I welcome others' suggestions on this, but when the stakes are high, rolling up your sleeves and diving into the granular data to see if the basic trends support the output will either ease concerns or raise questions.
Slice the data a layer deeper: Most alternative data products include a wider range of metrics beyond the most impactful, predictive ones. These descriptive metrics are super valuable at this moment because they are generated from the same dataset, so the numbers should be internally consistent. For example, in web-mined pricing datasets (see the sketch after this list):
The number of SKUs16 collected should follow a predictable pattern for consumer companies. Big increases and declines could be a sign of changes in the data harvest instead of reality.
Metrics showing the trend of prices on an average, median, like-for-like, distribution, etc. should all be internally consistent.
Sometimes what is found is that the data is accurate but moving because of extreme outliers (say a luxury company raised the price of their $80,000 bags by 20% but kept everything else constant; the overall average price would move up materially, but the signal would be better captured by the median price trend).
Other times, the deeper cut of the data reveals something that’s counterintuitive given the original metric providing the signal. This leads to more questions, which, until answered, would mean less weight should be placed on the signal in the investment mosaic.
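To make the handbag example concrete, here is a minimal sketch with hypothetical like-for-like prices; only the luxury SKU is repriced, yet the average moves materially while the median does not.

```python
import pandas as pd

# Hypothetical like-for-like SKU prices across two periods; only the luxury bag repriced.
prices = pd.DataFrame({
    "sku": ["bag_luxe", "wallet", "belt", "scarf"],
    "price_last_period": [80000, 500, 400, 300],
    "price_this_period": [96000, 500, 400, 300],   # +20% on the $80,000 bag only
})

for col in ["price_last_period", "price_this_period"]:
    print(col, "mean:", prices[col].mean(), "median:", prices[col].median())
# The average price jumps roughly 20% while the median is unchanged, so the median
# (or a like-for-like index) better captures the underlying pricing trend.

# The SKU count should also follow a predictable pattern; a big jump or drop
# may reflect a change in the harvest rather than a change in reality.
print("SKUs collected this period:", prices["sku"].nunique())
```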
Concluding thoughts
I can’t count the number of times when a surprising insight was uncovered on my watch and I went through the full range of emotions, from elation at finding something material that would move the markets, to fear that the data was wrong and that highlighting the insight would cause investors to lose money and doubt alternative data as a reliable source. It’s from both emotions that I would roll up my sleeves and dig into the data to find the answers.
As a user of data to make investment recommendations, I had my reputation on the line. Sometimes I would find reasons why my skepticism was warranted and be relieved that I didn’t publish incorrect analysis. Other times, the data was extremely valuable, had proven data integrity, and was surprisingly different from consensus, so I made changes to my recommendation and got the investment estimates and valuation right. Thankfully, the latter happened more times in my career than the “answering the wrong question” story I alluded to above.
The same was true for my experience as a creator of data at UBS Evidence Lab while running our web-mined pricing product area. We caught times when the surprising insight was too good to be true, and the skepticism allowed us to figure out a problem with the data before releasing it to the public. But most of the time, when we found a surprising insight, the rigorous “re-vetting” of the data revealed high data integrity and an accurate insight that helped investors get on the right side of consensus estimates.
Lastly, as a buyer of data, there were many times that my team and I caught vendor data inaccuracies (sometimes comical in nature). We would send the feedback in hopes of having the data corrected instead of having a data gap, but we knew that with many alternative data techniques, it's not possible to go back in time to fix mis-harvested data.
It's important to note that investors with deep financial market and sector expertise are exceptionally good at eyeballing data in a chart and detecting problems with the output based on their experience. Similar to the story in Malcolm Gladwell’s Blink about the art experts immediately detecting that the supposedly ancient Greek statue was a fake (https://www.amazon.com/Blink-Power-Thinking-Without/dp/0316010669), it just doesn’t feel right. However, data companies should not rely on outside experts to detect the anomalies; they need their own in-house automated, exception-based processes, and potentially their own experts, to catch the issues that machines cannot catch consistently.
I share this because data vendors should be taking on the practice and process of finding surprising insights that would be material to the market and putting intense rigor on the data to be sure it's a good kind of big surprise instead of a surprising data error. Data vendors should strive to share the simultaneous excitement and trepidation experienced by their investor clients when encountering a truly surprising insight within the data.
- Jason DeRise, CFA
Data Harvesting: Also known as data extraction, this is the process of extracting large amounts of data from various sources for processing and analysis.
Institutional investors: Professional investors who invest the money of others on their behalf. This is different from a retail investor, who is an individual or nonprofessional investor who buys and sells securities through brokerage firms or retirement accounts like 401(k)s.
Crowded trade: When a trade is crowded, it means many investors have taken on the same investment, which can be seen in the trading flows data showing excess buying or selling of a specific investment asset.
Consensus: “The consensus” is the average view of the sell-side for a specific financial measure. Typically, it refers to revenue or earnings per share (EPS), but it can be any financial measure. It is used as a benchmark for what is currently factored into the share price and for assessing if new results or news are better or worse than expected. However, it is important to know that sometimes there’s an unstated buyside consensus that is the better benchmark for expectations.
Anomaly Detection: The identification of rare items, events, or observations that do not conform to an expected pattern or other items in a dataset because they deviate materially from the distribution of the data.
Buyside vs. Sellside: Buyside typically refers to institutional investors (Hedge funds, mutual funds, etc.) who invest large amounts of capital, and Sellside typically refers to investment banking and research firms that provide execution and advisory services to institutional investors.
Long/Short: Long/Short Equity funds buy positions (long) in stocks they believe will go up in value and sell short stocks (short) that they believe will go down in value. Typically, there is a risk management overlay that pairs the long and short positions to be “market neutral,” meaning it doesn’t matter if the market goes up or down; what matters is that the long position outperforms the short position. Short selling, by a simplistic definition, is when an investor borrows stock from an investor who owns it and then sells the stock. The short seller will eventually need to buy back the stock at a later date to return it to the owner of the stock, and they will profit if they buy back the stock at a lower price than they sell it.
“Bullish” and “Bearish” are financial market jargon for positive (Bullish) and negative (Bearish) opinions about what will happen next for an industry or investment.
Margin of Error: In statistics, the margin of error describes the amount of random sampling error in a survey's results.
R-Squared: A statistical measure that represents the proportion of the variance for a dependent variable that's explained by an independent variable or variables in a regression model.
Mean Absolute Percentage Error (MAPE): A statistical measure used to determine the accuracy of a forecasting method in predictive analytics based on the average of the percentage errors of each entry in a dataset.
Data pipeline: A set of data processing elements or tasks connected in series, where the output of one element is the input of the next one, converting raw data into cleansed and enriched data, typically managed on an automated schedule.
Unit Tests: In software development and data processing, these are tests that validate the functionality of specific sections of code, such as functions or methods, in isolation.
Interpolate: A method of constructing new data points within the range of a set of known data points to fill in the blanks of missing or bad data points in order to reduce the noise in the dataset.
Explainability: In the context of machine learning, this refers to the degree to which a machine learning model's behavior can be understood by humans.
SKUs (Stock-Keeping Units): Unique identifiers for different products in a store or warehouse.