NVIDIA: How could alternative data be used to assess its long-term potential?
Discover how alternative data can reveal insights on NVIDIA's future, paving the way for smarter investment strategies and unearthing fresh insights in the GPU market.
Welcome to the Data Score newsletter, your go-to source for insights into the world of data-driven decision-making. Whether you're an insight seeker, a unique data company, a software-as-a-service provider, or an investor, this newsletter is for you. I'm Jason DeRise, a seasoned expert in the field of alternative data insights. As one of the first 10 members of UBS Evidence Lab, I was at the forefront of pioneering new ways to generate actionable insights from data. Before that, I successfully built a sellside equity research franchise based on proprietary data and non-consensus insights. Through my extensive experience as a purchaser and creator of data, I have gained a unique perspective that allows me to collaborate with end-users to generate meaningful insights.
In the intricate world of financial investment, predicting a company's long-term potential is a critical but challenging task. This article aims to help both financial professionals and data professionals think through how alternative data could be used to answer big uncertainties in the market.
I am using NVIDIA as a case study and will delve into how alternative data, beyond traditional financial metrics, can generate insights on key aspects: NVIDIA's GPU market presence, the evolving applications of GPUs, potential competitive threats, and the balancing act of supply chain investments. By drawing on unique data sources, this article aims to inspire novel ways to evaluate a tech giant like NVIDIA, paving the way for more informed investment decisions.
This is meant to be the start of a conversation. I invite investment professionals to comment on the key investment debates facing NVIDIA and data professionals to suggest novel solutions that can provide an edge in our understanding. Let's leverage the collective creativity of the alternative data community to expand and innovate beyond the ideas presented here.
NVIDIA’s current situation
NVIDIA, the leading graphics processing unit (GPU) company, recently reached $1 trillion in market cap “as investors piled into the chipmaker that has quickly become one of the biggest winners of the AI boom. The stock's value has tripled in less than eight months, reflecting the surge in interest in artificial intelligence following rapid advances in generative AI” according to this Reuters article: https://www.reuters.com/technology/nvidia-sets-eye-1-trillion-market-value-2023-05-30/
The challenge for investors is that at this level, the share price implies the market expects very high sales growth, earnings growth, and/or improving ROIC. The consensus earnings per share growth forecast from 35 sell-side analysts averages 157% y/y in 2024, according to Nasdaq.com (https://www.nasdaq.com/market-activity/stocks/nvda), followed by another 39% in 2025, which places the current share price at ~47x 2025 earnings per share. By contrast, the S&P 500 has a forward P/E ratio of around 18.5x (based on a collection of freely sourced estimates on the web found by Google Search on 13 June 2023).
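To make the arithmetic behind these multiples explicit, here is a minimal Python sketch of the P/E and PEG calculations. It uses only the figures quoted above (~47x 2025 earnings, 39% growth in 2025, ~18.5x for the S&P 500). The function names are illustrative, and pairing the 2025 multiple with 2025 growth is my assumption for the example, since PEG conventions vary.

```python
def pe_ratio(share_price: float, eps: float) -> float:
    """Price-to-earnings multiple: share price divided by earnings per share."""
    if eps <= 0:
        raise ValueError("P/E is not meaningful for zero or negative EPS")
    return share_price / eps


def peg_ratio(pe: float, eps_growth_pct: float) -> float:
    """P/E divided by expected EPS growth, with growth in percent (39 for 39%)."""
    if eps_growth_pct <= 0:
        raise ValueError("PEG is not meaningful for zero or negative growth")
    return pe / eps_growth_pct


# Figures as quoted in the text above.
NVDA_PE_2025 = 47.0
NVDA_EPS_GROWTH_2025_PCT = 39.0
SP500_FORWARD_PE = 18.5

nvda_peg = peg_ratio(NVDA_PE_2025, NVDA_EPS_GROWTH_2025_PCT)
pe_premium = NVDA_PE_2025 / SP500_FORWARD_PE

print(f"NVIDIA PEG on 2025 estimates: {nvda_peg:.2f}")    # 1.21
print(f"P/E premium to the S&P 500: {pe_premium:.1f}x")   # 2.5x
```

The point of the sketch is only that a high multiple has to be read against the growth expected to justify it, which is what the rest of this article tries to test with data.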
NVIDIA Consensus Earnings Per Share:
NVIDIA P/E and PEG ratios based on consensus estimates:
My view on the key investment debates facing investment decision makers about NVIDIA
This isn’t an investment research report, so I won’t take a view on the answers to these questions or share what the answers would mean for an investment decision. Instead, I explain how I would approach the problem of assessing NVIDIA’s future fundamentals using alternative data.
Here’s my take on the big questions that investors would want to answer to assess if there is upside or downside to the current share price:
Will the mass adoption of generative AI support the expected high demand for GPU chips over the next 5 years to meet or beat consensus revenue estimates for NVIDIA’s data center division?
Can other use cases for GPU chips support additional growth (gaming, crypto, auto, VR, etc) over the next 5 years to meet or beat overall revenue expectations?
Will other competitors enter the market with credible products over the next 5 years to disrupt NVIDIA’s potential profitability as estimated by consensus?
How much will NVIDIA need to invest in production and supply chains to meet demand expectations over the next 5 years to meet or beat consensus cash flow and ROIC expectations?
These are long-term investment questions that reflect the nature of the growth expectations implied by the current share price. When selecting the questions to be answered, it's important to have a time frame and a benchmark level, here generically set as consensus expectations 5 years from now. I’m not taking a view on the answer, but it's better to be more precise with your own specific view and then test with data if it's achievable or not.
Even though these are 5-year time horizon questions, there are 20 short-term quarters of results in a row to get there. These act as the market’s near-term data points to assess if the long-term is possible. Alternative data is more granular than publicly reported results, which allows for a deeper assessment of the long-term answer than simply waiting for management to report results and update guidance. With each data-driven answer to the questions above, we adjust the probability of achieving the view being tested.
In other words, I believe both long-term and short-term investors should care about these critical questions.
Fellow investment professionals: What other questions would need to be answered? Are there more important questions to answer to be on the right side of the investment?
Brainstorming alternative data solutions to build the mosaic
With those bigger picture questions in mind, here are the datasets and applications that I would want to access and monitor.
1. Will the mass adoption of generative AI support the expected high demand for GPU chips over the next 5 years to meet or beat consensus revenue estimates for NVIDIA’s data center division?
What I would look for in the answer: disruptive change and accelerated growth stories require a series of sigmoid curves (S-curves) to play out over an extended period of time. Growth will eventually flatten out. But if growth flattens sooner than the market expects, as baked into the current share price valuation, then companies would need to take action to restimulate growth or risk missing long-term revenue growth. My view is that I would want multiple lenses on this question using different techniques and sources because each approach has its own bias and noise. I would also like to see trends by different geographies and industries because each will have its own adoption S-curve. Imagine a red-yellow-green-light system for each of these metrics. If growth stops expanding exponentially, put up the red light to stop and assess what’s going on and if long-term expectations are appropriate.
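One way to picture the red-yellow-green system is the small Python sketch below. It treats "exponential" growth as a roughly steady period-over-period percentage growth rate and compares the most recent window of growth rates with the window before it. The function names, the window length, the tolerance, and the "growth rate has halved" cutoff for red are all hypothetical choices for illustration, not calibrated thresholds.

```python
from typing import List, Sequence


def growth_rates(levels: Sequence[float]) -> List[float]:
    """Period-over-period growth rates for a series of positive levels."""
    if any(level <= 0 for level in levels):
        raise ValueError("levels must be positive")
    return [cur / prev - 1.0 for prev, cur in zip(levels, levels[1:])]


def traffic_light(levels: Sequence[float], window: int = 3,
                  tolerance: float = 0.02) -> str:
    """Classify a metric by comparing recent growth with the prior window.

    green  : average growth rate is holding up (within `tolerance`) or rising
    yellow : still growing, but the growth rate is decelerating
    red    : growth rate has more than halved, or the metric has stopped growing
    """
    rates = growth_rates(levels)
    if len(rates) < 2 * window:
        raise ValueError("need at least 2 * window + 1 observations")
    recent = sum(rates[-window:]) / window
    prior = sum(rates[-2 * window:-window]) / window
    if recent <= 0:
        return "red"
    if recent >= prior - tolerance:
        return "green"
    if recent >= prior / 2:
        return "yellow"
    return "red"


# Hypothetical monthly usage index: 20% growth slowing to 15%.
slowing = [100, 120, 144, 172.8, 198.72, 228.528, 262.8072]
print(traffic_light(slowing))  # yellow
```

In practice I would run a check like this separately for each metric, geography, and industry, because each has its own S-curve and its own noise.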
Google Trends search activity
Terms: a series of topics related to generative AI, Large Language Models (LLM) and specific products in the space such as ChatGPT, Bing, Bard, Hugging Face, Stable Diffusion, Midjourney, DeepFloyd and others. Benchmark relative to traditional search URLs (Google, Yahoo, etc). Google in particular would set an appropriately high watermark for potential adoption (yes, a surprisingly high number of people google “Google” https://trends.google.com/trends/explore/TIMESERIES/1686759600?hl=en-US&tz=240&date=all&geo=US&hl=en&q=Google,yahoo,Chatgpt+%2B+gpt,amazon+-rain,facebook&sni=3).
Make sure the terms reflect the local language, and consider removing related terms to make the search results more reflective of actual intentions. Consider homonyms as a reason to completely remove terms from analysis.
Geographies:
Americas: US, Canada, Mexico, Argentina, Brazil
EMEA: France, Germany, Italy, Spain, South Africa, Turkey, UK
APAC: Japan, Australia, India, Indonesia, South Korea
Metric: Share of search index. It’s always important to include multiple terms in your search queries in Google Trends because the data is provided on a relative basis within the geography.
Caveat: Google Trends is best for top-of-mind awareness, which is a critical first step in adoption, but isn’t the same thing as revenues for NVIDIA; we’re connecting the dots between end demand for consumer applications powered by GPU chips and the need for more GPU chips to meet demand in the long term.
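As a small illustration of the share-of-search metric, the Python sketch below converts index values for several terms into shares and into a ratio against a benchmark term. The index values are made up for the example, and the function names are hypothetical. The one real constraint is that all terms must come from the same Google Trends comparison (same geography and date range), because the index is scaled relative to the terms in that comparison.

```python
from typing import Dict


def share_of_search(index_by_term: Dict[str, float]) -> Dict[str, float]:
    """Convert relative index values from one comparison into shares summing to 1."""
    total = sum(index_by_term.values())
    if total <= 0:
        raise ValueError("index values must sum to a positive number")
    return {term: value / total for term, value in index_by_term.items()}


def relative_to_benchmark(index_by_term: Dict[str, float],
                          benchmark: str) -> Dict[str, float]:
    """Express each term as a multiple of the benchmark term (e.g. 'Google')."""
    base = index_by_term[benchmark]
    if base <= 0:
        raise ValueError("benchmark index must be positive")
    return {term: value / base for term, value in index_by_term.items()}


# Hypothetical index values for one geography and one week (not real data).
week = {"Google": 80.0, "Yahoo": 10.0, "ChatGPT": 10.0}

print(share_of_search(week)["ChatGPT"])                  # 0.1
print(relative_to_benchmark(week, "Google")["ChatGPT"])  # 0.125
```

Tracking the benchmark ratio over time is what tells you how far a term still is from the "high watermark" set by the traditional search terms.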
Clickstream data for B2C (Business-to-consumer)
Entities: Same as above: ChatGPT, Bing, Bard, Hugging Face, Stable Diffusion, Midjourney, DeepFloyd and others. Benchmark relative to traditional search URLs (Google, Yahoo, etc). Google in particular would set an appropriately high watermark for potential consumer adoption.
Geographies:
Americas: US, Canada, Mexico, Argentina, Brazil
EMEA: France, Germany, Italy, Spain, South Africa, Turkey, UK
APAC: Japan, Australia, China, India, Indonesia, South Korea
Metrics: Visits, Visitors, monthly active users, weekly active users, daily active users, Time on Page, Bounce Rate
Caveat: The growing heavy usage of these platforms is also a prerequisite for more demand for GPU chips but should be considered a long-term indicator. That said, flattening usage would be a more near-term signal that growth could be stabilizing.
App usage data for same consumer services
Entities: ChatGPT and a collection of popular apps providing generative AI services.
Geographies:
Americas: US, Canada, Mexico, Argentina, Brazil
EMEA: France, Germany, Italy, Spain, South Africa, Turkey, UK
APAC: Japan, Australia, China, India, Indonesia, South Korea
Metrics: downloads, monthly active users, weekly active users, daily active users
Caveat: Like clickstream data, the growing heavy usage of these platforms is also a prerequisite for more demand for GPU chips but should be considered a long-term indicator. That said, flattening usage would be a more near-term signal that growth could be stabilizing.
B2B software review data
Products: create a competitive set of leading software as a service (SaaS) products that leverage AI, focused on B2B (Business-to-Business) applications, e.g. AutoML tools or creativity tools with AI built into the solution
Geographies: US (I could be persuaded that reviews can differ by geography, but keeping this simple would be better.)
Metrics: Count of reviews by product and aggregated, sentiment of reviews. The goal here is to see whether the adoption of products that leverage AI continues to expand in the B2B world, as a complement to the B2C tracking above.
Caveat: The count of reviews is important to monitor for growth stories because the only reason people would leave a review is that they tried the product. However, at steady state levels, the number of reviews may not be correlated with smaller changes in products at maturity. We’re looking for big movements in the number of reviews. The second important factor is the sentiment of the reviews, but only if they are becoming more negative. Note that if expanded to different geographies, the sentiment of the review will incorporate cultural biases in the way reviews are provided (e.g., in my experience working with consumer review data, US consumers tend to be more positive than European consumers on the same products), so tracking the rate of change instead of the absolute score is more important.
Clickstream data for B2B websites
Entities: Documentation pages for cloud instances with accelerated computing/GPU chips included, AutoML tools and applications documentation pages.
Geographies: US, UK, Germany, India
Metrics: Visits, Visitors, monthly active users, weekly active users, daily active users, Time on Page, Bounce Rate
Caveat: This use case is going to be more volatile than the B2C example above. I’ve limited the geographies to more robust panels to remove some of the noise and bias that can happen with B2B use cases. We’re really after developer clickstream activity because consumers are not typically going to the documentation pages for technical tools. But visits to these pages are a strong indicator of B2B demand for tools powered by GPUs. Again, we’re looking for a long-term signal of continued exponential growth or a flattening of the trend.
Job listings and employment data
Job attributes: A series of terms to find in job listings such as Generative AI, LLM, NLG, AI, ML, Accelerated Computing
Geographies:
Americas: US, Canada, Mexico, Argentina, Brazil
EMEA: France, Germany, Italy, Spain, South Africa, Turkey, UK
APAC: Japan, Australia, China, India, Indonesia, South Korea
Industries: All industries, plus breakdown by high level sectors
Metrics: Count of jobs with related terms, length of time jobs available, share of all job postings
Caveat: Technical product development, which generates the B2B demand for GPU chips, requires human capital to execute it (unless you want to model a scenario of a world where machines are making all the decisions, including purchasing more GPU chips). The test here is to see if jobs are being created that leverage or build tools that are powered by GPU chips. Like the above, it's not a direct read to near-term NVIDIA revenue, and we’re looking for exponential growth in the metrics or a potential flattening to signal the ability to achieve long-term revenue forecasts.
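The job-listing metrics above can be sketched in a few lines of Python. The example below flags postings that mention any of the terms and reports the count and the share of all postings. The term list comes from the job attributes above, while the matching rules (whole-word matches, case-sensitive for acronyms so that "ML" does not match "ml" for milliliters) and the function names are my assumptions for illustration.

```python
import re
from typing import Sequence, Tuple

TERMS = ("Generative AI", "LLM", "NLG", "AI", "ML", "Accelerated Computing")


def mentions_any(text: str, terms: Sequence[str] = TERMS) -> bool:
    """True if the text contains any term as a whole word or phrase."""
    for term in terms:
        # All-caps acronyms are matched case-sensitively; phrases are not.
        flags = 0 if term.isupper() else re.IGNORECASE
        if re.search(rf"\b{re.escape(term)}\b", text, flags):
            return True
    return False


def count_and_share(postings: Sequence[str],
                    terms: Sequence[str] = TERMS) -> Tuple[int, float]:
    """Count of postings mentioning any term, and that count as a share of all."""
    if not postings:
        raise ValueError("postings must not be empty")
    count = sum(mentions_any(posting, terms) for posting in postings)
    return count, count / len(postings)


# Hypothetical postings (not real data).
sample = [
    "Senior ML Engineer for LLM products",
    "Barista, must pour 250 ml lattes",
    "Accountant",
    "Research scientist, generative ai",
]
print(count_and_share(sample))  # (2, 0.5)
```

A real pipeline would also need deduplication of reposted listings and the posting dates required for the "length of time jobs available" metric, which this sketch leaves out.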
2. Can other use cases for GPU chips support additional growth (gaming, crypto, auto, VR, etc) over the next 5 years to meet or beat overall revenue expectations?
What I would look for in the answer: the origin of GPU chip demand was for gaming applications. Later, the power of these chips was in demand for cryptocurrency mining. There are other long-term use cases, such as autonomous vehicles or virtual reality. Like the AI impact on the datacenter business, these each have their own S-curves and should be monitored to track if growth is accelerating, stabilizing, or declining.
Web mined price and inventory of new Gaming GPUs
Websites: Amazon, Best Buy, Tmall
Brands: NVIDIA, Intel, AMD, etc…
Frequency: Weekly
Geographies: US, Germany, UK, China
Metrics: Like-for-like sequential and like-for-like year over year (y/y) pricing, best-seller share, inventory availability / transaction count (as shown on website without putting items in basket to collect)
Caveat: Monitoring the share of best-seller lists and associated price changes will reveal the popularity of NVIDIA’s gaming GPUs compared to competitors. This is not the same as purchases, but it is a very close proxy. Combining movements up/down the best-seller list with price activity can reveal a proxy for price elasticity. For example, dropping prices to stimulate growth could work in the near term but affect long-term sales growth if prices cannot be maintained at the lower levels without affecting ROIC.
Monitor volume of distributed ledger transactions
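To show what "like-for-like" pricing and best-seller share mean mechanically, here is a minimal Python sketch. Like-for-like compares only products listed in both periods, so the result is not distorted by products entering or leaving the site. The SKU names, prices, the unweighted average, and the top-N threshold are hypothetical choices for the example.

```python
from collections import Counter
from typing import Dict, Sequence


def like_for_like_change(prev_prices: Dict[str, float],
                         curr_prices: Dict[str, float]) -> float:
    """Average price change across products listed in both periods."""
    common = sorted(set(prev_prices) & set(curr_prices))
    if not common:
        raise ValueError("no products in common between the two periods")
    prev_avg = sum(prev_prices[sku] for sku in common) / len(common)
    curr_avg = sum(curr_prices[sku] for sku in common) / len(common)
    return curr_avg / prev_avg - 1.0


def best_seller_share(ranked_brands: Sequence[str], top_n: int) -> Dict[str, float]:
    """Each brand's share of the top-N slots in a best-seller ranking."""
    top = list(ranked_brands[:top_n])
    if not top:
        raise ValueError("ranking is empty")
    counts = Counter(top)
    return {brand: count / len(top) for brand, count in counts.items()}


# Hypothetical weekly scrapes (not real data). SKU "C" was delisted and "D" is new,
# so only "A" and "B" enter the like-for-like comparison.
last_week = {"A": 100.0, "B": 200.0, "C": 300.0}
this_week = {"A": 90.0, "B": 200.0, "D": 500.0}
print(f"{like_for_like_change(last_week, this_week):.1%}")  # -3.3%

ranking = ["NVIDIA", "AMD", "NVIDIA", "Intel", "NVIDIA"]
print(best_seller_share(ranking, top_n=4))
```

Reading the two outputs together (price cuts alongside a rising or falling best-seller share) is what gives the rough price-elasticity proxy described in the caveat.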
Currencies: Bitcoin, Ethereum
Metrics: Daily transaction volume, incremental volume of currency in circulation
Caveat: Mining cryptocurrency is a heavy, compute-intensive activity. The volume of activity would be a factor related to the demand for GPU chips but could also be generated by leveraging the current GPU chips in use. So we’re looking for a sustained increase in volume to support the long-term need for more GPUs in this sector. Conversely, dropping activity would not support incremental GPU purchases for growth needs, but purchases could still happen as part of the replacement/maintenance cycle for previously bought GPUs.
Patents on Autonomous vehicle technology
Patent keywords/terms: Sensor Data Processing, Image and Video Processing, Object Detection and Recognition, Path Planning and Decision Making, Simulations, On-board Machine Learning
Companies: Waymo, Tesla, Ford, GM, other major OEMs, Suppliers
Metrics: Count of patents, count of cross referenced patents, count of patent acquisitions, count of legal challenges
Caveat: Patents are a very early signal of long-term investment. The count of patents is the first step in understanding the investment, but the cross references, patent acquisitions, and legal challenges give more signal that the patents are valuable. If patent work were to stop growing in this sector on a 5-year basis, it would be a surprising signal and suggest more questions need to be asked about the long-term need for GPUs in the space.
3. Will other competitors enter the market with credible products over the next 5 years to disrupt NVIDIA’s potential profitability as estimated by consensus?
What I would look for in the answer: Given the high demand and high interest in this sector, it's not unreasonable to expect more competition to try to enter the market. Only time will tell if they will be successful. Using alternative data to monitor the competition is important to not only catch potential near-term headwinds to revenue, but also understand if long-term profit pools will be affected by competition.
Web mined Cloud availability and price of GPU chips
Websites: AWS, Azure, and Google Cloud service pricing APIs
Frequency: data is available daily, but price changes and new instance availability happen less frequently
Companies: NVIDIA, AMD, Intel
Metrics: number of instances available with specific accelerated computing chips, counting each geography + instance, price per hour of use
Caveat: This is a classic web-mined pricing approach that monitors the availability and price of products and services. The important caveat is that it doesn’t provide direct demand signals. However, prices are not set randomly. They are a reflection of the business strategy to monetize the company’s value. So, in this case, it would be fair to assume that price changes would signal an imbalance between GPU demand and supply, with rising prices signaling more demand than near-term supply and falling prices reflecting excess capacity.
Patents: this time focused on accelerated computing suppliers
Patent keywords/terms: GPU, accelerated computing
Companies:
Metrics: Count of patents, count of cross referenced patents, count of patent acquisitions, count of legal challenges
Caveat: Like above, patents are a very early signal of long-term investment. The count of patents is the first step in understanding the investment, but the cross-references, patent acquisitions, and legal challenges give more signal that the patents are valuable. If patent work were to stop growing in this sector on a 5-year basis, it would be a surprising signal and suggest more questions need to be asked about the long-term need for GPUs in the space.
Job listings by GPU companies
Job attributes: A series of terms to find in job listings of all roles related to the research, development and manufacturing of GPUs
Geographies: Global
Industries: Semiconductors, Tech Hardware
Metrics: Count of jobs with related terms, length of time jobs available, share of all job postings
Caveat: Like above, context is incredibly important to analyzing the data. If hiring activity is increasing, it could be due to employee turnover or growth hires. The rate of hiring should also correspond with the expectations of growth for the company. If expectations were for continued growth and hiring activity slowed, it would raise questions about why. If the capacity is there for the next 5 years, perhaps it’s a positive sign for profitability because the employee cost base would be relatively fixed, allowing for more operating leverage in the future. For more context, if competitors are ramping up their staffing, incumbents face a bit of a prisoner’s dilemma: do they continue to invest in talent and build purposeful redundancy into the workforce, because retaining top talent is a competitive advantage and provides backup in case employees leave for competitors?
4. How much will NVIDIA need to invest in production and supply chains to meet demand expectations over the next 5 years to meet or beat consensus cash flow and ROIC expectations?
What I would look for in the answer: Semiconductor component supply chains have been under pressure over the past few years and are potentially only returning to normalcy. However, it seems that in the GPU space, demand continues to outstrip supply, which, if not managed, could limit the revenue potential. However, overextending capacity investments versus demand expectations that never materialize could materially hurt long-term profitability. The datasets below should be monitored to understand whether the investments are being matched appropriately to demand, which is a goldilocks paradigm where capacity needs to be not too much or too little, but just right.
B2B survey / expert network focused on assessing any backlog of orders
Dual-track approach to buyers of GPUs: 1) high-quality, low-frequency interviews with key buyers of GPU chips to get color commentary on the market; and 2) a high-frequency, lower-quality, simple 3-question survey of GPU buyers asking about demand and backlog length.
Geographies: US, big 5 Europe, India, China
Industries: all
Metrics: create a demand diffusion index, central tendency and distribution of backlog length, plus color commentary from interviews.
Caveat: It’s important to keep in mind that expert network interviews and B2B surveys capture perceptions and the reasoning behind them. They are not necessarily indicative of actual behavior.
Customs, Bill of lading data
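For readers less familiar with diffusion indices, the survey metrics above could be computed along these lines. This Python sketch uses the net-balance convention (percent reporting higher demand minus percent reporting lower), which matches the glossary definition. Other conventions exist, such as the 50-centered style used by purchasing manager surveys. The response labels, function names, and sample answers are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def diffusion_index(responses: Sequence[str]) -> float:
    """Net balance: % of respondents answering 'higher' minus % answering 'lower'."""
    allowed = {"higher", "same", "lower"}
    if not responses or any(r not in allowed for r in responses):
        raise ValueError("responses must be non-empty and use higher/same/lower")
    up = sum(1 for r in responses if r == "higher")
    down = sum(1 for r in responses if r == "lower")
    return 100.0 * (up - down) / len(responses)


def backlog_summary(backlog_weeks: Sequence[float]) -> Dict[str, object]:
    """Central tendency and spread of reported backlog length, in weeks."""
    if len(backlog_weeks) < 2:
        raise ValueError("need at least two responses")
    return {
        "median": statistics.median(backlog_weeks),
        "quartiles": statistics.quantiles(backlog_weeks, n=4),
        "min": min(backlog_weeks),
        "max": max(backlog_weeks),
    }


# Hypothetical survey wave (not real data).
demand_answers = ["higher", "higher", "same", "lower"]
backlog_answers = [4, 8, 12, 16, 30]

print(diffusion_index(demand_answers))            # 25.0
print(backlog_summary(backlog_answers)["median"])  # 12
```

As with the other datasets, the level of the index matters less than how it changes from one survey wave to the next.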
products/key words: GPU, Accelerated computing, relevant GPU components
Geographies: US
Industries: All
Metrics: volume and value of imports
Caveat: Bill of Lading data is often highly redacted and only covers vessel shipments. It’s possible that the high-end GPUs are air-shipped and not covered by the bill of lading data. Also, it’s possible this approach will result in a wild goose chase in deciphering which holding companies export and import the relevant products. However, at this early stage of the investment debate, any insights would be valuable, so solving this would be extremely worthwhile. Fellow data companies, comment if you have done this work :)
Supply chain relationships data
Breakdown of key customers and suppliers of NVIDIA
Geographies: global
Industries: all
Metrics: % of revenue for customers, % of COGS, % of Capex, % of change in inventory
Caveat: This information will help narrow down which traditional data to track in terms of supply chain revenues and cost trends, and it could spark additional alternative data collection ideas focused on those key companies in the supply chain.
Remote sensing of facilities under development, including component suppliers
Attributes: specific geocoordinates of future FAB / component manufacturing facility locations
Geographies: global
Metrics: % complete, key milestones achieved.
Caveat: This is a very idiosyncratic approach to remote sensing, but one of the most valuable to financial markets. Monitoring the progress of announced production facility greenfield projects or expansions is valuable because construction time is highly uncertain. As facility construction approaches completion, the probability increases that industry capacity will rise. Projects that hit roadblocks would see a falling probability of completion or would signal expected delays in capacity coming online.
Job listings by GPU companies
Job attributes: A series of terms to find in job listings of all roles related to the research, development and manufacturing of GPUs
Geographies: Global
Industries: Semiconductors, Tech Hardware
Metrics: Count of jobs with related terms, length of time jobs available, share of all job postings
Caveat: Like the above use cases for other questions, context is incredibly important to analyzing this data. If hiring activity is increasing, it could be due to employee turnover or growth hires. The rate of hiring should also correspond with the expectations of growth for the company. If expectations were for continued growth and hiring activity slowed, it would raise questions about why. If the capacity is there for the next 5 years, perhaps it’s a positive sign for profitability because the employee cost base would be relatively fixed, allowing for more operating leverage in the future.
Fellow data professionals: I’m sure I’ve missed opportunities for data to help investors. What other solutions to these questions could your firm offer?
Concluding Thoughts
Traditional financial results and management commentary do indeed play pivotal roles in the decision-making process. However, in today's data-driven world, alternative data offers invaluable insights. Its granular and frequent data points enable us to refine our financial projections and ask more incisive questions of management, ultimately enhancing investment decisions and alpha generation.
In this article, I used NVIDIA as a case study to demonstrate how the process of identifying key questions and working backward to relevant datasets can illuminate any investment debate. This is not limited to NVIDIA; the approach can be applied to any company or sector. If you're interested in exploring how your investment questions could be addressed with data, or if you're a data provider with ideas on how your data can help, don't hesitate to reach out. I'm more than happy to provide advice and further the conversation.
- Jason DeRise, CFA
So much jargon, folks… did I miss any other terms that should have been defined or explained?
GPUs: An acronym for "Graphics Processing Units." These are specialized electronic circuits designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device.
Growth and ROIC: The importance of growth and ROIC to increasing the value of a company is well covered by McKinsey at this link https://www.mckinsey.com/capabilities/strategy-and-corporate-finance/our-insights/balancing-roic-and-growth-to-build-value
ROIC (Return on Invested Capital): This financial metric measures a firm's profitability and the efficiency with which its capital is employed.
Consensus: “The Consensus” is the average view of the sell-side for a specific financial measure. Typically, it refers to revenue or earnings per share (EPS), but it can be any financial measure. It is used as a benchmark for what is currently factored into the share price and for assessing if new results/news are better or worse than expected. However, it is important to know that sometimes there’s an unstated buyside consensus that is the better benchmark for expectations.
P/E Ratio: The share price divided by earnings per share, which reflects the price of the company’s equity per $1 of earnings. More details here: https://www.investopedia.com/terms/p/price-earningsratio.asp. Earnings per share is the net income of the company divided by the number of outstanding shares. PEG ratios compare the P/E to the estimated earnings per share growth rate, as a way to reflect that companies with higher growth typically have higher P/E ratios when comparing the value of two or more companies.
A sigmoid function is a mathematical function having a characteristic "S"-shaped curve or sigmoid curve. S-curves are often used to describe the adoption of new technology.
Google Trends notes: China is excluded from the geographies in Google Trends because the sample of searches on Google from China is not robust. The global aggregation is also excluded because Google Trends weights the aggregate by global search activity across all terms, not by the importance of each geography to the specific topics.
Accelerated Computing: This term refers to using GPUs to perform tasks traditionally handled by CPUs. It's central to the functions of NVIDIA's products and the competitive landscape.
Best-Seller Share: Websites often provide a ranking of products by sales to help customers find items they would likely buy quickly (before they leave for another competitor website). By collecting this rank information and setting a threshold for being a best seller, the share of best sellers can be tracked as a proxy for demand.
Distributed ledger technology is a platform that uses ledgers stored on separate, connected devices in a network to ensure data accuracy and security. Blockchains evolved from distributed ledgers to address growing concerns that too many third parties are involved in too many transactions. https://www.investopedia.com/terms/d/distributed-ledger-technology-dlt.asp
Color Commentary: This term originally comes from broadcasting and refers to adding context or expert insights into a given situation. In this case, it's about expert opinions on the GPU market.
Diffusion Index: An economic indicator that represents the net number of positive signals or conditions occurring in a given set.
Bill of lading data: A bill of lading is a legal document between a shipper and carrier detailing the type, quantity, and destination of the goods being carried. The bill of lading also serves as a shipment receipt when the carrier delivers the goods at the predetermined destination.
COGS (Cost of Goods Sold): This is the cost of creating the goods or services a company sells, which will vary depending on the type of business.
Capex (Capital Expenditure): This is the money a company spends on acquiring, maintaining, or improving physical assets such as buildings, equipment, or technology.
Remote sensing: The process of detecting and monitoring the physical characteristics of an area by measuring its reflected and emitted radiation from a distance, typically from a satellite.
FAB: Short for Fabrication plant, a factory where devices such as integrated circuits are manufactured.
Alpha generation: A term used in finance to describe an investment strategy's ability to beat the market or generate excess returns. A simple way to think about alpha is that it's a measure of the outperformance of a portfolio compared to a pre-defined benchmark for performance. Investopedia has a lot more detail https://www.investopedia.com/terms/a/alpha.asp
Good write-up, Jason. The other data approach we've been taking is tracking panelized cloud spend across AWS, Azure, and GCP. It provides a read on share of spend by CPU supplier across cloud workloads, architecture mix of spend/usage, GPU market share, and pricing per hour of different chips.