Data Deep Dive: Reverse Engineering Retail Real Estate Strategy with Geospatial Analytics
How web mining, geocoding, and geospatial analytics reveal hidden patterns in retailer strategy, competition, and growth
As a consumer retail analyst, I spent much of my time analyzing how shoppers choose grocery stores and how retailers position their locations to meet demand. While store location data exists at state, county, metro, and zip code levels, these broad aggregates fail to capture the real, on-the-ground decisions consumers make when choosing where to shop.
Those aggregates simply aren’t precise enough for this kind of analysis. Take New York City as a market and consider the pharmacy industry. In the early to mid 2000s, Duane Reade had some 250 locations, and New Yorkers saw many Duane Reades operating very close to each other. Two Duane Reade stores across the street from each other served different customers because of walking distance, cross-street barriers, and daily commuter paths from the subway to home. Duane Reade could work this out from its loyalty card data, which showed where its customers lived. It also used old-school methods, such as standing on the sidewalk and counting the people walking past, to pick the locations that would make Duane Reade the most convenient option for shoppers.
Duane Reade wasn’t unique in this skill set. Retailers and commercial real estate professionals do this kind of analysis on a regular basis. “Location, location, location” is the mantra of the industry, and for good reason.
Drive and walking time are the biggest factors. Drive-time analytics best define competitive trade areas (or catchment areas)1, offering a clearer view of supply and demand dynamics.
The same phenomenon exists with Starbucks in New York City—one location can sit directly across from another, yet both remain profitable because they serve distinct trade areas.
Makes me think of this Lewis Black routine about the end of the universe:
I’m not sure I can explain why two Starbucks exist in the same building in Houston. What I can explain is how to source retailer location data, geocode the locations, and use a Geographic Information System (GIS)2 and a graph database3 to create drive-time-based trade areas. Metrics built on those trade areas help reverse engineer a retailer's real estate strategy and yield critical insights into near- and long-term retailer growth.
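To make the trade-area idea concrete, here is a minimal Python sketch of how two stores on the same street can split shoppers between them. It is not a real drive-time calculation, which would need a road network and a routing engine. It assumes that straight-line (haversine) distance, inflated by a detour factor, is an acceptable stand-in for walking time. The coordinates, the 1.3 detour factor, the 80 meters-per-minute walking pace, and all function names are illustrative assumptions rather than anything taken from the datasets discussed here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # mean Earth radius in meters
WALK_SPEED_M_PER_MIN = 80.0    # assumed walking pace, roughly 4.8 km/h


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def walk_minutes(origin, store, detour_factor=1.3):
    """Crude walking-time proxy: straight-line distance inflated by an assumed
    detour factor for the street grid, divided by an assumed walking speed."""
    return haversine_m(*origin, *store) * detour_factor / WALK_SPEED_M_PER_MIN


def closest_store(origin, stores):
    """Name of the store with the shortest estimated walk from origin."""
    return min(stores, key=lambda name: walk_minutes(origin, stores[name]))


# Hypothetical coordinates: two same-brand stores under 100 m apart on one street.
stores = {"store_west": (40.7500, -73.9900), "store_east": (40.7500, -73.9890)}
shoppers = {"shopper_a": (40.7500, -73.9915), "shopper_b": (40.7500, -73.9875)}

for name, home in shoppers.items():
    nearest = closest_store(home, stores)
    print(name, nearest, round(walk_minutes(home, stores[nearest]), 1))
```

Even with this crude proxy, each shopper is assigned to a different store, although both stores would fall into the same zip code aggregate. Swapping `walk_minutes` for travel times from a routing engine would turn the same assignment logic into a true drive-time trade area.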
This Data Deep Dive includes the following sections:
Common questions addressed with the data
Underlying data
Cleaning the data
Enriching the data
Limitations to consider
Action items to begin using the data

Welcome to the Data Score newsletter, composed by DataChorus LLC. The newsletter is your go-to source for insights into the world of data-driven decision-making. Whether you're an insight seeker, a unique data company, a software-as-a-service provider, or an investor, this newsletter is for you. I'm Jason DeRise, a seasoned expert in the field of data-driven insights. As one of the first 10 members of UBS Evidence Lab, I was at the forefront of pioneering new ways to generate actionable insights from alternative data. Before that, I successfully built a sell-side equity research franchise based on proprietary data and non-consensus insights. After moving on from UBS Evidence Lab, I’ve remained active in the intersection of data, technology, and financial insights. Through my extensive experience as a purchaser and creator of data, I have gained a unique perspective, which I am sharing through the newsletter.
On February 1st, the Data Score introduced premium content, which sits behind a paywall. Data Playbook, Data-Driven Investing, and Dataset Deep Dives are premium content. These sections provide actionable strategies, in-depth case studies, and exclusive expertise tailored for data-driven professionals. The subscription for premium content is priced at $9.99 per month or $99.99 annually.
Common questions addressed with the data
Applying geospatial4 analytic techniques to point-of-interest data can answer several important investment questions:
Is new store productivity supported by location growth in high-quality locations?
Is competition accelerating, negatively affecting same-store sales?
Are new locations cannibalizing the retailer's existing store footprint, hurting same-store sales?
Is a retailer benefiting from stronger economic tailwinds at the local level compared to other retailers?
Can retailers consolidate and remain within government trade commission standards for local market share thresholds?
What is the potential total addressable market5 for the retailer and how many locations can the retailer reach in the long term?
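As one illustration of how the consolidation question might be turned into a number, the sketch below computes the Herfindahl-Hirschman Index (HHI), a concentration measure widely used in antitrust merger review, for a single trade area before and after a hypothetical merger. The choice of HHI is an assumption on my part, since the questions above do not name a specific metric. Using store counts as a proxy for market share is also an assumption, and the retailer names and counts are invented for the example.

```python
def hhi(units_by_owner):
    """Herfindahl-Hirschman Index: sum of squared percentage shares (0 to 10,000)."""
    total = sum(units_by_owner.values())
    if total <= 0:
        raise ValueError("need at least one store in the trade area")
    return sum((100.0 * n / total) ** 2 for n in units_by_owner.values())


def combine(units_by_owner, acquirer, target):
    """Return a copy of the counts with the target's stores folded into the acquirer."""
    merged = dict(units_by_owner)
    merged[acquirer] = merged.get(acquirer, 0) + merged.pop(target)
    return merged


# Hypothetical store counts inside one trade area (store count as a share proxy).
trade_area = {"RetailerA": 4, "RetailerB": 3, "RetailerC": 2, "RetailerD": 1}

before = hhi(trade_area)
after = hhi(combine(trade_area, "RetailerB", "RetailerC"))
print(f"HHI before: {before:.0f}, after: {after:.0f}, change: {after - before:.0f}")
# prints "HHI before: 3000, after: 4200, change: 1200"
```

Running this per trade area, rather than per state or metro, is the point: a merger can look harmless in aggregate while pushing concentration sharply higher in specific local markets. The resulting levels and changes would then be compared against whatever thresholds the relevant agency currently publishes.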