Methodology

How the dataset is built - sources, accuracy bands, what we don’t cover, and how to flag errors. Written so a careful reader can decide how much to trust it.

In one paragraph

We pull dark store coordinates and metadata from each platform’s public store-locator endpoints, dedupe by platform store ID and spatial proximity, enrich with reverse-geocoded city / area / sub-locality data via Ola Maps (with Mappls and Nominatim as fallbacks), classify cities into tiers, group stores into hyperlocal areas, and refresh the whole pipeline monthly with a public diff. Every record we publish is a real store the platform itself classifies as a fulfilment centre.

Data sources

Store location data is sourced from the public APIs that the three platforms’ consumer-facing applications use. We extract coordinates (latitude, longitude), platform-internal store IDs, and any available metadata such as store names and locality identifiers.

  • Blinkit: store-locator endpoints exposed to the Blinkit consumer app and to its picker / delivery onboarding surface.
  • Zepto: coverage and store-resolution endpoints used by the Zepto app to determine serviceability.
  • Swiggy Instamart: serviceability and store-resolution endpoints inside the Swiggy app, filtered to Instamart fulfilment locations.

We do not scrape authenticated, private, or paywalled data. We do not place orders to elicit responses. We do not use stolen credentials. Everything we collect is publicly accessible to any consumer of the platforms’ own apps.

How we identify a dark store

Not every location returned by a platform endpoint is a dark store. Some are pickup points, some are partner kirana stores, some are delivery rider hubs without inventory. We treat a location as a "dark store" only when three signals line up:

  1. Platform classification. The platform’s own data classifies the location as a dark store / micro-fulfilment centre / dispatch warehouse - distinct from retail partner stores or pickup points.
  2. Operating hours pattern. Dark stores typically run extended hours (often 24/7 or close to it) to support 10-minute delivery windows. Locations whose schedule looks like a regular retail storefront are excluded.
  3. Address pattern. Dark stores are warehouse-style facilities - anonymous building numbers in residential or mixed-use zones, not high-street retail frontage. Locations whose address looks like a branded retail store are flagged for review.
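
The three-signal check can be sketched roughly as below. The type strings, the 18-hour threshold, and the retail address markers are illustrative assumptions, not the production vocabulary; in practice the third signal routes to review rather than hard exclusion.

```python
from dataclasses import dataclass

@dataclass
class Location:
    platform_type: str  # the platform's own classification string
    open_hour: int      # daily opening hour, 24h clock
    close_hour: int     # daily closing hour (24 = midnight)
    address: str

# Hypothetical marker sets; the real pipeline uses per-platform vocabularies.
FULFILMENT_TYPES = {"dark_store", "micro_fulfilment_centre", "dispatch_warehouse"}
RETAIL_MARKERS = ("mall", "high street", "showroom", "shop no")

def looks_like_dark_store(loc: Location) -> bool:
    """All three signals must line up before a location is kept."""
    classified = loc.platform_type in FULFILMENT_TYPES
    # Extended hours: open 18h+ per day approximates the 24/7-ish pattern.
    extended_hours = (loc.close_hour - loc.open_hour) >= 18
    warehouse_address = not any(m in loc.address.lower() for m in RETAIL_MARKERS)
    return classified and extended_hours and warehouse_address
```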

Within a platform, we collapse records that share a store ID or sit within ~50 metres of each other (typically the same store re-emitted with a slightly different coordinate). Across platforms, we keep records separate even when two warehouses sit in the same building - they’re commercially distinct (separate teams, separate inventory, separate hiring).
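
A minimal sketch of the within-platform collapse, using the haversine formula for the ~50 m proximity check (field names are illustrative):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS-84 points, in metres."""
    r = 6_371_000  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def dedupe(stores, radius_m=50):
    """Collapse same-platform records sharing an ID or within ~50 m.
    Cross-platform records are always kept separate."""
    kept, seen_ids = [], set()
    for s in stores:
        key = (s["platform"], s["store_id"])
        if key in seen_ids:
            continue
        if any(k["platform"] == s["platform"]
               and haversine_m(k["lat"], k["lon"], s["lat"], s["lon"]) < radius_m
               for k in kept):
            continue
        seen_ids.add(key)
        kept.append(s)
    return kept
```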

Reverse geocoding

Raw coordinates are enriched with full address data using a three-tier geocoding pipeline. Each tier is tried in order; we fall through only if the higher-priority service returns empty, ambiguous, or low-quality results.

  1. Ola Maps (primary) - best accuracy and consistency for Indian addresses, including sub-locality data and Plus Codes.
  2. Mappls / MapmyIndia (fallback) - used when Ola Maps returns generic or empty locality data.
  3. Nominatim / OpenStreetMap (last resort) - open-source fallback with a 1-request-per-second rate limit and a custom User-Agent.

Successful geocoding results are cached. Stores within 500 metres of each other reuse the same area assignment to keep the neighbourhood-level grouping consistent. Failed lookups (all three services empty) are logged and flagged for manual review - those stores keep their coordinates but show no derived address.
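
The tiered fallback with caching can be sketched as follows. The provider callables stand in for the Ola Maps, Mappls, and Nominatim clients (which this sketch does not implement), and the "usable result" test is simplified to "has a locality":

```python
_CACHE: dict = {}

def geocode(lat, lon, providers):
    """Try each provider in priority order; cache the first usable result.
    providers: ordered list of callables (lat, lon) -> dict or None.
    Coordinates are rounded so near-identical lookups share a cache entry."""
    key = (round(lat, 4), round(lon, 4))
    if key in _CACHE:
        return _CACHE[key]
    for provider in providers:
        result = provider(lat, lon)
        # Fall through on empty or low-quality (no locality) results.
        if result and result.get("locality"):
            _CACHE[key] = result
            return result
    return None  # all tiers failed: log and flag for manual review
```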

Geocoding accuracy band

Coordinates are accurate enough for spatial analysis and neighbourhood grouping, but they’re not survey-grade. The accuracy distribution we measure across the dataset:

  • 92% - within ±50 metres. The vast majority of stores have coordinates accurate to within ~50 metres of the true warehouse location. Sufficient for "which area is this store in" and "how far apart are these two stores".
  • 7% - within ±200 metres. A smaller band sits at ±50–200 metres, usually because the platform publishes a service-centroid coordinate rather than the precise warehouse address, or the warehouse is in a building with multiple entrances.
  • 1% - flagged for review. Roughly one percent of stores are flagged in our internal database with low-confidence coordinates - typically newer stores in tier-2 cities where the platform’s own address data is sparse, or addresses without Plus Code coverage. We don’t hide these on the site; we mark them.

Area grouping

Individual stores are grouped into areas based on their geocoded locality. An area represents a neighbourhood or commercial zone where one or more platforms operate dark stores - Mansarovar in Jaipur, Whitefield in Bangalore, Andheri West in Mumbai. Area grouping gives spatial context that individual pins don’t.

Where the geocoded locality is ambiguous (multiple "Sector 18" results within a city, for example), we use the city plus the sub-locality to disambiguate, and fall back to a coordinate-based cluster if the address signal is too weak.
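
One way to express that disambiguation rule as a grouping key (field names and the ~500 m grid-cell fallback size are illustrative assumptions):

```python
def area_key(record):
    """Grouping key for a store record. Locality alone can repeat within a
    city ("Sector 18"), so pair it with city and, when present, sub-locality.
    Falls back to a coarse coordinate grid cell when the address is too weak."""
    if record.get("locality"):
        return (record["city"], record["locality"], record.get("sub_locality", ""))
    # ~0.005 degrees of latitude is roughly 550 m: a crude cluster cell
    return (record["city"], round(record["lat"] / 0.005), round(record["lon"] / 0.005))
```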

City tier classification

Cities are classified into three tiers based on population, economic activity, and quick commerce penetration. The classification matters for analysis (worker pay scales, store density expectations, platform expansion patterns):

  • Tier 1 metro - Delhi NCR (Delhi, Noida, Gurgaon, Ghaziabad, Faridabad), Mumbai, Bangalore, Hyderabad, Chennai, Kolkata, Pune.
  • Tier 1 non-metro - Ahmedabad, Jaipur, Lucknow, Chandigarh, Kochi, Indore, and similar.
  • Tier 2 - Every other city in the dataset where any of the three platforms operates.
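
In code, the classification reduces to a lookup with a tier-2 default. The sets below contain only the cities named above; the real non-metro list is longer ("and similar"):

```python
TIER1_METRO = {"Delhi", "Noida", "Gurgaon", "Ghaziabad", "Faridabad", "Mumbai",
               "Bangalore", "Hyderabad", "Chennai", "Kolkata", "Pune"}
TIER1_NON_METRO = {"Ahmedabad", "Jaipur", "Lucknow", "Chandigarh", "Kochi", "Indore"}

def city_tier(city: str) -> str:
    """Classify a city; anything not explicitly listed is tier 2."""
    if city in TIER1_METRO:
        return "tier-1-metro"
    if city in TIER1_NON_METRO:
        return "tier-1-non-metro"
    return "tier-2"
```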

What we DON’T cover

The boundary of the dataset matters as much as what’s in it. The public site explicitly excludes:

  • Quick-commerce platforms beyond the big three. We monitor BBNow, Tata Neu Quick, JioMart Express, and Amazon Fresh internally but don’t publish them yet - coverage signals are noisier and the public dataset would be misleading.
  • Food delivery (Swiggy / Zomato regular ordering). Restaurant orders aren’t a dark-store model. Out of scope.
  • Pickup points and partner kirana stores. These appear in some platform endpoints but are not dark stores and are filtered out.
  • Closed stores (after our healthcheck flags them). Stores we believe have closed are marked isActive: false in the underlying database after two consecutive missed scrapes and removed from the public site at the next refresh. The change is logged in the changelog.
  • Salary or worker-level data. The map is about stores, not people. Per-platform pay ranges are discussed in our written reports but never tied to individual stores.

Data freshness and refresh cadence

Full data refresh: monthly, on the 1st. The current dataset has 4,081 stores across 408 cities and 2,089 areas. We aim to publish the diff (additions, removals, reclassifications) within 24 hours of the refresh; that lands on the changelog page.

  • New stores typically appear in the next monthly refresh after they open (so up to a 30-day lag).
  • Closed stores are flagged after 2 consecutive missed scrapes (~60 days) and removed at the next monthly refresh.
  • Coordinates are re-verified each refresh; stores that move are re-geocoded.
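
The closure rule above amounts to a small state machine per store record - a sketch, assuming a mutable dict per store with a `missed_scrapes` counter (field names illustrative, matching the `isActive` flag mentioned earlier):

```python
def update_liveness(store: dict, seen_this_scrape: bool) -> dict:
    """Mark a store inactive only after two consecutive missed monthly
    scrapes (~60 days); a single miss may be a transient endpoint hiccup."""
    if seen_this_scrape:
        store["missed_scrapes"] = 0
        store["isActive"] = True
    else:
        store["missed_scrapes"] = store.get("missed_scrapes", 0) + 1
        if store["missed_scrapes"] >= 2:
            store["isActive"] = False
    return store
```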

Between refreshes, the public site is a snapshot. Use the last_refreshed_at stamp on each page to judge how current the picture is.

How to report an issue

Found a wrong location, a missing store, a closure we haven’t caught, or a misclassification? Please tell us. Use the contact form with subject "Data correction", or write directly to support@quickcommercejobs.com.

A good correction includes:

  • The platform (Blinkit / Zepto / Swiggy Instamart)
  • The city or area you’re in
  • The specific address or coordinate that’s wrong
  • What you believe the correct state is (e.g. "this store closed in March 2026")
  • (Optional) a screenshot or link showing the actual state

Limitations

The honest section. Things we can’t fully fix, and that you should factor into how much weight to put on the data:

  • Coverage bias toward metros. The platforms themselves are densest in tier-1 metros, and our geocoding is most accurate there. Tier-2 city coverage is real but more variable - both because there are fewer stores and because address signals are sparser.
  • Up-to-30-day lag on new stores. Stores that opened after the most recent monthly refresh won’t appear until the next one. Same holds for store moves within a city.
  • Up-to-60-day lag on closures. We flag a store as closed only after 2 consecutive missed scrapes, which means a quietly-closed store may still appear here for up to 60 days after the closure.
  • Platform endpoint changes. Platforms periodically change their API structure. When that happens we sometimes lose a refresh cycle while we adapt. The changelog records any month where coverage was incomplete.
  • Sub-locality naming inconsistencies. Different geocoders disagree on whether "Andheri West" or "Andheri (W)" or "Andheri W" is the canonical name. We normalise but don’t always pick the most popular form.
  • Transliteration variants. Areas with multiple common Romanisations (Bengaluru / Bangalore, Mumbai / Bombay) are normalised to a canonical form, which may not match what locals call the place.
  • Snapshot, not stream. Data is point-in-time. Markets that change weekly look stable here for a month at a time.
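
For the naming-inconsistency point above, the normalisation step looks roughly like this. The sketch handles only the directional-suffix variants ("Andheri (W)" / "Andheri W" / "Andheri West"); the real table is longer and also covers transliteration variants:

```python
import re

def normalise_locality(name: str) -> str:
    """Collapse common directional-suffix variants to one canonical form."""
    s = re.sub(r"\s+", " ", name.strip())
    s = re.sub(r"\(?\b(W|West)\)?\.?$", "West", s, flags=re.IGNORECASE)
    s = re.sub(r"\(?\b(E|East)\)?\.?$", "East", s, flags=re.IGNORECASE)
    return s
```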

Want the per-store data with all the caveats above documented in machine-readable form? See the data license. Want to see what changed in the most recent refresh? The changelog has it.

Looking for dark store jobs?

Visit QuickCommerceJobs.com