To build apps serving the masses, companies must develop resilient systems that handle all the data efficiently and use it to empower their business decisions and improve the end-user experience. In such cases, the data team, with its innovative approaches to handling and processing data, does the heavy lifting.
At the Data Engineering Summit 2025, Abhinav Raghuvanshi, Zepto’s associate director of data engineering, addressed the company’s central data challenge: How can data be delivered as quickly as food? For the company that established the 10-minute delivery model, real-time visibility is not just advantageous but essential for execution, forecasting, and customer experience.
The talk, “From Warehouse to Lakehouse: The Future of Real-Time Data at Zepto,” traced how the data platform evolved from a single Redshift warehouse to a hybrid architecture powering operational analytics and application-grade real-time systems. Along the way, Zepto had to rethink ingestion, transformation, cost governance, and how it handled SQL queries for the backend.
When Redshift Wasn’t Fast Enough
Zepto’s original setup relied on Redshift as its central warehouse. It worked, until it didn’t. Query bottlenecks, stale reporting, and long wait times exposed cracks. “We gave a kind of patient-facing design to Redshift,” Raghuvanshi said. “With the volume we were getting, it was becoming impossible to serve data in near reality.”
Adding compute nodes or partitioning tables only went so far. Query collisions, a lack of separation between storage and compute, and costly I/O pushed the team to rethink the architecture. Citing an example, Raghuvanshi said that a user running a highly inefficient query on the Redshift cluster would find it painfully slow.
So began the shift to a modern architecture: S3 as the central storage layer, Kafka for streaming ingestion, and Databricks (powered by Apache Spark and Photon) for transformation and orchestration. He explained that PII columns are stripped early, business logic is layered through a structured data pipeline, and every job is encrypted and scheduled using Apache Airflow.
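The early PII-stripping step can be sketched as a simple filter applied before records land in the lake. This is a minimal illustration, not Zepto's actual pipeline; the column names and record shape are assumptions.

```python
# Hypothetical sketch of stripping PII columns early in ingestion.
# The set of PII columns and the record schema are invented for illustration.

PII_COLUMNS = {"customer_name", "phone", "email", "delivery_address"}

def strip_pii(record: dict) -> dict:
    """Drop PII columns before the record reaches the S3 storage layer."""
    return {k: v for k, v in record.items() if k not in PII_COLUMNS}

order = {
    "order_id": "ORD-1042",
    "store_id": "BLR-17",
    "phone": "98xxxxxx01",
    "basket_value": 412.0,
}
clean = strip_pii(order)
print(clean)  # PII keys such as "phone" are removed
```

In a real deployment this filter would run inside the streaming transformation layer, so raw PII never persists downstream.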
Zepto also recently adopted MongoDB to reduce latency.
Data Democratised, Governance Intact
The system isn’t just real-time, it’s real-access. Zepto supports nearly 400 analysts across business, product, and operations. To ensure usability, the engineering team built a low-code in-house framework where users can simply drop in SQL queries. “Most analysts are very comfortable with SQL queries,” Raghuvanshi explained. “We put in abstractions in terms of transformations they want to do.”
Each team operates within cost boundaries thanks to dbt catalog tags, role-based access, and job-level budget dashboards. Teams get alerted if they’re burning through compute, and reporting workloads are sandboxed to avoid collisions. “Instead of worrying about when the data would arrive,” he noted, “it gives more control to the end-users in terms of what frequencies of data they want to consume.”
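The compute-burn alerting described above could look something like the toy check below. The budgets, threshold, and function names are made up for illustration; the actual system is built on dbt catalog tags and budget dashboards.

```python
# Illustrative sketch of job-level budget alerting. Budgets, the alert
# threshold, and spend figures are invented, not Zepto's real numbers.

BUDGETS = {"ops": 500.0, "product": 300.0}  # daily compute budget (USD)
ALERT_THRESHOLD = 0.8                       # warn once 80% is consumed

def check_burn(team, spend_today):
    """Return an alert message if a team is burning through its compute budget."""
    budget = BUDGETS[team]
    if spend_today >= budget * ALERT_THRESHOLD:
        return f"{team}: {spend_today / budget:.0%} of daily compute budget used"
    return None

print(check_burn("ops", 450.0))      # fires: 90% of budget consumed
print(check_burn("product", 100.0))  # None: well under threshold
```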
The lakehouse model, underpinned by S3 and Databricks, now supports historical queries, snapshots, and time-series analytics. Snapshots can reconstruct a store’s inventory state from months ago, and they are helpful for audits, restocking predictions, or other analyses.
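Conceptually, reconstructing a past inventory state means replaying stock-change events up to a cutoff timestamp. The toy replay below illustrates the idea; the event shapes are invented, and the lakehouse does this at scale over S3-backed snapshots rather than in-memory lists.

```python
# Toy illustration of snapshot reconstruction: replay stock-change events
# up to a cutoff to recover a store's inventory at that moment in time.
from datetime import datetime

events = [
    {"ts": datetime(2025, 1, 5), "sku": "milk-1l", "delta": +50},
    {"ts": datetime(2025, 1, 6), "sku": "milk-1l", "delta": -30},
    {"ts": datetime(2025, 2, 1), "sku": "milk-1l", "delta": +40},
]

def inventory_as_of(cutoff):
    """Return SKU -> quantity as it stood at the cutoff timestamp."""
    state = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        if e["ts"] <= cutoff:
            state[e["sku"]] = state.get(e["sku"], 0) + e["delta"]
    return state

print(inventory_as_of(datetime(2025, 1, 31)))  # {'milk-1l': 20}
```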
ClickHouse for Real-Time Analytics
From a business model perspective, Zepto utilises dark stores, which serve as central hubs for items like ice cream. Maintaining optimal conditions, such as temperature, requires real-time monitoring of various metrics in milliseconds to ensure product safety and quality. Additionally, tracking delivery routes necessitates IoT data integration for real-time reporting. This immediate feedback is crucial for efficient operations, ensuring timely store visits and preventing delays.
Raghuvanshi said ClickHouse fills that role: the IoT system captures those metrics and feeds them into ClickHouse for real-time analysis, reporting, and monitoring.
ClickHouse powers Zepto’s production-facing real-time analytics, including order drop rates, traffic flow, IoT metrics from storage facilities, and even fridge temperature tracking. It supports the MergeTree family of engines (ReplicatedMergeTree, ReplacingMergeTree, and SummingMergeTree), each suited to a different workload. “If your data is immutable, feel free to just dump it into ClickHouse,” he said.
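The engine choice per workload can be sketched as a simple mapping that drives the table DDL. The table and column names below are illustrative, and the generated statement is standard ClickHouse syntax but has not been run against a live cluster here.

```python
# Sketch of mapping workload type to a ClickHouse MergeTree engine and
# generating the corresponding DDL. Table/column names are assumptions.

ENGINE_FOR_WORKLOAD = {
    "immutable_events": "MergeTree",         # append-only, e.g. IoT readings
    "mutable_latest": "ReplacingMergeTree",  # keep the latest row per key
    "pre_aggregated": "SummingMergeTree",    # roll up numeric columns on merge
}

def iot_table_ddl(workload):
    """Build an illustrative CREATE TABLE statement for fridge metrics."""
    engine = ENGINE_FOR_WORKLOAD[workload]
    return (
        "CREATE TABLE fridge_metrics (\n"
        "    store_id String,\n"
        "    ts DateTime,\n"
        "    temperature Float32\n"
        f") ENGINE = {engine}\n"
        "ORDER BY (store_id, ts)"
    )

print(iot_table_ddl("immutable_events"))
```

For immutable sensor data, plain MergeTree is the cheapest option, which matches the "just dump it in" advice quoted above.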
To simplify access, the team wrapped Flink with a SQL-friendly interface, enabling analysts to plug queries into Kafka sources without writing code. Results are routed into ClickHouse, where apps can query metrics within milliseconds.
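One way such a wrapper could work is by templating the analyst's SELECT into a streaming insert job, with the framework binding the Kafka source and the ClickHouse sink. The function and connector details below are assumptions about the shape of the interface, not Zepto's internals.

```python
# Hypothetical sketch of the SQL-friendly wrapper: the analyst supplies only
# a SELECT; the framework wires it from a Kafka source to a ClickHouse sink.

def wrap_analyst_query(select_sql, source_topic, sink_table):
    """Template an analyst's SELECT into a streaming INSERT job definition."""
    return (
        f"INSERT INTO {sink_table}\n"
        f"{select_sql.strip()}\n"
        f"/* source bound to Kafka topic '{source_topic}' by the framework */"
    )

job = wrap_analyst_query(
    "SELECT store_id, count(*) AS orders FROM orders_stream GROUP BY store_id",
    source_topic="orders",
    sink_table="rt_order_counts",
)
print(job)
```

The design point is that analysts never touch Flink job code directly; they see only the SQL surface.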
ClickHouse is not just a database; it’s a monitoring nerve centre. Store-level metrics, inventory depletion, and fulfilment rates—all now update in near real-time. “Let’s say if your milk in a store at 10 a.m. is getting exhausted,” he said, “you want the milk to be distributed in stores by 2 p.m. You would have the data in real-time.”
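The arithmetic behind the milk example is a straightforward stockout projection from the real-time depletion rate. The numbers below are invented to mirror the quote.

```python
# Toy stockout projection: given current stock and a real-time depletion
# rate, estimate hours until the shelf is empty. Figures are illustrative.

def hours_until_stockout(units_left, units_per_hour):
    """Project how many hours of stock remain at the current sales rate."""
    return units_left / units_per_hour

# 40 cartons at 10 a.m., selling 10 per hour -> empty by roughly 2 p.m.
print(hours_until_stockout(40, 10.0))  # 4.0
```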
Future Roadmap
Zepto is looking to enable more expressive queries in real-time. Tools like StarRocks and other Online Analytical Processing (OLAP) engines are being evaluated to handle complex joins without latency penalties. “It will take a year and a half for StarRocks to be ready for self-improvement,” Raghuvanshi noted.
The larger goal is to make real-time intelligence as accessible as historical reporting, without expecting every business user to become a systems engineer. With a system like Zepto’s, he noted, “Now people can focus on building more data products out to the ecosystem of individual users”.