SkaleHigh
Beauty / Retail, Data & Software Development

Building a Data Platform for Pilgrim

How we helped a $350 million consumer brand centralize data from 100+ marketplaces

Client: Pilgrim
Industry: Beauty / Retail
Service: Data & Software Development
Project Timeline: 2024-2025

Centralized Data Platform

Unified data from 100+ marketplaces to power Pilgrim's inventory management and marketing decisions

Data Pipeline, Data Warehouse, ETL

The Context

Pilgrim, a rapidly growing consumer brand valued at $350 million, wanted to build a comprehensive data pipeline to track their inventory, sales, and other business metrics across multiple platforms. Their goal was to make better-informed marketing decisions and optimize their product distribution strategy.

As a well-established brand, Pilgrim sold its products on more than 100 marketplaces, from major platforms like Flipkart, Amazon, BigBasket, and Zepto to smaller specialized retailers. This wide distribution network was a testament to the brand's success, but it also created significant data challenges.

The Challenge

Pilgrim faced multiple challenges in consolidating their data across various marketplaces:

  • Diverse Data Sources: Each marketplace had its own data format, API structure, and reporting methodology
  • API Rate Limits: Many platforms imposed strict rate limits that made data extraction difficult at scale
  • SKU Naming Inconsistencies: Products were often labeled differently across platforms, making data reconciliation complex
  • Data Type Variations: Different platforms used different data types, timestamps, and measurement units
  • Access Method Variations: Some platforms offered APIs, while others only provided dashboard access requiring custom extraction solutions

Our Approach

Discovery & Planning

We began by meeting with Rwitapa, Head of Data Analytics at Pilgrim, to understand their specific needs and pain points. After several in-depth discussions, we identified which marketplaces needed immediate attention based on revenue contribution.

Strategic Recommendations

For some marketplaces, we recommended affordable third-party tools that could address their needs. For the remainder where no suitable tools existed, we proposed building custom connectors and maintaining a centralized data warehouse.

Scope & Timeline Definition

We established clear project scope and timelines, building in buffer periods to account for unexpected challenges. This approach ensured we could deliver on our commitments while managing the complexities of each marketplace integration.

Our Solution

We implemented a comprehensive data pipeline solution that addressed Pilgrim's unique challenges:

Custom API Connectors

We developed a custom API connector for each marketplace, with rate-limiting strategies, authentication mechanisms, and error handling tailored to each platform's requirements.
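As an illustration only (this is not the production code), the rate limiting and error handling inside a connector could be sketched in Python as a token bucket combined with retries and exponential backoff. The names `TokenBucket`, `TransientError`, and `call_with_retry`, as well as the limits and delays, are assumptions made for this example; each real marketplace has its own limits and error codes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TokenBucket:
    """Allow roughly `rate` calls per second, with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Wait just long enough for the missing fraction of a token.
            self.sleep((1 - self.tokens) / self.rate)


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. HTTP 429 or 5xx)."""


def call_with_retry(fetch: Callable[[], T], bucket: TokenBucket,
                    max_attempts: int = 4, base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch` under the rate limit, backing off 1s, 2s, 4s... on failure."""
    for attempt in range(max_attempts):
        bucket.acquire()
        try:
            return fetch()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

In a sketch like this, each connector would hold its own `TokenBucket` configured to that marketplace's published limits, so a slow or strict platform never throttles the others.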

Browser Extensions

For platforms without accessible APIs, we created custom browser extensions that could securely extract data from dashboards while respecting login requirements and session management.

Data Normalization Engine

We built a robust data transformation layer that standardized SKU names, normalized data types, and harmonized timestamps across all platforms to ensure consistent analysis.
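To make the idea concrete, here is a minimal Python sketch of what such a transformation step might do: map marketplace-specific SKU labels to one canonical SKU, coerce quantities to integers, and convert local timestamps to UTC. The alias table, SKU codes, field names (`sku`, `units_sold`, `sold_at`), and date formats are all invented for illustration and are not taken from Pilgrim's catalog or the actual engine.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional


def _clean(label: str) -> str:
    """Collapse whitespace and ignore case so near-identical labels match."""
    return re.sub(r"\s+", " ", label).strip().casefold()


# Hypothetical alias table: (marketplace, cleaned label) -> canonical SKU.
SKU_ALIASES = {
    ("amazon", _clean("PLG-VITC-SERUM-30")): "VITC-SERUM-30ML",
    ("flipkart", _clean("Pilgrim Vit C Serum 30 ml")): "VITC-SERUM-30ML",
}


def canonical_sku(marketplace: str, label: str) -> Optional[str]:
    """Return the canonical SKU, or None so unknown labels can be reviewed."""
    return SKU_ALIASES.get((marketplace.casefold(), _clean(label)))


def to_utc(raw: str, fmt: str, utc_offset_minutes: int = 0) -> datetime:
    """Parse a platform-local timestamp and convert it to UTC."""
    naive = datetime.strptime(raw, fmt)
    local = naive.replace(tzinfo=timezone(timedelta(minutes=utc_offset_minutes)))
    return local.astimezone(timezone.utc)


def normalize_record(marketplace: str, raw: dict, fmt: str,
                     utc_offset_minutes: int) -> dict:
    """Turn one raw marketplace row into the shared warehouse shape."""
    return {
        "marketplace": marketplace.casefold(),
        "sku": canonical_sku(marketplace, raw["sku"]),
        "units_sold": int(raw["units_sold"]),  # some exports send strings
        "sold_at": to_utc(raw["sold_at"], fmt, utc_offset_minutes).isoformat(),
    }


row = normalize_record(
    "Flipkart",
    {"sku": "pilgrim  vit c serum 30 ML", "units_sold": "3",
     "sold_at": "15/01/2025 10:30"},
    fmt="%d/%m/%Y %H:%M",
    utc_offset_minutes=330,  # IST is UTC+5:30
)
```

Returning `None` for an unmapped label, rather than guessing, keeps unknown SKUs visible so they can be added to the alias table deliberately.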

Analytical Data Warehouse

We implemented a scalable data warehouse optimized for analytics that stored historical data and provided fast query performance for Pilgrim's reporting and analysis needs.

Historical Data Backfilling

One significant challenge was backfilling 6-12 months of historical data across all platforms. To accomplish this efficiently, we:

  • Developed parallelized query strategies to speed up data extraction
  • Implemented delta loading mechanisms to avoid redundant processing
  • Created batch processing pipelines that optimized resource utilization
  • Established data verification procedures to ensure completeness and accuracy
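The steps above can be sketched in simplified form in Python: split the backfill range into date windows (batches), skip windows that were already loaded (a basic form of delta loading), fetch the rest in parallel, and then check for days with no data. The function names, the 7-day window, and the worker count are assumptions for this example; `fetch_window` stands in for whatever a given connector actually calls.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Callable, Dict, Iterable, List, Set, Tuple

Window = Tuple[date, date]


def date_windows(start: date, end: date, days: int) -> List[Window]:
    """Split [start, end] into consecutive inclusive windows of <= `days` days."""
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def backfill(fetch_window: Callable[[date, date], object],
             start: date, end: date,
             already_loaded: Set[Window],
             days: int = 7, max_workers: int = 4) -> Dict[Window, object]:
    """Fetch only the windows not yet loaded, several at a time."""
    pending = [w for w in date_windows(start, end, days)
               if w not in already_loaded]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with `pending`.
        results = list(pool.map(lambda w: fetch_window(*w), pending))
    return dict(zip(pending, results))


def missing_days(loaded_days: Iterable[date], start: date, end: date) -> List[date]:
    """Verification step: list days in the range that have no loaded rows."""
    expected = {start + timedelta(days=i)
                for i in range((end - start).days + 1)}
    return sorted(expected - set(loaded_days))
```

In practice the worker count would be chosen together with each platform's rate limit, since parallel requests to one marketplace still share that marketplace's quota.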

Implementation Process

Phase 1: High-Priority Marketplaces

We initially focused on the marketplaces that generated most of Pilgrim's revenue, building connectors for their primary sales channels.

Key Achievements:
  • Successfully integrated 5 major platforms
  • Established core data warehouse architecture
  • Created initial data transformation pipeline

Phase 2: Secondary Marketplaces

After the success of Phase 1, Pilgrim asked us to extend the solution to additional marketplaces, each with its own technical challenges.

Key Achievements:
  • Expanded to 20+ additional marketplaces
  • Developed browser extensions for dashboard-only platforms
  • Enhanced data reconciliation capabilities

Phase 3: Stabilization & Knowledge Transfer

After development, we dedicated time to system stabilization, monitoring, bug fixes, and comprehensive knowledge transfer to Pilgrim's team.

Key Achievements:
  • Stabilized all connectors within a month
  • Provided comprehensive documentation
  • Delivered hands-on training to Pilgrim's data team
  • Established ongoing maintenance procedures

Results & Impact

100+

Marketplaces integrated into a single data platform

12 months

Of historical data successfully backfilled

30+ hours

Saved weekly on manual data consolidation

Business Impact

The centralized data platform enabled Pilgrim's marketing and inventory management teams to:

  • Make data-driven marketing decisions based on comprehensive cross-platform insights
  • Identify underperforming product lines and optimize inventory allocation
  • Generate accurate sales forecasts across all marketplaces
  • Increase marketing ROI by targeting high-performing channels
"SkaleHigh's team went beyond just building connectors; they truly understood our business challenges and built innovative solutions for problems we didn't even know how to articulate. Their approach to handling the diverse marketplace data was impressive, and they delivered a system that has transformed how we make marketing decisions."

Rwitapa Mitra
Head of Data Analytics, Pilgrim

Technologies Used

Python, AWS, Airflow, Docker, Kubernetes, PostgreSQL, BigQuery, Selenium, Chrome Extensions, ETL, Data Warehousing

Need a Similar Solution for Your Business?

We specialize in building data pipelines and analytics platforms that help businesses make better decisions. Let's discuss how we can help your organization leverage its data effectively.
