Data Engineer

Pace Wisdom Solutions

2 - 5 years

Bengaluru

Posted: 29/05/2026

Job Description

Location: Bangalore (Hybrid)

Role: Data Engineer, Business Insights

Experience Required

5-7+ years of experience in Data Engineering, Big Data, or Analytics Engineering.

Role Overview and Key Responsibilities

We are looking for a highly skilled Data Engineer to build and manage scalable data pipelines for our 4PL (Fourth-Party Logistics) Business Insights platform. The ideal candidate will design and implement robust ingestion, transformation, and analytics-ready data infrastructure that powers AI-driven business insights and operational intelligence.

  • Build end-to-end pipelines on the existing Kafka spine, using Debezium CDC and Apache Flink for streaming transformation, and support bulk ingestion from CSV and other flat-file sources.
  • Working experience with Apache Iceberg on Amazon S3.
  • Familiarity with ClickHouse for building customer dashboards and with Trino/Athena for historical queries.
  • Design, develop, and maintain scalable data pipelines for ingesting logistics and operational data into the analytics platform.
  • Strong SQL skills and experience optimizing analytical queries.
  • Familiarity with containerization and cloud-native deployments.
  • Proficiency in Python, Scala, or Java.

Data Lake & Warehouse Management

  • Manage and optimize data flow from Kafka topics into S3-based storage layers.
  • Build ETL/ELT pipelines to transform and load data into ClickHouse for high-performance analytical querying.
  • Design partitioning, indexing, and schema strategies in ClickHouse for low-latency AI and BI workloads.

AI & Analytics Enablement

  • Enable AI agents and analytics applications to efficiently query ClickHouse datasets.
  • Ensure data quality, consistency, and availability for downstream AI-driven insights.
  • Collaborate with AI/ML teams to expose optimized datasets and semantic models.

Platform Reliability & Optimization

  • Monitor and optimize pipeline performance, storage efficiency, and query latency.
  • Implement observability, alerting, and retry mechanisms for ingestion pipelines.
  • Ensure scalability, fault tolerance, and data governance best practices.

Collaboration

  • Work closely with Product teams, Business Insights teams, AI/ML engineers, and Platform engineering teams.
  • Participate in architecture discussions and contribute to long-term data platform strategy.

Required Skills & Qualifications

Technical Skills

  • Strong experience in building distributed data pipelines.
  • Hands-on expertise with:

Data Engineering Concepts

  • ETL/ELT pipeline design
  • Data modeling for analytics
  • Data partitioning and indexing strategies
  • Schema evolution and metadata management
  • Monitoring and observability

Nice to Have

  • Experience with logistics, supply chain, or 4PL platforms.
  • Exposure to AI/LLM-based analytics systems.
  • Familiarity with vector search or AI retrieval architectures.
  • Experience with dbt or modern data stack tools.
  • Knowledge of Iceberg, Delta Lake, or Parquet optimization.

Preferred Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Experience working in high-scale analytics or real-time data environments.

Success Metrics

  • Reliable real-time and batch ingestion pipelines.
  • Optimized ClickHouse performance for AI-agent querying.
  • Reduced latency for analytics and reporting workloads.
  • High data quality and pipeline uptime.
