
Data Engineer Lead

ImmersiveData.AI

5 - 7 years

Pune

Posted: 15/03/2026


Job Description: Data Engineer (Lead)


Location: Pune

Experience: 5-7 Years

Job Type: Full-Time (Hybrid)


About the Role

We are looking for an experienced Lead Data Engineer to drive a large-scale ETL modernization and migration initiative. The project focuses on transforming a legacy IBM DataStage ETL platform into a modern cloud-based data architecture using Databricks and Azure Data Factory (ADF).

The goal of this migration is to significantly enhance scalability, performance, maintainability, and cloud integration by replacing traditional ETL jobs with Spark-based distributed data processing on Databricks.

The ideal candidate will take on both a technical leadership role and hands-on development responsibilities, ensuring smooth migration, troubleshooting, and optimization of data pipelines.


Key Responsibilities


Technical Leadership

  • Lead a POD of data engineers, coordinating daily activities and sprint deliverables.
  • Work closely with clients and stakeholders to discuss project progress, blockers, and technical solutions.
  • Provide technical guidance and mentorship to team members.
  • Participate in design discussions and architecture decisions for cloud data pipelines.


Data Engineering & Development

  • Design, develop, and optimize data pipelines using Databricks (Apache Spark).
  • Translate legacy IBM DataStage ETL pipelines into modern Databricks and ADF workflows.
  • Contribute to hands-on development, debugging, and troubleshooting of data pipelines.


Performance Optimization

  • Perform SQL performance tuning and identify query bottlenecks.
  • Implement optimization techniques such as:
      ◦ Query optimization
      ◦ Partitioning
      ◦ Clustering
      ◦ Efficient data processing patterns


Pipeline Execution & Debugging

  • Manage end-to-end pipeline execution in Databricks and Spark.
  • Debug failures, analyze logs, and resolve runtime issues in distributed data pipelines.


Data Quality & Issue Resolution

  • Investigate SIT and UAT data issues and perform root cause analysis.
  • Validate data lineage and transformation logic.
  • Ensure functional parity between legacy DataStage jobs and migrated pipelines.


ADF Orchestration

  • Develop and maintain Azure Data Factory pipelines for workflow orchestration.
  • Monitor and troubleshoot ADF pipeline failures and execution issues.


Project Tracking

  • Manage project tasks and progress using JIRA.
  • Track development status, bugs, and sprint deliverables.


Required Skills


Core Data Engineering

  • Strong experience in Apache Spark (Databricks)
  • Expertise in SQL and query performance tuning
  • Experience in ETL pipeline development


Cloud & Data Platforms

  • Hands-on experience with Azure Data Factory (ADF)
  • Experience working with Databricks for large-scale data processing
  • Understanding of cloud data architecture


Legacy ETL Understanding

  • Experience or familiarity with IBM DataStage for understanding legacy logic and validating migrated pipelines.


Data Debugging & Analysis

  • Strong experience in data validation, data lineage, and troubleshooting pipeline issues


Project & Collaboration Tools

  • Experience using JIRA for Agile project tracking
  • Experience working in SIT/UAT environments


Leadership Skills

  • Experience leading data engineering teams or PODs
  • Strong client communication and stakeholder management
  • Ability to guide team members and resolve technical blockers

Nice to Have

  • Experience in large-scale ETL migration projects
  • Knowledge of data lake architectures
  • Experience with CI/CD pipelines for data engineering
  • Understanding of modern data engineering best practices
