Data Engineer Lead
ImmersiveData.AI
5 - 7 years
Pune
Posted: 15/03/2026
Job Description
Job Type: Full-Time (Hybrid)
About the Role
We are looking for an experienced Lead Data Engineer to drive a large-scale ETL modernization and migration initiative. The project focuses on transforming a legacy IBM DataStage ETL platform into a modern cloud-based data architecture using Databricks and Azure Data Factory (ADF).
The goal of this migration is to significantly enhance scalability, performance, maintainability, and cloud integration by replacing traditional ETL jobs with Spark-based distributed data processing on Databricks.
The ideal candidate will take on both technical leadership and hands-on development responsibilities, ensuring smooth migration, troubleshooting, and optimization of data pipelines.
Key Responsibilities
Technical Leadership
- Lead a POD of data engineers, coordinating daily activities and sprint deliverables.
- Work closely with clients and stakeholders to discuss project progress, blockers, and technical solutions.
- Provide technical guidance and mentorship to team members.
- Participate in design discussions and architecture decisions for cloud data pipelines.
Data Engineering & Development
- Design, develop, and optimize data pipelines using Databricks (Apache Spark).
- Translate legacy IBM DataStage ETL pipelines into modern Databricks and ADF workflows.
- Contribute to hands-on development, debugging, and troubleshooting of data pipelines.
Performance Optimization
- Perform SQL performance tuning and identify query bottlenecks.
- Implement optimization techniques such as:
  - Query optimization
  - Partitioning
  - Clustering
  - Efficient data processing patterns
Pipeline Execution & Debugging
- Manage end-to-end pipeline execution in Databricks and Spark.
- Debug failures, analyze logs, and resolve runtime issues in distributed data pipelines.
Data Quality & Issue Resolution
- Investigate SIT and UAT data issues and perform root cause analysis.
- Validate data lineage and transformation logic.
- Ensure functional parity between legacy DataStage jobs and migrated pipelines.
ADF Orchestration
- Develop and maintain Azure Data Factory pipelines for workflow orchestration.
- Monitor and troubleshoot ADF pipeline failures and execution issues.
Project Tracking
- Manage project tasks and progress using JIRA.
- Track development status, bugs, and sprint deliverables.
Required Skills
Core Data Engineering
- Strong experience in Apache Spark (Databricks)
- Expertise in SQL and query performance tuning
- Experience in ETL pipeline development
Cloud & Data Platforms
- Hands-on experience with Azure Data Factory (ADF)
- Experience working with Databricks for large-scale data processing
- Understanding of cloud data architecture
Legacy ETL Understanding
- Experience or familiarity with IBM DataStage for understanding legacy logic and validating migrated pipelines.
Data Debugging & Analysis
- Strong experience in data validation, data lineage, and troubleshooting pipeline issues
Project & Collaboration Tools
- Experience using JIRA for Agile project tracking
- Experience working in SIT/UAT environments
Leadership Skills
- Experience leading data engineering teams or PODs
- Strong client communication and stakeholder management
- Ability to guide team members and resolve technical blockers
Nice to Have
- Experience in large-scale ETL migration projects
- Knowledge of data lake architectures
- Experience with CI/CD pipelines for data engineering
- Understanding of modern data engineering best practices