Lead Data Engineer - Scala/Spark
Xebia
5 - 14 years
Bengaluru
Posted: 25/05/2026
Job Description
Job Title: Lead Data Engineer - Scala/Spark
Job Location: Bengaluru
Experience Range: 5-14 years
Notice Period: Immediate to 15 days
We are seeking a Senior Data Engineer with deep expertise in Scala-based Spark development and end-to-end
deployment of data pipelines on Kubernetes clusters, orchestrated via Airflow. The ideal candidate should have
a strong software engineering foundation, an excellent understanding of distributed systems, proficiency in
software design and modern project/code structuring, and a good understanding of CI/CD processes and
their implementation, enabling them to deliver reliable, scalable, and robust data solutions. Candidates should
have at least 6-8 years of overall experience, including a minimum of 5 years with Hadoop and Spark.
Key Responsibilities:
Design & implement robust, scalable batch & real-time data engineering solutions using Apache
Spark (Scala) & Spark Structured Streaming.
Architect well-structured Scala projects with reusable, modular, and testable codebases aligned
with SOLID and clean architecture principles & practices.
Develop, deploy & manage Spark jobs on Kubernetes clusters, ensuring efficient resource utilization,
fault tolerance, and scalability.
Orchestrate data workflows using Apache Airflow, managing DAGs, task dependencies, retries, and
SLA alerts.
Write and maintain comprehensive unit and integration tests for the pipelines and utilities developed.
Work on performance tuning, partitioning strategies, and data quality validation.
Use and enforce version control best practices (branching, PRs, code review) and continuous
integration and delivery (CI/CD) for automated testing and deployment.
Write clear, maintainable documentation (README, inline docs, docstrings).
Participate in design reviews and provide technical guidance to peers and junior engineers.
Technical Skills:
Primary:
Languages: Scala, Java
Big Data Orchestration: Airflow, Spark on Kubernetes, YARN, Oozie
Big Data Processing: Hadoop, Kafka, Spark & Spark Structured Streaming.
Experience with SOLID & DRY principles, along with good software architecture & design implementation
experience.
Advanced Scala experience (e.g. functional programming, case classes, complex data
structures & algorithms).
Proficient in developing automated frameworks for unit & integration testing.
Experience with Docker and Helm and related container technologies.
Proficient in deploying and managing Spark workloads on Kubernetes clusters.
Experience in evaluating and implementing data validation & data quality.
DevOps experience with Jenkins, Maven, GitHub, GitHub Actions, CI/CD.
