Scala Data Engineer
VMC Soft Technologies, Inc
2 - 5 years
Bengaluru
Posted: 21/02/2026
Job Description
Job Title: Scala Data Engineer
Location: Bengaluru
Experience: 8+ years of IT experience
Job Summary:
We are seeking a highly skilled and experienced Senior Scala Data Engineer to join our
dynamic data team. In this role, you will be instrumental in designing, developing, and
maintaining our next-generation data pipelines and platforms using Scala, Apache Spark,
and cloud-native technologies. You will work on challenging problems involving large-scale
data ingestion, transformation, and processing, contributing directly to our analytical
capabilities and product features.
Note: We are looking for immediate joiners only. More than 6-7 years of hands-on Scala experience is mandatory; candidates with less experience will not be considered.
Key Responsibilities:
Design & Development: Architect, build, and optimize robust, scalable, and
efficient data pipelines using Scala and Apache Spark (Spark Core, Spark SQL, Spark
Streaming).
Data Ingestion: Develop solutions for ingesting high-volume, high-velocity data
from various sources (e.g., relational databases, NoSQL databases, APIs, message
queues like Kafka, log files) into our data lake/warehouse.
Data Transformation: Implement complex data transformations, aggregations, and
feature engineering logic to prepare data for analytics, machine learning models,
and operational systems.
Performance Optimization: Identify and resolve performance bottlenecks in Spark
jobs and data pipelines, ensuring optimal resource utilization and execution times.
Data Quality & Governance: Implement data validation, monitoring, and alerting
mechanisms to ensure data accuracy, completeness, and consistency. Contribute to
data governance best practices.
Cloud Infrastructure: Leverage and optimize cloud services (e.g., AWS EMR/Glue,
Azure Databricks/Synapse, GCP DataProc/BigQuery) for data processing and
storage.
Automation & Orchestration: Design and implement automated workflows for
data pipelines using tools like Apache Airflow, AWS Step Functions, or similar.
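To give a flavor of the transformation and aggregation work described above, here is a minimal sketch in Scala. The record type, field names, and data are illustrative only, and plain Scala collections stand in for a Spark Dataset/DataFrame; in a real pipeline the same step would be a Spark `groupBy`/`agg`.

```scala
// Illustrative sketch only: Event and its fields are hypothetical,
// and standard collections stand in for a Spark Dataset.
case class Event(userId: String, action: String, bytes: Long)

object PipelineSketch {
  // Aggregate total bytes per user, mirroring a Spark step such as
  // df.groupBy("userId").agg(sum("bytes")).
  def bytesPerUser(events: Seq[Event]): Map[String, Long] =
    events.groupBy(_.userId).map { case (user, es) =>
      user -> es.map(_.bytes).sum
    }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      Event("u1", "view", 100L),
      Event("u1", "click", 50L),
      Event("u2", "view", 200L)
    )
    println(bytesPerUser(sample))
  }
}
```

In Spark the same logic runs distributed across executors, which is where the partitioning and shuffle-tuning skills listed in this posting come into play.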
Required Qualifications:
Experience: 5+ years of professional experience in data engineering, with a strong
focus on building large-scale data solutions.
Scala Expertise: Proven advanced proficiency in the Scala programming language.
Apache Spark: Deep hands-on experience with Apache Spark (Core, SQL,
Streaming) for batch and real-time data processing.
Cloud Platforms: Extensive experience with at least one major cloud provider
(AWS, Azure, or GCP) and their relevant data services (e.g., AWS S3, EMR, Glue,
Kinesis; Azure Data Lake, Databricks, Event Hubs; GCP GCS, DataProc, Pub/Sub).
Data Warehousing: Strong understanding of data warehousing concepts,
dimensional modeling (star/snowflake schemas), and ETL/ELT processes.
SQL: Expert-level SQL skills for data querying, manipulation, and optimization.
Distributed Systems: Experience working with distributed systems and
understanding of their challenges (consistency, fault tolerance, concurrency).
Version Control: Proficiency with Git and collaborative development workflows.
Nice-to-Haves:
Streaming Technologies: Experience with real-time streaming platforms like
Apache Kafka, Apache Flink, or Kinesis.
Containerization & Orchestration: Experience with Docker, Kubernetes, and
container orchestration for Spark applications.
Data Orchestration Tools: Hands-on experience with Apache Airflow, Dagster,
Prefect, or similar workflow management tools.
NoSQL Databases: Experience with NoSQL databases such as Cassandra, MongoDB,
DynamoDB, or HBase.
Data Lakehouse/Modern DW: Experience with technologies like Delta Lake,
Apache Iceberg, Snowflake, Redshift, or BigQuery.
MLOps: Familiarity with MLOps principles and supporting data pipelines for
machine learning models.
CI/CD: Experience setting up and maintaining CI/CD pipelines for data engineering
projects.
Performance Tuning: Advanced knowledge of Spark performance tuning
techniques, including memory management, shuffle optimization, and data
partitioning strategies.
Certifications: Relevant cloud (AWS Certified Data Analytics, Azure Data Engineer
Associate, GCP Professional Data Engineer) or Spark certifications.
Thanks & Regards,
Vibha Seth
Technical Recruiter
E-Mail: vibha@vmcsofttech.com
Contact: 9935984975
LinkedIn: linkedin.com/in/vibha-seth-14337b241