Assistant Manager Data Engineer - Deloitte Support Services India Private Limited
Key Responsibilities/Job Duties:
6 to 8 years of hands-on experience developing applications using the following skills.
Build and Optimize Data Pipelines: Develop, maintain, and optimize scalable ETL pipelines using PySpark, Python, and Databricks. Ensure efficient data movement and transformation across structured and unstructured data sources.
Databricks Implementation: Work on the Databricks platform to design and implement robust data solutions, including Delta Lake architecture for efficient data storage and processing.
SQL Query Development: Create and optimize complex SQL queries for analytics and reporting purposes. Collaborate with analytics teams to ensure fast and efficient query execution.
Performance Tuning: Conduct performance tuning for Spark jobs and ETL processes. Optimize cluster configurations, caching strategies, and parallelism to improve runtime efficiency and reduce costs.
Data Governance and Security: Implement data security measures, including data masking, encryption, and fine-grained access controls. Utilize Databricks Unity Catalog to manage schemas, tables, and data lineage tracking.
Collaboration Across Teams: Work closely with data architects, analysts, and scientists to understand requirements and deliver high-quality data solutions that support analytics and machine learning workflows.
Automation and CI/CD: Integrate data workflows into CI/CD pipelines using tools like Azure DevOps and Git. Automate deployment processes for data pipelines and ensure consistency in releases.
Cloud Data Platform Expertise: Leverage Azure cloud services (e.g., Azure Data Lake Storage, Azure Blob Storage) for data storage and processing. Ensure efficient resource utilization and scalability of cloud-based solutions.
Monitoring and Troubleshooting: Monitor data pipelines and troubleshoot issues to maintain data quality and reliability. Use tools like Spark UI and Azure Log Analytics for performance monitoring.
Minimum Qualifications
Education & Experience: Bachelor's degree in Computer Science, Data Engineering, or a related field. 5+ years of experience in data engineering or similar roles.
Programming Skills: Proficiency in PySpark, Python, and SQL for data processing and pipeline development.
Databricks Expertise: Hands-on experience with Databricks (preferably Azure Databricks) and Apache Spark. Knowledge of Delta Lake architecture and Spark optimization techniques.
Cloud Platform Experience: Strong understanding of Azure cloud services, including Azure Data Lake Storage and Azure Blob Storage.
Data Governance Knowledge: Familiarity with implementing data governance measures, such as data masking, encryption, and access controls.
Problem-Solving: Strong analytical and troubleshooting skills to resolve complex data pipeline issues and ensure platform reliability.
Preferred Qualifications
Certifications: Databricks Certified Data Engineer Associate or Professional. Equivalent certifications in Azure Data Engineering are a plus.
Financial Services Experience: Prior experience in regulated industries, such as financial services, with an understanding of compliance requirements for sensitive data.
Big Data Tools: Familiarity with Apache Kafka, Azure Data Factory, and BI tools like Power BI for analytics integration.
CI/CD and DevOps: Experience with implementing CI/CD pipelines for data projects and using tools like Azure DevOps or Jenkins for automated testing and deployment.
Cost Optimization: Proven ability to improve performance and reduce costs in cloud-based data solutions.
Tools & Technologies
Data Processing: PySpark, Python, SQL, Delta Lake.
Databricks Platform: Databricks Workspace, Apache Spark, Databricks SQL, Unity Catalog.
Cloud Services: Azure Databricks, Azure Data Lake Storage, Azure Blob Storage.
DevOps Tools: Azure DevOps, Git, Terraform (preferred for infrastructure-as-code).
Monitoring Tools: Spark UI, Azure Log Analytics.
Location: Hyderabad
Work Shift Timings: 11 AM to 8 PM
Education
Bachelor's degree in Computer Science, Engineering, or equivalent