
Sr. Data Engineer

Exigo Tech

2 - 5 years

Vadodara

Posted: 08/01/2026


Job Description

Exigo Tech is a Sydney-based technology solutions provider focused on three major verticals: Infrastructure, Cloud, and Applications, serving businesses across Australia. We help companies achieve operational efficiency by empowering them with technology solutions that drive their business processes.


Exigo Tech is looking for a full-time Sr. Data Engineer.


We are an ISO 27001:2022 certified organization.



Roles and Responsibilities

  • Install, configure, and manage Apache Spark (open-source) clusters on Ubuntu, including Spark master/worker nodes and Spark environment files.
  • Configure and manage Spark UI and Spark History Server for monitoring jobs, analyzing DAGs, stages, tasks, and troubleshooting performance.
  • Develop, optimize, and deploy PySpark ETL/ELT pipelines using DataFrame API, UDFs, window functions, caching, partitioning, and broadcasting.
  • Deploy PySpark jobs using spark-submit in client/cluster mode with proper logging and error handling.
  • Install, configure, and manage Apache Airflow including UI, scheduler, webserver, connections, and variables.
  • Create, schedule, and monitor Airflow DAGs for PySpark jobs using SparkSubmitOperator, BashOperator, or PythonOperator.
  • Configure and manage cron jobs for scheduling data processing tasks where needed.
  • Install, configure, and optimize Trino (PrestoSQL) coordinator and worker nodes; configure catalogs such as S3, MySQL, or PostgreSQL.
  • Maintain Linux/Ubuntu servers including services, logs, environment variables, memory usage, and port conflict resolution.
  • Design and implement scalable data architectures using Azure Data Services including ADF, Synapse, ADLS, Azure SQL, and Databricks.
  • Develop, manage, and automate ETL/ELT pipelines using Azure Data Factory (Pipelines and Mapping Data Flows).
  • Monitor, troubleshoot, and optimize data pipelines across Spark, Airflow, Trino, and Azure platforms.
  • Work with structured, semi-structured, and unstructured data across multiple data sources and formats.
  • Implement data analytics, transformation, backup, and recovery solutions.
  • Perform data migration, upgrade, and modernization using Azure and database tools.
  • Implement CI/CD pipelines for data solutions using Azure DevOps and Git.
  • Ensure data quality, governance, lineage, metadata management, and security compliance across cloud and big data environments.
  • Design and optimize data models using star and snowflake schemas; build data warehouses, Delta Lake, and Lakehouse systems.
  • Develop and rebuild reports/dashboards using Power BI, Tableau, or similar tools.
  • Collaborate with internal teams, clients, and business users to gather requirements and deliver high-quality data solutions.
  • Provide documentation, runbooks, and operational guidance.


Technical Skills:

1. Apache Spark (Open Source) & PySpark - Must

  • Apache Spark installation & cluster configuration (Ubuntu/Linux)
  • Spark master/worker setup (standalone & cluster mode)
  • Spark UI & History Server configuration and debugging
  • PySpark development (ETL pipelines, UDFs, window functions, DataFrame API)
  • Performance tuning (partitioning, caching, shuffles)
  • spark-submit deployment with monitoring and logging
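
To make the "Must" expectation concrete, below is a minimal sketch of the kind of PySpark ETL job described above. All paths, column names, and the application name are illustrative assumptions, not details from this posting; the job deduplicates records with a window function and writes partitioned Parquet.

    # etl_orders.py - minimal PySpark ETL sketch (all names are hypothetical)
    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: read raw CSV (schema inference kept simple for the sketch)
    orders = spark.read.csv("/data/raw/orders.csv", header=True, inferSchema=True)

    # Transform: keep only the latest record per order_id via a window function
    w = Window.partitionBy("order_id").orderBy(F.col("updated_at").desc())
    latest = (orders
              .withColumn("rn", F.row_number().over(w))
              .filter(F.col("rn") == 1)
              .drop("rn"))

    # Load: write Parquet partitioned by date so downstream scans stay cheap
    latest.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/orders")

    spark.stop()

A job like this would be deployed with something like spark-submit --master spark://<master-host>:7077 --deploy-mode client etl_orders.py; note that Spark's standalone cluster manager does not support cluster deploy mode for Python applications, so client mode is the usual choice there.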

2. Apache Airflow & Job Orchestration - Must

  • Airflow installation & configuration (UI, scheduler, webserver)
  • Creating and scheduling DAGs (SparkSubmitOperator, BashOperator, PythonOperator)
  • Retry logic, triggers, alerting, and log management
  • Cron job scheduling & process automation
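
As an illustration of the orchestration items above, here is a minimal Airflow 2.x DAG sketch that schedules the hypothetical PySpark job from the previous example via SparkSubmitOperator, with retry logic and failure alerting. The dag_id, file paths, and connection id are assumptions, and the operator comes from the apache-airflow-providers-apache-spark package.

    # dags/orders_etl_dag.py - minimal Airflow DAG sketch (names are hypothetical)
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    default_args = {
        "retries": 2,                        # retry a failed task twice
        "retry_delay": timedelta(minutes=5), # wait between retries
        "email_on_failure": True,            # alerting hook (requires SMTP setup)
    }

    with DAG(
        dag_id="orders_etl",
        start_date=datetime(2026, 1, 1),
        schedule="@daily",  # "schedule_interval" on Airflow versions before 2.4
        catchup=False,
        default_args=default_args,
    ) as dag:
        run_etl = SparkSubmitOperator(
            task_id="run_orders_etl",
            application="/opt/jobs/etl_orders.py",  # the PySpark script sketched earlier
            conn_id="spark_default",                # Airflow connection to the Spark master
            name="orders_etl",
        )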

3. Trino (PrestoSQL) - Must

  • Trino coordinator & worker node setup
  • Catalog configuration (S3, RDBMS sources)
  • Distributed SQL troubleshooting & performance optimization
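
For context on the catalog item above: in Trino, each catalog is a properties file under etc/catalog on the coordinator and workers (a PostgreSQL catalog, for example, sets connector.name=postgresql plus a connection URL and credentials). The sketch below then queries such a catalog from Python using the official trino client; the host, catalog, and table names are assumptions for illustration.

    # query_trino.py - minimal sketch using the "trino" Python client
    import trino

    conn = trino.dbapi.connect(
        host="trino-coordinator",  # hypothetical coordinator hostname
        port=8080,
        user="data-engineer",
        catalog="postgresql",      # defined in etc/catalog/postgresql.properties
        schema="public",
    )

    cur = conn.cursor()
    # Trino plans the query and pushes work down to the configured connector
    cur.execute("SELECT order_date, count(*) FROM orders GROUP BY order_date")
    for row in cur.fetchall():
        print(row)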

4. Azure Data Services (nice to have)

  • Azure Data Factory
  • Azure Synapse Analytics
  • Azure SQL / Cosmos DB
  • Azure Data Lake Storage (Gen2)
  • Azure Databricks (Delta, Notebooks, Jobs)
  • Azure Event Hubs / Stream Analytics

5. Microsoft Fabric (nice to have)

  • Lakehouse
  • Warehouse
  • Dataflows
  • Notebooks
  • Pipelines

6. Programming & Querying

  • Python
  • PySpark
  • SQL
  • Scala

7. Data Modeling & Warehousing

  • Star schema modeling
  • Snowflake schema modeling
  • Fact/dimension modeling
  • Data warehouse & Lakehouse design
  • Delta Lake / Lakehouse architectures
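
As a small illustration of the modeling items above, the sketch below shapes a fact table in PySpark by resolving a natural key to a dimension's surrogate key, the core move in star schema loading; all table, path, and column names are hypothetical.

    # star_schema_load.py - fact table loading sketch (names are hypothetical)
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("star_schema_load").getOrCreate()

    orders = spark.read.parquet("/data/curated/orders")    # cleaned source data
    dim_customer = spark.read.parquet("/dw/dim_customer")  # conformed dimension

    # Swap the natural key (customer_id) for the dimension's surrogate key
    # (customer_sk), then keep only foreign keys and measures - the classic
    # fact-table shape.
    fact_orders = (orders
        .join(dim_customer, on="customer_id", how="left")
        .select(
            F.col("customer_sk"),
            F.col("order_date"),
            F.col("amount").alias("order_amount"),
        ))

    fact_orders.write.mode("append").parquet("/dw/fact_orders")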

8. DevOps & CI/CD

  • Git / GitHub / Azure Repos
  • Azure DevOps pipelines (CI/CD)
  • Automated deployment for Spark, Airflow, ADF, Databricks, Fabric

9. BI Tools (nice to have)

  • Power BI
  • Tableau
  • Report building, datasets, DAX

10. Linux/Ubuntu Server Knowledge

  • Shell scripting
  • Service management
  • Logs & environment variables


Soft Skills:

  • Excellent problem solving and communication skills
  • Able to work well in a team setting
  • Excellent organizational and time management skills
  • Taking end-to-end ownership
  • Production support & timely delivery
  • Self-driven, flexible and innovative


Preferred Qualifications:

  • Microsoft Certified: Azure Data Engineer Associate (DP-203 / DP-300)
  • Knowledge of DevOps and CI/CD pipelines in Azure


Education:

  • BSc/BA in Computer Science, Engineering, or a related field


Work Location: Vadodara, Gujarat, India
