Sr. Data Engineer
Exigo Tech
2 - 5 years
Vadodara
Posted: 15/12/2025
Job Description
Exigo Tech is a Sydney-based technology solutions provider focused on three major verticals: Infrastructure, Cloud, and Applications, serving businesses across Australia. We help companies achieve operational efficiency by empowering them with technology solutions that drive their business processes.
Exigo Tech is looking for a full-time Sr. Data Engineer.
We are an ISO 27001:2022 certified organization.
Roles and Responsibilities
- Install, configure, and manage Apache Spark (open-source) clusters on Ubuntu, including Spark master/worker nodes and Spark environment files.
- Configure and manage Spark UI and Spark History Server for monitoring jobs, analyzing DAGs, stages, tasks, and troubleshooting performance.
- Develop, optimize, and deploy PySpark ETL/ELT pipelines using DataFrame API, UDFs, window functions, caching, partitioning, and broadcasting.
- Deploy PySpark jobs using spark-submit in client/cluster mode with proper logging and error handling (see the sample invocation after this list).
- Install, configure, and manage Apache Airflow including UI, scheduler, webserver, connections, and variables.
- Create, schedule, and monitor Airflow DAGs for PySpark jobs using SparkSubmitOperator, BashOperator, or PythonOperator.
- Configure and manage cron jobs for scheduling data processing tasks where needed.
- Install, configure, and optimize Trino (PrestoSQL) coordinator and worker nodes; configure catalogs such as S3, MySQL, or PostgreSQL.
- Maintain Linux/Ubuntu servers including services, logs, environment variables, memory usage, and port conflict resolution.
- Design and implement scalable data architectures using Azure Data Services including ADF, Synapse, ADLS, Azure SQL, and Databricks.
- Develop, manage, and automate ETL/ELT pipelines using Azure Data Factory (Pipelines and Mapping Data Flows).
- Monitor, troubleshoot, and optimize data pipelines across Spark, Airflow, Trino, and Azure platforms.
- Work with structured, semi-structured, and unstructured data across multiple data sources and formats.
- Implement data analytics, transformation, backup, and recovery solutions.
- Perform data migration, upgrade, and modernization using Azure and database tools.
- Implement CI/CD pipelines for data solutions using Azure DevOps and Git.
- Ensure data quality, governance, lineage, metadata management, and security compliance across cloud and big data environments.
- Design and optimize data models using star and snowflake schemas; build data warehouses, Delta Lake, and Lakehouse systems.
- Develop and rebuild reports/dashboards using Power BI, Tableau, or similar tools.
- Collaborate with internal teams, clients, and business users to gather requirements and deliver high-quality data solutions.
- Provide documentation, runbooks, and operational guidance.
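For illustration, a typical deployment for the spark-submit item above might look like the following; the master URL, file paths, and resource settings are placeholders, not values prescribed by this role:

    spark-submit \
      --master spark://spark-master:7077 \
      --deploy-mode client \
      --name daily_sales_etl \
      --conf spark.executor.memory=4g \
      --conf spark.sql.shuffle.partitions=200 \
      /opt/jobs/daily_sales_etl.py \
      >> /var/log/spark/daily_sales_etl.log 2>&1

Where scheduling falls to cron rather than Airflow, the same command can be wrapped in a script and registered in a crontab entry, e.g. 0 2 * * * /opt/jobs/run_daily_sales_etl.sh for a daily 02:00 run.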
Technical Skills:
1. Apache Spark (Open Source) & PySpark - Must
- Apache Spark installation & cluster configuration (Ubuntu/Linux)
- Spark master/worker setup (standalone & cluster mode)
- Spark UI & History Server configuration and debugging
- PySpark development (ETL pipelines, UDFs, window functions, DataFrame API)
- Performance tuning (partitioning, caching, shuffles)
- spark-submit deployment with monitoring and logging
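A minimal PySpark sketch of the development items in this group (DataFrame API, a window function, and a partitioned write); all paths and column names are invented for illustration:

    # pyspark_etl_sketch.py - illustrative only; paths and columns are placeholders
    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily_sales_etl").getOrCreate()

    # Extract: read raw CSV data
    orders = spark.read.option("header", True).csv("/data/raw/orders.csv")

    # Transform: rank each customer's orders by amount using a window function
    w = Window.partitionBy("customer_id").orderBy(F.col("amount").cast("double").desc())
    ranked = orders.withColumn("rank", F.row_number().over(w))

    # Load: write partitioned Parquet, overwriting the target path
    ranked.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/orders")

    spark.stop()

Caching, broadcast joins, and shuffle-partition tuning would then be layered onto a skeleton like this per job.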
2. Apache Airflow & Job Orchestration - Must
- Airflow installation & configuration (UI, scheduler, webserver)
- Creating and scheduling DAGs (SparkSubmitOperator, BashOperator, PythonOperator)
- Retry logic, triggers, alerting, and log management
- Cron job scheduling & process automation
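A minimal Airflow DAG sketch for this group, assuming the apache-airflow-providers-apache-spark package is installed and a spark_default connection is configured in the Airflow UI; the DAG id, schedule, and application path are placeholders:

    # airflow_dag_sketch.py - illustrative only
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    default_args = {
        "retries": 2,                          # retry logic
        "retry_delay": timedelta(minutes=5),
    }

    with DAG(
        dag_id="daily_sales_etl",
        start_date=datetime(2025, 1, 1),
        schedule_interval="@daily",            # "schedule" in newer Airflow versions
        catchup=False,
        default_args=default_args,
    ) as dag:
        run_etl = SparkSubmitOperator(
            task_id="run_spark_etl",
            application="/opt/jobs/daily_sales_etl.py",
            conn_id="spark_default",           # Spark connection defined in Airflow
            name="daily_sales_etl",
        )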
3. Trino (PrestoSQL) - Must
- Trino coordinator & worker node setup
- Catalog configuration (S3, RDBMS sources)
- Distributed SQL troubleshooting & performance optimization
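For quick verification of a Trino setup from Python, the trino client package can run distributed SQL against whatever catalogs the coordinator exposes; all connection details below are placeholders:

    # trino_query_sketch.py - illustrative only; connection details are placeholders
    import trino

    conn = trino.dbapi.connect(
        host="trino-coordinator",
        port=8080,
        user="etl",
        catalog="postgresql",   # a catalog defined on the coordinator
        schema="public",
    )
    cur = conn.cursor()
    cur.execute("SELECT order_date, count(*) FROM orders GROUP BY order_date")
    for row in cur.fetchall():
        print(row)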
4. Azure Data Services (Nice to have)
- Azure Data Factory
- Azure Synapse Analytics
- Azure SQL / Cosmos DB
- Azure Data Lake Storage (Gen2)
- Azure Databricks (Delta, Notebooks, Jobs)
- Azure Event Hubs / Stream Analytics
5. Microsoft Fabric (Nice to have)
- Lakehouse
- Warehouse
- Dataflows
- Notebooks
- Pipelines
6. Programming & Querying
- Python
- PySpark
- SQL
- Scala
7. Data Modeling & Warehousing
- Star schema modeling
- Snowflake schema modeling
- Fact/dimension modeling
- Data warehouse & Lakehouse design
- Delta Lake / Lakehouse architectures
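As a small illustration of the modeling items above, a PySpark sketch that splits a flat sales table into one dimension and one fact table (a one-dimension star schema); table and column names are hypothetical, and in a Delta Lake setup the writes would use the delta format instead of plain Parquet:

    # star_schema_sketch.py - illustrative only
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("star_schema_sketch").getOrCreate()

    sales = spark.read.parquet("/data/curated/sales")  # flat, denormalized input

    # Dimension: one row per customer, keyed by the natural customer_id
    dim_customer = sales.select(
        "customer_id", "customer_name", "region"
    ).dropDuplicates(["customer_id"])

    # Fact: measures plus the foreign key into the dimension
    fact_sales = sales.select("order_id", "customer_id", "order_date", "amount")

    dim_customer.write.mode("overwrite").parquet("/warehouse/dim_customer")
    fact_sales.write.mode("overwrite").parquet("/warehouse/fact_sales")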
8. DevOps & CI/CD
- Git / GitHub / Azure Repos
- Azure DevOps pipelines (CI/CD)
- Automated deployment for Spark, Airflow, ADF, Databricks, Fabric
9. BI Tools (Nice to have)
- Power BI
- Tableau
- Report building, datasets, DAX
10. Linux/Ubuntu Server Knowledge
- Shell scripting
- Service management
- Logs & environment variables
Soft Skills:
- Excellent problem solving and communication skills
- Able to work well in a team setting
- Excellent organizational and time management skills
- Taking end-to-end ownership
- Production support & timely delivery
- Self-driven, flexible and innovative
Nice to Have:
- Microsoft Certified: Azure Data Engineer Associate (DP-203 / DP-300)
- Knowledge of DevOps and CI/CD pipelines in Azure
Education:
- BSc/BA in Computer Science, Engineering or a related field
Work Location: Vadodara, Gujarat, India