Phenom Intro:

Our purpose is to help a billion people find the right job! Phenom is an AI-powered talent experience platform that is redefining the HR tech space. We have grown into a global organization with offices in 6 countries and over 2,000 employees. As an HR tech unicorn, innovation and creativity are in our DNA. Come help us make every talent moment Phenomenal!

Job Summary

Phenom People is looking for an experienced Data Engineer to optimize and enhance our data pipelines that power insights across the organization. This role focuses on improving performance, ensuring data quality, automating pipeline development, and enabling data sharing through scalable and secure mechanisms.

The ideal candidate will have strong expertise in PySpark, SQL optimization, and experience with data sharing platforms such as DataHub. You'll collaborate closely with data engineering, platform, and analytics teams to modernize our data ecosystem built on Airflow, Livy, EMR, Flink, Iceberg, and Snowflake.

If you're passionate about performance tuning, automation, and building resilient data systems, we'd love to hear from you.

What You'll Do

Optimize existing data pipelines built using PySpark for performance and scalability.
Review and tune SQL queries to improve execution time and resource utilization.
Design and develop new batch data pipelines that meet evolving business needs (real-time decisioning is not required; systems can tolerate up to 6 hours of delay).
Automate pipeline generation and metadata management to streamline development.
Implement data quality and validation frameworks to ensure reliability and trust in data assets.
Collaborate with teams to integrate and share data via DataHub, leveraging connectors such as SFTP and Snowflake Data Share.
Enhance observability by integrating and monitoring pipelines through Grafana and other telemetry tools.
Work closely with other engineers to improve coding practices and leverage AI-powered developer assistants like Cursor, Copilot, or Windsurf for productivity.
Participate in design reviews, documentation, and operational support for production systems.

Work Experience

What You've Done

5-8 years of experience as a Data Engineer working with large-scale distributed data systems.
Strong proficiency in Python (especially PySpark) and hands-on experience optimizing Spark jobs on EMR or similar environments.
Deep knowledge of SQL tuning and query optimization across analytical databases (preferably Snowflake).
Experience working with Airflow for workflow orchestration and Livy for Spark job management.
Exposure to Flink and Apache Iceberg for batch or incremental data processing.
Strong understanding of data quality frameworks, testing, and validation techniques.
Familiarity with DataHub or similar data catalog and sharing platforms.
Experience setting up dashboards and alerts in Grafana or equivalent monitoring tools.
Exposure to AI coding assistants (Cursor, GitHub Copilot, Windsurf) for accelerating development is a plus.
Passionate about automation, continuous improvement, and building scalable, maintainable systems.

Data Engineer

Phenom
