Company Description

Applix is building industrial AI systems for complex manufacturing and heavy-equipment operations. Our products turn messy operational data, plant-floor workflows, supply chain constraints, and business logic into deployed software that improves cost, throughput, quality, and operating profit.

We are hiring a Data Engineer to support the build and deployment of a recommendation system.

This is not a dashboard-only role. This is a hands-on data engineering role for someone who can work with messy ERP, operations, shop-floor, logistics, pricing, and finance data and turn it into reliable product infrastructure.

Role Description

As a Data Engineer, you will own the data foundation. You will work closely with the Forward Deployed Engineer, Optimization Engineer, and customer IT/operations teams to extract, clean, model, validate, and maintain the data needed to power optimization recommendations.

The system will rely on data such as component inventory, recovery yield history, reconditioning cost, turnaround time, freight costs, scrap pricing, demand signals, facility capability profiles, and benchmark pricing.

Your job is to make sure the optimizer is not starved, misled, or slowed down by bad data.

What You Will Do

Own data extraction, ingestion, transformation, and quality for customer operational systems.
Build clean data pipelines from systems such as ERP, shop-floor databases, XWheel-style operational systems, spreadsheets, pricing feeds, logistics data, and finance reports.
Create reliable staging, curated, and analytics-ready tables for optimization and reporting.
Work with the Forward Deployed Engineer to understand operational meaning behind fields, IDs, statuses, facilities, component types, cost buckets, and timestamps.
Identify and close data gaps around recovery yield, reconditioning cost, turnaround time, freight matrix, scrap netback, demand signal, and facility capability.
Build data quality checks for completeness, freshness, consistency, duplicates, outliers, and broken joins.
Create data dictionaries and lineage documentation so the team knows what each field means and whether it can be trusted.
Support the Optimization Engineer with clean model inputs for routing, recovery decisions, scrap timing, capacity, cost, and revenue calculations.
Build daily/weekly refresh workflows for shadow-run validation.
Support dashboards and reporting for recommendation performance, operator overrides, OPACC impact, data readiness, and model accuracy.
Work directly with customer IT and business teams to debug access issues, broken extracts, missing data, and inconsistent business logic.

What We Are Looking For

Must Haves

3-7+ years of experience in data engineering, analytics engineering, or backend data systems.
Strong SQL skills.
Strong Python skills for data processing, validation, automation, and pipeline development.
Experience building ETL/ELT pipelines from messy real-world business systems.
Experience with data warehouses such as Snowflake, Databricks, BigQuery, Redshift, or similar.
Ability to model operational data into clean, usable tables.
Experience with data quality checks, reconciliation, and debugging.
Comfortable working with incomplete, inconsistent, and poorly documented datasets.
Ability to work with business stakeholders to understand what data actually means.
Strong ownership mindset. You should be willing to chase down missing fields, broken joins, wrong assumptions, and bad source data.

Strong Plus

Experience with manufacturing, supply chain, logistics, ERP, MES, aftermarket parts, reconditioning, or heavy equipment data.
Experience with Snowflake specifically.
Experience with Power BI, dbt, Airflow, Dagster, Prefect, or similar orchestration/modeling tools.
Experience integrating external pricing feeds, market data, or procurement data.
Experience supporting optimization, forecasting, routing, scheduling, inventory, pricing, or decision-support systems.
Experience working with data from multiple facilities, plants, warehouses, or business units.
Comfort with APIs, file-based integrations, SFTP, SQL Server, Oracle, Postgres, and Excel-heavy environments.

What Success Looks Like

Within 1 week, you have mapped the available data sources, validated access, identified critical gaps, and created an initial data readiness assessment.
Within 2-3 weeks, you have built clean, repeatable pipelines for the first version of the optimizer.
Within 5 weeks, you are supporting shadow-run validation with refreshed data, quality checks, and actuals comparison.
By week 6, the team can confidently explain which recommendations are working, where the data is weak, and what needs to improve before scaling.
