Manager - ML Delivery & Data Operations
Mechademy
5 - 10 years
Gurugram
Posted: 19/02/2026
Job Description
Mechademy's ML operations are at an inflection point. We've built internal tools that can create production-grade machine learning models in under 30 minutes, even from years of industrial sensor data. The delivery engine exists. Now we need someone to run the factory.
We're looking for a Manager, ML Delivery & Data Operations with 5-7+ years of experience to own our ML model lifecycle, client data onboarding, and operational excellence. You'll work directly with the Director of Data Science to free him from 20-25 hours/week of operational work while scaling ML model production to 30+ models daily by 2027.
Our clients include Berkshire Hathaway, Chevron, SM Energy, and Freeport LNG. When we say "zero-defect execution," we mean it: mistakes aren't an option when you're monitoring billion-dollar industrial assets.
This role is 50% operations management and 50% hands-on execution initially, shifting to 70% management as the team scales.
Operations Management & Process Excellence (30%)
- Triage incoming requests (model creation, data onboarding, ad-hoc analyses) and distribute work across the team
- Establish SLAs for ML operations and data operations: define what "good" looks like and hold the team to it
- Build processes, SOPs, and automation to reduce the current 80% manual operational burden by 40%+
- Capacity planning: scale from current pace to 30+ models daily by 2027
- Identify operational bottlenecks and implement systematic solutions
- Free the Director from operational firefighting, enabling 90% strategic focus
ML Model Lifecycle Management (25%)
- Use our internal AutoML tools to create regression models for clients (training takes <30 min per model)
- Validate model quality: you need to understand feature engineering, feature selection, and evaluation metrics (not just accuracy, but also residuals, drift, and business-relevant metrics), and know whether a model is good enough to ship to Chevron
- Deploy models to production environments and monitor for drift and degradation
- Manage model retraining schedules and lifecycle
- Build automation for model monitoring (currently manual scripts)
- Transition from 80% manual ML ops to automated, scalable processes
Client Data Onboarding & Quality Assurance (25%)
- Lead client dataset onboarding from raw IoT sensor data to ML-ready state
- Prepare data for ML model training using our AutoML platform
- Write and optimize SQL queries to inspect, transform, and validate client data
- Implement rigorous DQA workflows: type checks, missingness detection, outlier flagging, reconciliation
- Partner with Customer Success, Product, and Engineering to resolve data blockers
- Ensure zero defects in client data entering ML pipelines
Team Leadership & Hiring (20%)
- Directly manage 2-3 people initially, grow team to 6-7 over 12-18 months
- Conduct weekly 1:1s, performance reviews, career development planning
- Hire and onboard ML/Data Ops Specialists with Director approval
- Create SOPs, training materials, and knowledge transfer processes
- Foster culture of rigor, craftsmanship, and zero-defect execution
- 5-7+ years in data operations, analytics delivery, ML operations, analytics engineering, or similar operational roles
- 2+ years with direct team management responsibility (not just tech lead)
- Strong proficiency in Python (Pandas, NumPy, Polars); production-quality code, not just notebooks
- Write optimized SQL queries for large datasets; query tuning, window functions, CTEs
- Solid understanding of ML concepts: you should know what feature engineering and feature selection are, why models are created, how to evaluate whether a model is performing well, and what deployment means in practice. You don't need to design algorithms, but you need to look at a model's output and know whether it's good enough to ship.
- Data validation, cleaning, anomaly detection, automated DQ workflows
- Scripting for process automation, scheduling, orchestration
- Demonstrated track record of building processes from scratch: SOPs, automation, SLAs, capacity planning
- Process-driven mindset: you see a manual process and instinctively ask "how do I automate this?"
- Comfortable starting hands-on and evolving to management as the team scales
- Experience with AutoML platforms or ML lifecycle management tools (MLflow, Ray, Kubeflow)
- Experience with orchestration tools: Airflow, Prefect, or Dagster
- Track record of reducing manual operational burden by 40%+ through automation
- Experience scaling operations from low volume to high volume (10x+ growth)
- Client data onboarding experience: working with messy, real-world external data
- ML frameworks awareness (scikit-learn, XGBoost)
- Statistical methods for outlier detection
- Startup or high-growth environment experience
- Languages: Python, SQL
- ML Operations: AutoML platforms, model deployment, monitoring, drift detection
- Data Tools: Pandas, NumPy, Polars, SQL databases
- Automation: Scripting, scheduling, orchestration workflows
- Process Tools: Git, Jupyter, SOPs, documentation
- Cloud Platforms: AWS (S3, data storage)
- Nice-to-Have: MLflow, Ray, Dagster, Airflow, Apache Iceberg
- Bachelor's degree in Engineering, Computer Science, Mathematics, Statistics, Data Science, or equivalent
- Experience scaling ML production from low volume to high volume (10x+ growth)
- Familiarity with industrial IoT, sensor data, or time-series data
- Experience managing both data engineering and ML operations teams
- Client data onboarding from external/enterprise sources (not just internal datasets)
- Track record building operational automation that reduces manual work 40%+
- Hands-on experience with distributed ML systems (Ray, Spark)
First 30 Days: Shadow current workflows, map every operational task the Director handles, begin handling daily triage independently.
First 90 Days: Fully own 100% of operational workload. Director's operational time drops from 40% to <15%. Establish SLAs and tracking for all requests.
First 6 Months: Manual operational burden reduced by 40%+. Team scaled to 4-5 with clear SOPs for every core workflow. ML model production visibly on trajectory toward 30+ daily.
