Unico Connect is an AI-first product engineering agency. We ship custom software, mobile apps, web platforms, and AI solutions for clients across the USA, Europe, UK, Australia, Singapore, and India. We are hiring an MLOps Engineer to own the model training, fine-tuning, and production infrastructure behind the AI systems we ship for clients.

The mandatory requirement for this role is hands-on production experience in training and/or fine-tuning models. We are not looking for engineers who only deploy pre-trained models or call APIs. You must have personally trained or fine-tuned at least one model, whether classical ML or LLM, that went to production for a real workload. DevOps and infrastructure depth are equally essential, since you will own the pipelines, GPUs, deployment, and monitoring around the models you build.

Responsibilities

Model training and fine-tuning (primary): Train and fine-tune models for client engagements, including LLMs (LoRA, QLoRA, PEFT, full fine-tune where required), classical ML models, and multi-modal models, as the use case demands. Manage GPU jobs, distributed training where applicable, hyperparameter optimisation (Ray Tune, Optuna), and evaluation against client-defined success criteria.
Training pipelines and reproducibility: Build versioned, reproducible pipelines covering data (DVC, LakeFS, or equivalent), code, and configs. Run experiment tracking with MLflow or W&B. Implement data validation as code (Great Expectations, Soda, or Pandera).
Model deployment and serving: Package and deploy trained models on AWS (SageMaker, ECS or EKS) or equivalent. Use vLLM, TGI, BentoML, or Triton for LLM serving where required. Own inference scaling, latency, and cost per inference.
RAG and GenAI infrastructure: Build RAG pipelines, vector database integrations (Pinecone, Weaviate, Qdrant, pgvector, or OpenSearch), embedding workflows, and prompt versioning.
Monitoring and evaluation: Set up dashboards and alerts for model performance, data and concept drift, inference health, and cost. Run evaluation harnesses (Ragas, DeepEval, Promptfoo) in CI and maintain golden datasets.
AI security basics: Apply guardrails (Llama Guard, NeMo Guardrails), PII redaction, prompt hardening, and secret management within the OWASP LLM Top 10 framework.
CI/CD and infrastructure as code: Build CI/CD pipelines and Terraform modules across model, infrastructure, and application layers, with security and evaluation gates.
Troubleshooting and support: Debug production issues, optimise inference performance, and support client delivery teams. Participate in the on-call rotation as the team grows.

Requirements

Hands-on model training and/or fine-tuning in production (mandatory). Must have personally trained or fine-tuned at least one model that shipped to production for a real workload (classical ML retraining at scale, full fine-tune, LoRA, QLoRA, or PEFT). POCs, coursework, and research-only work do not qualify.
3-5 years of MLOps or ML engineering experience with strong DevOps fundamentals.
AWS at depth: SageMaker, ECS or EKS, S3, Lambda, Step Functions, CloudWatch, IAM. Equivalent Azure or GCP experience will be considered if you are willing to work in AWS.
Strong Python skills, including comfort reading and debugging ML code. Working knowledge of PyTorch or TensorFlow at the infrastructure level (packaging, optimisation, quantisation).
Docker, Kubernetes, Terraform, and at least one CI/CD platform (GitHub Actions, Jenkins, GitLab CI, or Azure DevOps).
MLflow or W&B for experiment tracking, one pipeline orchestrator (Airflow, SageMaker Pipelines, or Kubeflow), and one model serving framework (vLLM, TGI, BentoML, Triton, or TorchServe).
Practical RAG experience: at least one vector database and one LLM framework (LangChain or LlamaIndex) in production.
Working knowledge of LLM security: prompt injection, PII handling, secrets management, and OWASP LLM Top 10.
