Senior AI/ML Engineer

AI System Design, LLMOps & Production AI Infrastructure

Unico Connect | Mumbai (On-site) | Full-time | 5–7 years

About the Role

Unico Connect is an AI-first product engineering agency building custom software, mobile apps, and AI solutions for clients across the USA, Europe, UK, Australia, Singapore, and India. We are hiring a Senior AI/ML Engineer who can own both the application layer and the operational layer of the AI systems we ship, from agent design and RAG pipelines to deployment, monitoring, security, and cost control.

The mandatory requirements for this role are end-to-end ownership of a complex production AI system and direct ownership of its infrastructure layer. You will own the architectural direction of AI systems for client engagements, design agent and RAG architectures, stand up the MLOps and LLMOps infrastructure that keeps them running in production, and mentor mid-level engineers. This is a deeply hands-on role with direct accountability for architecture, evaluation, cost, and reliability across engagements.

Responsibilities

Agentic system design: Design and ship agentic AI systems using LangGraph, CrewAI, AutoGen, LlamaIndex Agents, MCP, or custom orchestration.

RAG architecture: Architect non-trivial RAG pipelines with chunking strategies, hybrid search, reranking, and structured output validation.

Model selection and applied AI: Work across LLM applications, classical ML, and multi-modal AI as engagements demand. Select the right tool based on cost, scale, and client requirements.

MLOps and LLMOps architecture: Define reproducible training and inference pipelines, model registries, experiment tracking, feature stores, and CI/CD across data, model, prompt, and evaluation layers.

Model deployment and serving: Own model deployment and serving on managed services (Bedrock, SageMaker, Azure ML, Vertex AI) or self-hosted stacks (vLLM, TGI, Triton, KServe, BentoML). Justify the choice on cost, scale, and security grounds.

Cloud infrastructure: Own production cloud infrastructure on AWS, Azure, or GCP, covering compute, storage, networking, IAM, secrets, GPU scheduling, and cost.

Evaluation and quality gates: Build the evaluation stack, including golden datasets, regression suites, LLM-as-judge, hallucination detection, faithfulness scoring, and bias/fairness checks. Gate releases on quality metrics.

Prompt engineering and versioning: Treat prompts as software artefacts, with version control, A/B testing, regression gates, and rollback. No prompt edits in production without testing.

Observability and monitoring: Build production observability for AI systems, including distributed tracing, structured logging, prompt and model metadata tags, latency percentiles, and error budgets. Tooling includes OpenTelemetry, Prometheus, Grafana, and Langfuse.

AI security and compliance: Implement guardrails, prompt-injection defences, output validation, PII detection, audit logging, IAM hygiene, network isolation, and secrets management. Support client compliance evidence (SOC 2, HIPAA, GDPR, DPDP Act) and maintain audit trails for inputs, outputs, model versions, and access.

Mentorship and code quality: Mentor mid-level and junior engineers through code reviews, design reviews, and pairing on hard problems.

AI-assisted development: Use Claude, Cursor, and similar tools day-to-day. Validate AI-generated code before it ships.

Requirements

End-to-end ownership of a complex production AI system (mandatory). Must have personally built and shipped at least one complex production AI system (agentic, multi-step, or RAG at scale) with full ownership of architecture, evaluation, cost, and operations. POCs and demos do not qualify.

Production infrastructure ownership for an AI system (mandatory). Must have personally owned the infrastructure layer of an AI system in production, covering deployment pipeline, monitoring, and cost. Wrapping a managed service without owning these layers does not qualify.

5–7 years of professional software or AI engineering experience, with at least 3 years on LLM applications, AI/ML engineering, or production AI infrastructure.

Strong Python: type hints, async, FastAPI, packaging, and testing. Comfortable reading and debugging ML code in PyTorch or TensorFlow.

Hands-on production work with at least three of: OpenAI, Anthropic Claude, Google Gemini, or self-hosted open-weight models (vLLM, Ollama, Together, Replicate).

Production experience with at least one agent framework (LangGraph, CrewAI, AutoGen, LlamaIndex Agents) or a hand-rolled equivalent.

Production experience with RAG, embeddings, and vector databases (Pinecone, Weaviate, Qdrant, pgvector, Chroma, OpenSearch), including reranking.

Hands-on with at least one evaluation framework (Langfuse, LangSmith, Promptfoo, Ragas, DeepEval), having built harnesses that gate releases.

Production experience with experiment tracking (MLflow, W&B, Neptune) and model registry workflows.

Production observability stack: OpenTelemetry, Prometheus, Grafana, ELK, or equivalent. Has set up dashboards and alerting for production AI systems.

Hands-on cloud experience with at least one of AWS, Azure, or GCP at production depth, with willingness to ramp to a second cloud.

Container and IaC stack: Docker, Kubernetes (including GPU scheduling and HPA), Terraform or equivalent IaC, and one CI/CD platform (GitHub Actions, GitLab CI, Jenkins, Azure DevOps).

Demonstrated cost discipline on production AI features: specific numbers, not just principles.

Security awareness for AI systems: hands-on with at least one guardrail library and one PII detection tool.

Excellent written and spoken English. Confident defending architectural choices in design reviews and client-facing conversations.

Nice to have: Fine-tuning experience (LoRA, QLoRA, PEFT, full fine-tune); multi-tenant ML platform experience (per-tenant cost allocation, isolation, compliance tiering); MCP server authoring or contribution; distributed training (DeepSpeed, FSDP, Ray Train) and GPU cluster management; cloud certifications (AWS ML Specialty, Azure AI Engineer, GCP ML Engineer); prior agency, consulting, or compliance audit involvement (SOC 2, HIPAA, GDPR).
