Senior ML Engineer - GenAI & ML Systems

TraceLink

5 - 10 years

Pune

Posted: 31/01/2026

Job Description

Senior ML Engineer - GenAI & Agentic ML Systems


About the Role

We are seeking a highly experienced Senior ML Engineer (GenAI & ML Systems) to lead the design, architecture, and implementation of advanced agentic AI systems within our next-generation supply chain platforms.

This role is hands-on and execution-focused. You will design, build, deploy, and maintain large-scale multi-agent systems capable of reasoning, planning, and executing complex workflows in dynamic, non-deterministic environments. You will also own production concerns, including context management, knowledge orchestration, evaluation, observability, and system reliability.


This position is ideal for a strong ML Engineer or Software Engineer with deep practical exposure to GenAI, data science, and modern ML systems, who is comfortable working end-to-end, from architecture through production deployment. Experience in life sciences supply chain or other regulated environments is a strong plus.


Key Responsibilities

  • Architect, implement, and operate large-scale agentic AI / GenAI systems that automate and coordinate complex supply chain workflows.
  • Design and build multi-agent systems, including agent coordination, planning, tool execution, long-term memory, feedback loops, and supervision.
  • Develop and maintain advanced context and knowledge management systems, including:
      • RAG and Advanced RAG pipelines
      • Hybrid retrieval, reranking, grounding, and citation strategies
      • Context window optimization and long-horizon task reliability
  • Own the technical strategy for reliability and evaluation of non-deterministic AI systems, including:
      • Agent evaluation frameworks
      • Simulation-based testing
      • Regression testing for probabilistic outputs
      • Validation of agent decisions and outcomes
  • Fine-tune and optimize LLMs/SLMs for domain performance, latency, cost efficiency, and task specialization (strong plus).
  • Design and deploy scalable backend services using Python and Java, ensuring production-grade performance, security, and observability.
  • Implement AI observability and feedback loops, including agent tracing, prompt/tool auditing, quality metrics, and continuous improvement pipelines.
  • Apply and experiment with reinforcement learning or iterative improvement techniques within GenAI or agentic workflows where appropriate.
  • Collaborate closely with product, data science, and domain experts to translate real-world supply chain requirements into intelligent automation solutions.
  • Guide system architecture across distributed services, event-driven systems, and real-time data pipelines using cloud-native patterns.
  • Mentor engineers, influence technical direction, and establish best practices for agentic AI and ML systems across teams.


Required Qualifications

  • 6+ years of experience building and operating cloud-native SaaS systems on AWS, GCP, or Azure (minimum 5 years with AWS).
  • Strong ML Engineer / Software Engineer background with deep practical exposure to data science and GenAI systems.
  • Expert-level, hands-on experience designing, deploying, and maintaining large multi-agent systems in production.
  • Proven experience with advanced RAG and context management, including memory, state handling, tool grounding, and long-running workflows.
  • 6+ years of hands-on Python experience delivering production-grade systems.
  • Practical experience evaluating, monitoring, and improving non-deterministic AI behavior in real-world deployments.
  • Hands-on experience with agent frameworks such as LangGraph, AutoGen, CrewAI, Semantic Kernel, or equivalent.
  • Solid understanding of distributed systems, microservices, and production reliability best practices.


Big Plus / Preferred Qualifications

  • Hands-on experience fine-tuning LLMs or SLMs for domain-specific tasks (training, evaluation, deployment).
  • Experience designing and deploying agentic systems in supply chain domains (logistics, manufacturing, planning, procurement).
  • Strong command of knowledge organization techniques, including RAG, Advanced RAG, hybrid search, and reranking.
  • Experience applying reinforcement learning, reward modeling, or iterative optimization in GenAI workflows.
  • Familiarity with Java and JavaScript/ECMAScript.
  • Experience deploying AI solutions in regulated or enterprise environments with governance, security, and compliance requirements.
  • Knowledge of life sciences supply chain or regulated industry ecosystems.


Who You Are

  • A hands-on technical leader who moves seamlessly between architecture and implementation.
  • A builder who values practical, production-ready solutions over prototypes.
  • Comfortable designing systems with probabilistic and emergent behavior.
  • Passionate about building GenAI systems that are reliable, observable, explainable, and scalable.
  • A clear communicator who can align stakeholders and drive execution across teams.
