Artificial Intelligence Engineer
Shunya Labs
5-10 years
Gurugram
Posted: 12/02/2026
Job Description
About Us
Shunya Labs is building the Voice AI Infrastructure Layer for enterprises, powering speech intelligence, conversational agents, and domain-specific voice applications across industries. Born from deep work in mental-health AI and built for global enterprise scale, our stack combines state-of-the-art ASR/TTS models with an open-weights philosophy, driving accuracy, privacy, and scalability.
About the Role
We're seeking an AI Systems Engineer with 5-10 years of experience who thrives at the intersection of AI model optimization, infrastructure engineering, and applied research.
You will evaluate, host, and optimize a wide range of AI models spanning ASR, LLMs, and multimodal systems, and build the orchestration layer that powers scalable, low-latency deployments.
This is a role for someone who's comfortable navigating ambiguity, researching emerging AI methods, and translating client requirements into robust, production-ready solutions.
You'll work across the full stack, from GPU inference tuning to React-based control dashboards, building a resilient and scalable AI delivery platform.
Key Responsibilities
AI Model Evaluation & Optimization
- Evaluate, benchmark, and optimize AI models (speech, text, vision, multimodal) for latency, throughput, and accuracy.
- Implement advanced inference optimizations using ONNX Runtime, TensorRT, quantization, and GPU batching.
- Continuously research and experiment with the latest AI runtimes, serving frameworks, and model architectures.
- Develop efficient caching and model loading strategies for multi-tenant serving.
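To give a flavor of the caching work above, here is a minimal sketch of an LRU-bounded model cache for multi-tenant serving; `load_fn` and the capacity are placeholders standing in for real model loading and memory budgeting:

```python
from collections import OrderedDict


class ModelCache:
    """Keep at most `capacity` models resident, evicting the least
    recently used one. `load_fn` is a hypothetical loader callback."""

    def __init__(self, load_fn, capacity=2):
        self.load_fn = load_fn
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, model_id):
        if model_id in self._cache:
            self._cache.move_to_end(model_id)  # mark as recently used
            return self._cache[model_id]
        model = self.load_fn(model_id)  # cache miss: load the model
        self._cache[model_id] = model
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return model
```

In a real deployment the eviction policy would typically also account for per-model GPU memory footprint rather than a flat entry count.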
AI Infrastructure & Orchestration
- Design and develop a central orchestration layer to manage multi-model inference, load balancing, and intelligent routing.
- Build scalable, fault-tolerant deployments using AWS ECS/EKS, Lambda, and Terraform.
- Use Kubernetes autoscaling and GPU node optimization to minimize latency under dynamic load.
- Implement observability and monitoring (Prometheus, Grafana, CloudWatch) across the model-serving ecosystem.
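As one illustration of the routing logic this layer involves, a least-loaded strategy can be sketched in a few lines; the replica names here are placeholders, and a production router would also handle health checks and failover:

```python
class LeastLoadedRouter:
    """Route each request to the replica with the fewest
    in-flight requests (a simple load-balancing heuristic)."""

    def __init__(self, replicas):
        # map replica name -> current in-flight request count
        self._load = {r: 0 for r in replicas}

    def acquire(self):
        """Pick the least-loaded replica and count the request against it."""
        replica = min(self._load, key=self._load.get)
        self._load[replica] += 1
        return replica

    def release(self, replica):
        """Mark a request on `replica` as finished."""
        self._load[replica] -= 1
```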
DevOps, CI/CD & Automation
- Build and maintain CI/CD pipelines for model integration, updates, and deployment (GitHub Actions, CodePipeline, etc.).
- Manage Dockerized environments, version control, and GPU-enabled build pipelines.
- Ensure reproducibility and resilience through infrastructure-as-code and automated testing.
Frontend & Developer Tools
- Create React/Next.js-based dashboards for performance visualization, latency tracking, and configuration control.
- Build intuitive internal tools for model comparison, experiment management, and deployment control.
- Utilize Cursor, VS Code, and other AI-powered development tools to accelerate iteration.
Client Interaction & Solutioning
- Work closely with clients and internal stakeholders to gather functional and performance requirements.
- Translate abstract business needs into deployable AI systems with measurable KPIs.
- Prototype quickly, iterate with feedback, and deliver robust production systems.
Research & Continuous Innovation
- Stay on top of the latest AI research and model releases (OpenAI, Anthropic, Hugging Face, Meta, etc.).
- Evaluate emerging frameworks for model serving, fine-tuning, and retrieval (LangChain, LlamaIndex, GraphRAG, etc.).
- Proactively identify and implement performance or cost improvements in the model serving stack.
- Share learnings and contribute to the internal AI knowledge base.
Ambiguous Problem Solving
- Work effectively in undefined problem spaces, identifying optimal paths forward through experimentation.
- Break down high-level goals into actionable technical strategies.
- Balance trade-offs between accuracy, latency, and cost while innovating under uncertainty.
Required Skills
- Strong proficiency in Python, TypeScript/JavaScript, Bash, and modern software development practices.
- Deep understanding of Docker, Kubernetes, Terraform, and AWS (ECS, Lambda, S3, CloudFront).
- Experience with inference optimization (ONNX, TensorRT, quantization, batching).
- Proven ability to design and scale real-time inference pipelines.
- Experience building and maintaining CI/CD pipelines and monitoring systems.
- Hands-on experience with React/Next.js or similar frameworks for dashboard/UI development.
- Strong grasp of API design, load balancing, and GPU resource management.
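The batching skill referenced above can be illustrated with a minimal micro-batching sketch: drain up to `max_batch` queued requests, waiting at most `max_wait_s` so a single straggler doesn't stall the batch (the parameter values are illustrative defaults, not tuned figures):

```python
import queue
import time


def collect_batch(q, max_batch=8, max_wait_s=0.01):
    """Drain up to max_batch items from q, waiting at most max_wait_s
    overall; returns whatever arrived within the window."""
    batch = []
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch:
        timeout = deadline - time.monotonic()
        if timeout <= 0 and batch:
            break  # window elapsed and we have something to run
        try:
            batch.append(q.get(timeout=max(timeout, 0.001)))
        except queue.Empty:
            break  # nothing more arrived in time
    return batch
```

The collected batch would then be padded/stacked and sent through the model in one forward pass, trading a small latency bound for much higher GPU throughput.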
Nice to Have
- Experience with LangChain, LlamaIndex, GraphRAG, vector databases (e.g., FAISS), or graph databases (e.g., Neo4j).
- Familiarity with speech processing models (Whisper, Silero, NeMo, etc.).
- Prior work with serverless inference or edge AI architectures.
- Knowledge of data pipelines, model versioning, and MLOps best practices.
Soft Skills
- Excellent problem-solving in ambiguous, evolving environments.
- Strong ability to research, self-learn, and prototype emerging AI technologies.
- Confident communicator who can translate technical findings to business impact.
- Ownership mindset with a collaborative, solution-oriented approach.
