Technical Architect

Impetus

2 - 5 years

Noida

Posted: 17/02/2026

Job Description

Responsibilities



  • Core AI/ML Fundamentals
    • Solid understanding of AI/ML concepts including:
      • Classification, regression, neural networks
      • OCR and transcription systems
      • Audio/Video processing and multimodal learning
  • OCR, Transcription & Audio/Video Intelligence
    • Implement specialized models for:
      • High-accuracy document OCR
      • Real-time audio transcription
    • Architect deep learning pipelines for audio/video analysis and generation.
    • Integrate multimodal models (e.g., LLaVA, Whisper) into broader GenAI systems.
  • Generative AI & LLM Expertise
    • Strong understanding of:
      • Generative AI techniques
      • Transformer architectures
      • RAG (Retrieval-Augmented Generation) pipelines
      • Modern LLM ecosystems
    • Hands-on experience with:
      • LLM parameter handling and model selection
      • Scaling strategies and performance optimization
    • Expertise in:
      • Prompt engineering and instruction tuning
      • Prompt tuning and optimization for high-quality outputs
    • Familiarity with evaluation frameworks covering:
      • Quality, grounding, accuracy, safety
      • Latency and cost analysis
      • Governance and compliance requirements
  • Agentic AI Systems
    • Experience designing and building agentic AI systems including:
      • Multi-agent orchestration
      • Tool-use workflows
      • Autonomous task execution
    • Design long-term memory architectures:
      • Vector-based memory systems
      • Graph-based memory for complex, persistent context
  • Model Training, Fine-Tuning & Optimization
    • Knowledge of fine-tuning approaches:
      • LoRA, QLoRA, supervised fine-tuning
    • Experience with model compression techniques:
      • Quantization, distillation
    • Familiarity with performance-level tooling:
      • CUDA, Triton, or specialized custom kernels
    • Design AI systems capable of efficiently handling and routing multiple user requests simultaneously, including:
      • Scalable request handling
      • Load-balanced inference
      • Multi-tenant model utilization
      • Caching and prioritization strategies
  • Infrastructure, Pipelines & Deployment
    • Oversee:
      • Data pipeline integration
      • Training workflows
      • ML CI/CD processes
    • Strong understanding of:
      • GPU/compute requirements
      • Cost-efficient deployment strategies
    • Experience designing and managing production-grade inference servers using:
      • vLLM
      • Text Generation Inference (TGI)
      • SGLang
    • Ability to collaborate with engineering teams to integrate LLMs into production systems:
      • APIs, microservices, cloud architectures
  • Research, Evaluation & Continuous Innovation
    • Stay current with advancements in AI, ML, and LLM ecosystems.
    • Evaluate new tools, frameworks, and platform technologies to continuously enhance system architecture.