Technical Architect
Impetus
2 - 5 years
Noida
Posted: 17/02/2026
Job Description
Responsibilities
- Core AI/ML Fundamentals
- Solid understanding of AI/ML concepts including:
- Classification, regression, neural networks
- OCR and transcription systems
- Audio/Video processing and multimodal learning
- OCR, Transcription & Audio/Video Intelligence
- Implement specialized models for:
- High-accuracy document OCR
- Real-time audio transcription
- Architect deep learning pipelines for audio/video analysis and generation.
- Integrate multimodal models (e.g., LLaVA, Whisper) into broader GenAI systems.
- Generative AI & LLM Expertise
- Strong understanding of:
- Generative AI techniques
- Transformer architectures
- RAG (Retrieval-Augmented Generation) pipelines
- Modern LLM ecosystems
- Hands-on experience with:
- LLM parameter handling and model selection
- Scaling strategies and performance optimization
- Expertise in:
- Prompt engineering and instruction tuning
- Prompt tuning and optimization for high-quality outputs
- Familiarity with evaluation frameworks covering:
- Quality, grounding, accuracy, safety
- Latency and cost analysis
- Governance and compliance requirements
- Agentic AI Systems
- Experience designing and building agentic AI systems including:
- Multi-agent orchestration
- Tool-use workflows
- Autonomous task execution
- Design long-term memory architectures:
- Vector-based memory systems
- Graph-based memory for complex, persistent context
- Model Training, Fine-Tuning & Optimization
- Knowledge of fine-tuning approaches:
- LoRA, QLoRA, supervised fine-tuning
- Experience with model compression techniques:
- Quantization, distillation
- Familiarity with low-level performance tooling:
- CUDA, Triton, or specialized custom kernels
- Design AI systems capable of efficiently handling and routing multiple user requests simultaneously, including:
- Scalable request handling
- Load-balanced inference
- Multi-tenant model utilization
- Caching and prioritization strategies
- Infrastructure, Pipelines & Deployment
- Oversee:
- Data pipeline integration
- Training workflows
- ML CI/CD processes
- Strong understanding of:
- GPU/compute requirements
- Cost-efficient deployment strategies
- Experience designing and managing production-grade inference servers using:
- vLLM
- Text Generation Inference (TGI)
- SGLang
- Ability to collaborate with engineering teams to integrate LLMs into production systems:
- APIs, microservices, cloud architectures
- Research, Evaluation & Continuous Innovation
- Stay current with advancements in AI, ML, and LLM ecosystems.
- Evaluate new tools, frameworks, and platform technologies to continuously enhance system architecture.
