Technical Architect
Impetus
2 - 5 years
Noida
Posted: 17/02/2026
Job Description
Responsibilities
- Core AI/ML Fundamentals
- Solid understanding of AI/ML concepts including:
- Classification, regression, neural networks
- OCR and transcription systems
- Audio/Video processing and multimodal learning
- OCR, Transcription & Audio/Video Intelligence
- Implement specialized models for:
- High-accuracy document OCR
- Real-time audio transcription
- Architect deep learning pipelines for audio/video analysis and generation.
- Integrate multimodal models (e.g., LLaVA, Whisper) into broader GenAI systems.
- Generative AI & LLM Expertise
- Strong understanding of:
- Generative AI techniques
- Transformer architectures
- RAG (Retrieval-Augmented Generation) pipelines
- Modern LLM ecosystems
- Hands-on experience with:
- LLM parameter handling and model selection
- Scaling strategies and performance optimization
- Expertise in:
- Prompt engineering and instruction tuning
- Prompt tuning and optimization for high-quality outputs
- Familiarity with evaluation frameworks covering:
- Quality, grounding, accuracy, safety
- Latency and cost analysis
- Governance and compliance requirements
- Agentic AI Systems
- Experience designing and building agentic AI systems including:
- Multi-agent orchestration
- Tool-use workflows
- Autonomous task execution
- Design long-term memory architectures:
- Vector-based memory systems
- Graph-based memory for complex, persistent context
- Model Training, Fine-Tuning & Optimization
- Knowledge of fine-tuning approaches:
- LoRA, QLoRA, supervised fine-tuning
- Experience with model compression techniques:
- Quantization, distillation
- Familiarity with performance-level tooling:
- CUDA, Triton, or specialized custom kernels
- Design AI systems capable of efficiently handling and routing multiple user requests simultaneously, including:
- Scalable request handling
- Load-balanced inference
- Multi-tenant model utilization
- Caching and prioritization strategies
- Infrastructure, Pipelines & Deployment
- Oversee:
- Data pipeline integration
- Training workflows
- ML CI/CD processes
- Strong understanding of:
- GPU/compute requirements
- Cost-efficient deployment strategies
- Experience designing and managing production-grade inference servers using:
- vLLM
- Text Generation Inference (TGI)
- SGLang
- Ability to collaborate with engineering teams to integrate LLMs into production systems:
- APIs, microservices, cloud architectures
- Research, Evaluation & Continuous Innovation
- Stay current with advancements in AI, ML, and LLM ecosystems.
- Evaluate new tools, frameworks, and platform technologies to continuously enhance system architecture.
