
Artificial Intelligence Specialist

MostEdge

2 - 5 years

Hyderabad

Posted: 13/04/2026


Job Description

About Us: MostEdge is revolutionizing the retail industry with a cutting-edge analytics platform designed to prioritize customer needs and shape the future of retail. From advanced POS systems and self-service kiosks to surveillance, loyalty solutions, and next-level consumer engagement tools, MostEdge empowers businesses to transform every customer interaction into a profitable opportunity. By seamlessly integrating retail management processes, optimizing supply chains, and ensuring stock availability for in-demand products, MostEdge enables businesses to grow efficiently while eliminating time-consuming administrative tasks. As the only company offering 24/7 c-store operations, shrink management, audits, and reconciliation services, MostEdge ensures that every transaction is secure, accurate, and optimized for success.

Beyond technology, MostEdge's learning academy nurtures employees and owners into tomorrow's retail leaders, fostering innovation and growth across the industry. By partnering with retailers, distributors, and manufacturers, MostEdge is not just enhancing retail operations; it is empowering businesses to thrive sustainably, profitably, and confidently in an ever-evolving market. With a customer presence in more than twenty states, the company's staff serves customers from its Atlanta (Georgia), Hyderabad, Vadodara, and Vijayawada locations.


Senior AI Engineer: LLM Fine-Tuning & Applied Generative AI

Location: Vadodara

Employment Type: Full-time

Experience Required:

1) 5-8 years in AI / ML Engineering

2) 3+ years working with Deep Learning & NLP

3) 2+ years hands-on with Large Language Models (LLMs)


Job Summary:

We are seeking a Senior AI Engineer (LLM Fine-Tuning Expert) to design, train, optimize, and deploy production-grade Large Language Model solutions.

This role is ideal for someone who has deep theoretical knowledge combined with real-world deployment experience across fine-tuning, inference optimization, and scalable AI systems.

You will work on custom LLM fine-tuning (SFT, LoRA, QLoRA, PEFT, RLHF), multi-modal pipelines, and enterprise-grade AI applications including chatbots, agents, document intelligence, and video/surveillance analytics.


Key Responsibilities:


LLM Fine-Tuning & Training

  • Fine-tune open-source LLMs (LLaMA, Mistral, Mixtral, Qwen, Falcon, Phi, Gemma, etc.)
  • Implement SFT, PEFT, LoRA, QLoRA, Prefix-Tuning, Adapters
  • Design and optimize instruction datasets, conversation datasets, and domain-specific corpora
  • Implement RLHF / RLAIF pipelines for alignment and response quality
  • Apply tokenization strategies, prompt optimization, and curriculum learning
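To make the LoRA/QLoRA line above concrete, here is a minimal numpy sketch of the core LoRA idea: the pretrained weight W stays frozen, and only a low-rank update B·A (scaled by alpha/r) is trained. All shapes and names here are illustrative, not tied to any particular framework:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass with a LoRA adapter: y = x @ (W + (alpha/r) * B @ A).T

    W is the frozen pretrained weight (d_out x d_in); only the low-rank
    factors A (r x d_in) and B (d_out x r) are trained.
    """
    delta = (alpha / r) * (B @ A)      # low-rank weight update
    return x @ (W + delta).T

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4
W = rng.normal(size=(d_out, d_in))
A = rng.normal(scale=0.01, size=(r, d_in))   # A initialized small
B = np.zeros((d_out, r))                     # B starts at zero, so the update starts at zero
x = rng.normal(size=(2, d_in))

# With B = 0 the adapted model reproduces the frozen base model exactly,
# which is why LoRA training starts from the base model's behavior.
assert np.allclose(lora_forward(x, W, A, B, alpha=8, r=r), x @ W.T)
```

In practice this is what libraries like Hugging Face PEFT manage per attention projection; the trainable parameter count drops from d_out*d_in to r*(d_in + d_out) per adapted matrix.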


Model Optimization & Performance

  • Optimize models for low-latency, high-throughput inference
  • Apply quantization (INT8, INT4, GPTQ, AWQ) and pruning
  • Benchmark and profile GPU/CPU memory usage and inference speed
  • Deploy models using vLLM, TensorRT-LLM, Triton, ONNX
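As a sketch of what the INT8 quantization bullet refers to, here is symmetric per-tensor INT8 quantization in numpy. Real schemes such as GPTQ and AWQ are considerably more refined (per-channel or per-group scales, activation-aware calibration), but the scale/round/clip core is the same:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: store int8 values plus one scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

# INT8 storage is 4x smaller than FP32, and round-to-nearest keeps the
# per-weight reconstruction error within half a quantization step.
err = np.abs(dequantize(q, scale) - w).max()
assert q.dtype == np.int8 and err <= scale / 2 + 1e-6
```

The memory/accuracy trade-off shown here is what gets benchmarked when profiling GPU memory and inference speed.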


AI System Architecture

  • Design end-to-end LLM-powered systems (API → Vector DB → LLM → Agent layer)
  • Build RAG pipelines using FAISS, Milvus, Weaviate, Pinecone, Chroma
  • Implement multi-agent frameworks (LangGraph, CrewAI, AutoGen, custom agents)
  • Integrate LLMs with vision, speech, and structured data pipelines
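The RAG pipeline bullet above reduces to embed-index-retrieve-prompt. This toy sketch uses bag-of-words vectors and a numpy matrix in place of a sentence encoder and a vector DB (FAISS, Milvus, etc.); the documents and query are made up for illustration:

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-words embedding, unit-normalized. A real pipeline would
    use a sentence encoder and store vectors in FAISS/Milvus/etc."""
    v = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            v[vocab[word]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = [
    "store shrink management and audits",
    "gpu inference optimization with quantization",
    "loyalty program for retail customers",
]
vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d.lower().split()}))}
index = np.stack([embed(d, vocab) for d in docs])   # stand-in for the vector DB

def retrieve(query, k=1):
    scores = index @ embed(query, vocab)            # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved passages are what get stuffed into the LLM prompt as context.
assert retrieve("quantization for gpu inference") == [docs[1]]
```

Swapping the embedding function and the index for production components changes nothing about this control flow, which is why retrieval quality is usually tuned independently of the generator.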


Production & MLOps

  • Deploy AI services using FastAPI / Flask / gRPC
  • Build scalable pipelines using Docker, Kubernetes
  • Implement CI/CD for ML, model versioning, rollback, and monitoring
  • Track experiments using MLflow, Weights & Biases
  • Monitor hallucinations, drift, latency, and cost in production
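For the latency-monitoring bullet, here is a small stdlib sketch of a sliding-window percentile tracker of the kind an inference service might keep per endpoint. A production setup would export these numbers to a metrics backend (Prometheus, Grafana, etc.); the window size and sample values below are illustrative:

```python
import bisect

class LatencyMonitor:
    """Sliding-window latency tracker for an LLM inference service."""

    def __init__(self, window=1000):
        self.window = window
        self.samples = []   # kept sorted, for percentile lookup
        self.order = []     # arrival order, for eviction

    def record(self, latency_ms):
        if len(self.order) == self.window:
            oldest = self.order.pop(0)
            self.samples.pop(bisect.bisect_left(self.samples, oldest))
        self.order.append(latency_ms)
        bisect.insort(self.samples, latency_ms)

    def percentile(self, p):
        idx = min(len(self.samples) - 1, int(p / 100 * len(self.samples)))
        return self.samples[idx]

mon = LatencyMonitor(window=5)
for ms in [120, 95, 400, 110, 105, 100]:   # 120 falls out of the 5-sample window
    mon.record(ms)
assert mon.percentile(50) == 105 and mon.percentile(95) == 400
```

Tail percentiles (p95/p99) rather than averages are what matter for LLM serving, since a single long generation dominates user-perceived latency.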


Required Skills:


Core AI / ML

  • Strong foundation in Deep Learning, NLP, Transformers
  • Expert-level Python
  • Hands-on experience with PyTorch (mandatory)
  • Experience with Hugging Face Transformers, Datasets, Accelerate


LLM & Generative AI

  • Proven experience fine-tuning LLMs beyond prompting
  • Deep understanding of attention mechanisms, embeddings, tokenization
  • Experience with RAG, Agents, Tool Calling, Function Calling
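To illustrate the tool-calling / function-calling skill above, here is a stdlib sketch of the dispatch side: the model emits a JSON tool call, and the runtime parses it and executes the named tool. The tool names, arguments, and return values are invented for illustration; a real agent would additionally pass tool schemas to the LLM's function-calling API:

```python
import json

# Registry of tools the model is allowed to call. Return values here are
# hard-coded stubs standing in for real inventory/pricing lookups.
TOOLS = {
    "get_stock_level": lambda sku: {"sku": sku, "on_hand": 42},
    "lookup_price": lambda sku: {"sku": sku, "price_usd": 3.99},
}

def dispatch(model_output: str):
    """Parse a model's JSON tool call and execute the named tool."""
    call = json.loads(model_output)
    name, args = call["name"], call.get("arguments", {})
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**args)

# Simulated LLM output requesting a tool call:
out = dispatch('{"name": "get_stock_level", "arguments": {"sku": "A-101"}}')
assert out == {"sku": "A-101", "on_hand": 42}
```

The tool result is normally serialized back into the conversation so the model can compose its final answer from it; rejecting unknown tool names, as above, is the minimal guardrail against hallucinated calls.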


Infrastructure

  • Experience with GPU systems (NVIDIA A100, H100, L40, RTX series)
  • Knowledge of CUDA basics, memory optimization
  • Cloud experience: AWS / GCP / Azure
  • Vector databases and high-performance inference engines


Good to Have (Strong Plus)

  • Multi-modal models (Vision-Language, OCR, Audio)
  • Experience with surveillance, video analytics, or edge AI
  • Knowledge of C++ inference optimization
  • Experience training models on 10B+ parameters
  • Contributions to open-source ML projects
  • Ability to fine-tune an open-source LLM on a domain-specific dataset end-to-end
  • Ability to deploy an optimized inference API with RAG
  • Ability to explain performance trade-offs and cost optimization


What We Offer

  • Work on real-world, large-scale AI systems
  • Access to high-end GPU infrastructure
  • Competitive compensation + performance bonuses
  • Opportunity to shape AI strategy and architecture
  • Fast-paced, innovation-driven environment
