
Artificial Intelligence Specialist

MostEdge

2 - 5 years

Hyderabad

Posted: 13/04/2026


Job Description

About Us: MostEdge is revolutionizing the retail industry with a cutting-edge analytics platform designed to prioritize customer needs and shape the future of retail. From advanced POS systems and self-service kiosks to surveillance, loyalty solutions, and next-level consumer engagement tools, MostEdge empowers businesses to transform every customer interaction into a profitable opportunity. By seamlessly integrating retail management processes, optimizing supply chains, and ensuring stock availability for in-demand products, MostEdge enables businesses to grow efficiently while eliminating time-consuming administrative tasks. As the only company offering 24/7 c-store operations, shrink management, audits, and reconciliation services, MostEdge ensures that every transaction is secure, accurate, and optimized for success.

Beyond technology, MostEdge's learning academy nurtures employees and owners into tomorrow's retail leaders, fostering innovation and growth across the industry. By partnering with retailers, distributors, and manufacturers, MostEdge is not just enhancing retail operations; it is empowering businesses to thrive sustainably, profitably, and confidently in an ever-evolving market. With a customer presence in more than twenty states, the company's staff serves customers from its Atlanta (Georgia), Hyderabad, Vadodara, and Vijayawada locations.


Senior AI Engineer: LLM Fine-Tuning & Applied Generative AI

Location: Vadodara

Employment Type: Full-time

Experience Required:

1) 5-8 years in AI / ML Engineering

2) 3+ years working with Deep Learning & NLP

3) 2+ years hands-on with Large Language Models (LLMs)


Job Summary:

We are seeking a Senior AI Engineer (LLM Fine-Tuning Expert) to design, train, optimize, and deploy production-grade Large Language Model solutions.

This role is ideal for someone who has deep theoretical knowledge combined with real-world deployment experience across fine-tuning, inference optimization, and scalable AI systems.

You will work on custom LLM fine-tuning (SFT, LoRA, QLoRA, PEFT, RLHF), multi-modal pipelines, and enterprise-grade AI applications including chatbots, agents, document intelligence, and video/surveillance analytics.


Key Responsibilities:


LLM Fine-Tuning & Training

  • Fine-tune open-source LLMs (LLaMA, Mistral, Mixtral, Qwen, Falcon, Phi, Gemma, etc.)
  • Implement SFT, PEFT, LoRA, QLoRA, Prefix-Tuning, Adapters
  • Design and optimize instruction datasets, conversation datasets, and domain-specific corpora
  • Implement RLHF / RLAIF pipelines for alignment and response quality
  • Apply tokenization strategies, prompt optimization, and curriculum learning
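To make the LoRA/QLoRA line above concrete, here is a minimal numpy sketch of the core LoRA idea: the pretrained weight W stays frozen, and only a low-rank update B·A (scaled by alpha/r) is trained. All shapes and names here are illustrative, not tied to any particular framework:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass with a LoRA adapter: y = x @ (W + (alpha/r) * B @ A).T

    W is the frozen pretrained weight (d_out x d_in); only the low-rank
    factors A (r x d_in) and B (d_out x r) are trained.
    """
    delta = (alpha / r) * (B @ A)      # low-rank weight update
    return x @ (W + delta).T

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4
W = rng.normal(size=(d_out, d_in))
A = rng.normal(scale=0.01, size=(r, d_in))   # A initialized small
B = np.zeros((d_out, r))                     # B starts at zero, so the update starts at zero
x = rng.normal(size=(2, d_in))

# With B = 0 the adapted model reproduces the frozen base model exactly,
# which is why LoRA training starts from the base model's behavior.
assert np.allclose(lora_forward(x, W, A, B, alpha=8, r=r), x @ W.T)
```

In practice this is what libraries like Hugging Face PEFT manage per attention projection; the trainable parameter count drops from d_out*d_in to r*(d_in + d_out) per adapted matrix.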


Model Optimization & Performance

  • Optimize models for low-latency, high-throughput inference
  • Apply quantization (INT8, INT4, GPTQ, AWQ) and pruning
  • Benchmark and profile GPU/CPU memory usage and inference speed
  • Deploy models using vLLM, TensorRT-LLM, Triton, ONNX
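As a sketch of what the INT8 quantization bullet refers to, here is symmetric per-tensor INT8 quantization in numpy. Real schemes such as GPTQ and AWQ are considerably more refined (per-channel or per-group scales, activation-aware calibration), but the scale/round/clip core is the same:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: store int8 values plus one scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

# INT8 storage is 4x smaller than FP32, and round-to-nearest keeps the
# per-weight reconstruction error within half a quantization step.
err = np.abs(dequantize(q, scale) - w).max()
assert q.dtype == np.int8 and err <= scale / 2 + 1e-6
```

The memory/accuracy trade-off shown here is what gets benchmarked when profiling GPU memory and inference speed.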


AI System Architecture

  • Design end-to-end LLM-powered systems (API → Vector DB → LLM → Agent layer)
  • Build RAG pipelines using FAISS, Milvus, Weaviate, Pinecone, Chroma
  • Implement multi-agent frameworks (LangGraph, CrewAI, AutoGen, custom agents)
  • Integrate LLMs with vision, speech, and structured data pipelines
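The RAG pipeline bullet above reduces to embed-index-retrieve-prompt. This toy sketch uses bag-of-words vectors and a numpy matrix in place of a sentence encoder and a vector DB (FAISS, Milvus, etc.); the documents and query are made up for illustration:

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-words embedding, unit-normalized. A real pipeline would
    use a sentence encoder and store vectors in FAISS/Milvus/etc."""
    v = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            v[vocab[word]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = [
    "store shrink management and audits",
    "gpu inference optimization with quantization",
    "loyalty program for retail customers",
]
vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d.lower().split()}))}
index = np.stack([embed(d, vocab) for d in docs])   # stand-in for the vector DB

def retrieve(query, k=1):
    scores = index @ embed(query, vocab)            # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved passages are what get stuffed into the LLM prompt as context.
assert retrieve("quantization for gpu inference") == [docs[1]]
```

Swapping the embedding function and the index for production components changes nothing about this control flow, which is why retrieval quality is usually tuned independently of the generator.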


Production & MLOps

  • Deploy AI services using FastAPI / Flask / gRPC
  • Build scalable pipelines using Docker, Kubernetes
  • Implement CI/CD for ML, model versioning, rollback, and monitoring
  • Track experiments using MLflow, Weights & Biases
  • Monitor hallucinations, drift, latency, and cost in production
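For the latency-monitoring bullet, here is a small stdlib sketch of a sliding-window percentile tracker of the kind an inference service might keep per endpoint. A production setup would export these numbers to a metrics backend (Prometheus, Grafana, etc.); the window size and sample values below are illustrative:

```python
import bisect

class LatencyMonitor:
    """Sliding-window latency tracker for an LLM inference service."""

    def __init__(self, window=1000):
        self.window = window
        self.samples = []   # kept sorted, for percentile lookup
        self.order = []     # arrival order, for eviction

    def record(self, latency_ms):
        if len(self.order) == self.window:
            oldest = self.order.pop(0)
            self.samples.pop(bisect.bisect_left(self.samples, oldest))
        self.order.append(latency_ms)
        bisect.insort(self.samples, latency_ms)

    def percentile(self, p):
        idx = min(len(self.samples) - 1, int(p / 100 * len(self.samples)))
        return self.samples[idx]

mon = LatencyMonitor(window=5)
for ms in [120, 95, 400, 110, 105, 100]:   # 120 falls out of the 5-sample window
    mon.record(ms)
assert mon.percentile(50) == 105 and mon.percentile(95) == 400
```

Tail percentiles (p95/p99) rather than averages are what matter for LLM serving, since a single long generation dominates user-perceived latency.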


Required Skills:


Core AI / ML

  • Strong foundation in Deep Learning, NLP, Transformers
  • Expert-level Python
  • Hands-on experience with PyTorch (mandatory)
  • Experience with Hugging Face Transformers, Datasets, Accelerate


LLM & Generative AI

  • Proven experience fine-tuning LLMs beyond prompting
  • Deep understanding of attention mechanisms, embeddings, tokenization
  • Experience with RAG, Agents, Tool Calling, Function Calling
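To illustrate the tool-calling / function-calling skill above, here is a stdlib sketch of the dispatch side: the model emits a JSON tool call, and the runtime parses it and executes the named tool. The tool names, arguments, and return values are invented for illustration; a real agent would additionally pass tool schemas to the LLM's function-calling API:

```python
import json

# Registry of tools the model is allowed to call. Return values here are
# hard-coded stubs standing in for real inventory/pricing lookups.
TOOLS = {
    "get_stock_level": lambda sku: {"sku": sku, "on_hand": 42},
    "lookup_price": lambda sku: {"sku": sku, "price_usd": 3.99},
}

def dispatch(model_output: str):
    """Parse a model's JSON tool call and execute the named tool."""
    call = json.loads(model_output)
    name, args = call["name"], call.get("arguments", {})
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**args)

# Simulated LLM output requesting a tool call:
out = dispatch('{"name": "get_stock_level", "arguments": {"sku": "A-101"}}')
assert out == {"sku": "A-101", "on_hand": 42}
```

The tool result is normally serialized back into the conversation so the model can compose its final answer from it; rejecting unknown tool names, as above, is the minimal guardrail against hallucinated calls.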


Infrastructure

  • Experience with GPU systems (NVIDIA A100, H100, L40, RTX series)
  • Knowledge of CUDA basics, memory optimization
  • Cloud experience: AWS / GCP / Azure
  • Vector databases and high-performance inference engines


Good to Have (Strong Plus)

  • Multi-modal models (Vision-Language, OCR, Audio)
  • Experience with surveillance, video analytics, or edge AI
  • Knowledge of C++ inference optimization
  • Experience training models on 10B+ parameters
  • Contributions to open-source ML projects
  • Ability to fine-tune an open-source LLM on a domain-specific dataset end-to-end
  • Ability to deploy an optimized inference API with RAG
  • Ability to explain performance trade-offs and cost optimization


What We Offer

  • Work on real-world, large-scale AI systems
  • Access to high-end GPU infrastructure
  • Competitive compensation + performance bonuses
  • Opportunity to shape AI strategy and architecture
  • Fast-paced, innovation-driven environment
