Lead Data Scientist

IHX - A Perfios Company

5 - 10 years

Bengaluru

Posted: 17/12/2025

Job Description

About IHX

IHX is building India's most trusted health tech infrastructure platform, powering real-time, consent-led claims and data exchange between insurers and 30,000+ hospitals across 1,200+ cities. With $1B+ in claims processed yearly, IHX sits at the center of the next-gen health insurance stack.


About the Role

We are building the next-generation automated health-insurance claims processing platform, leveraging AI/ML, deep learning, NLP, OCR, and LLM-powered intelligence. As a Lead Data Scientist, you will drive the design, development, deployment, and optimization of AI models that power large-scale claims decisioning across multiple regions. This is a high-impact leadership role where you will work independently, set technical direction, mentor a diverse team, and ensure reliable production performance of mission-critical models.


Key Responsibilities

Model Development, Deployment & Production Support

  • Design, develop, train, and validate models used in automated health-claims processing.
  • Own end-to-end machine learning pipelines, including data ingestion, feature engineering, modeling, validation, deployment, and monitoring.
  • Monitor model performance, drift, SLAs, and stability in real-time production environments.
  • Lead root-cause analysis, bug resolution, and continuous improvement of production models.
  • Build state-of-the-art models including classical ML, deep learning, NLP, OCR, LLMs, transformers, and generative AI.
  • Implement scalable model serving and continuous training strategies using modern MLOps tools.


Build Efficient, Scalable & High-Accuracy Models

  • Optimize models for accuracy, latency, memory footprint, and infrastructure cost.
  • Implement model compression, distillation, and quantization when required to meet SLAs.
  • Ensure solutions perform reliably across heterogeneous real-world datasets and regions.


Implement End-to-End ML Pipelines

  • Architect and implement automated ML pipelines covering structured, unstructured, and document-based data.
  • Build feature engineering, model training, validation, and retraining workflows.
  • Implement CI/CD for ML, model versioning, and automated retraining strategies.
  • Work closely with engineering teams to operationalize ML using MLOps best practices.


Expertise in a Wide Range of AI Techniques

  • Hands-on experience with classical ML models including tree-based models, linear models, clustering, and anomaly detection.
  • Experience with deep learning architectures such as CNNs, RNNs, and Transformers.
  • Strong background in NLP and LLM-based solutions for extraction, summarization, classification, and claim interpretation.
  • Experience building OCR pipelines for document parsing, form extraction, and image understanding.
  • Experience applying generative AI for reasoning, rule extraction, and claim scenario understanding.
  • Ability to evaluate and select the most appropriate technique for each problem.


Work Independently on High-Scale Business Use Cases

  • Own ML modules deployed across multiple geographies, regulations, and insurance ecosystems.
  • Ensure scalability and robustness for high-volume claims processing workloads.
  • Collaborate with product, engineering, and operations teams to translate business requirements into ML solutions.


Strong Technical Acumen

  • Deep understanding of data structures, machine learning algorithms, and modern AI architectures.
  • Proficiency in Python, ML frameworks such as PyTorch and TensorFlow, and cloud platforms including AWS, GCP, or Azure.
  • Familiarity with distributed systems, microservices, APIs, and containerized deployments.
  • Ability to conduct architecture reviews and guide engineering teams on ML integration.
  • Experience building scalable data pipelines and feature stores.
  • Define data quality standards, metadata tracking, and experiment management practices.
  • Lead by example with strong individual contributions on critical projects.
  • Write high-quality, production-ready Python code using frameworks such as PyTorch, Hugging Face, LangChain, or Ollama.
  • Conduct rigorous model validation, interpretability analysis, and bias detection.


Team Leadership, Mentoring & Collaboration

  • Lead, mentor, and inspire data scientists, ML engineers, and analysts.
  • Foster a culture of ownership, experimentation, innovation, and continuous learning.
  • Collaborate cross-functionally with product, engineering, quality, and operations teams.
  • Demonstrate empathy, flexibility, and leadership in fast-paced environments.


Required Qualifications

  • Engineering degree is mandatory.
  • 8+ years of experience in data science or machine learning, with 3-5 years in a leadership role.
  • Proven experience building and deploying ML models in production at scale.
  • Strong foundation in statistics, machine learning fundamentals, optimization, and deep learning.
  • Expertise in NLP, transformers, LLM fine-tuning, embeddings, computer vision, OCR, time-series modeling, and predictive modeling.
  • Advanced proficiency in Python, SQL, ML frameworks, and cloud platforms.
  • Demonstrated success leading teams and delivering enterprise-scale AI systems.


Preferred Qualifications

  • Experience in health-insurance or health-claims processing ecosystems.
  • Understanding of regulatory and compliance constraints in healthcare data.
  • Knowledge of healthcare data standards such as HL7, FHIR, ICD, CPT, and SNOMED.
  • Experience with MLOps tools including MLflow, Kubeflow, Airflow, Docker, and CI/CD pipelines.
