Lead Data Scientist
IHX - A Perfios Company
5 - 10 years
Bengaluru
Posted: 17/12/2025
Job Description
About IHX
IHX is building India's most trusted health tech infrastructure platform, powering
real-time, consent-led claims and data exchange between insurers and 30,000+
hospitals across 1,200+ cities. With $1B+ in claims processed yearly, IHX sits at the
center of the next-gen health insurance stack.
About the Role
We are building the next-generation automated health-insurance claims processing platform, leveraging AI/ML, deep learning, NLP, OCR, and LLM-powered intelligence. As a Lead Data Scientist, you will drive the design, development, deployment, and optimization of AI models that power large-scale claims decisioning across multiple regions. This is a high-impact leadership role where you will work independently, set technical direction, mentor a diverse team, and ensure reliable production performance of mission-critical models.
Key Responsibilities
Model Development, Deployment & Production Support
- Design, develop, train, and validate models used in automated health-claims processing.
- Own end-to-end machine learning pipelines, including data ingestion, feature engineering, modeling, validation, deployment, and monitoring.
- Monitor model performance, drift, SLAs, and stability in real-time production environments.
- Lead root-cause analysis, bug resolution, and continuous improvement of production models.
- Build state-of-the-art models including classical ML, deep learning, NLP, OCR, LLMs, transformers, and generative AI.
- Implement scalable model serving and continuous training strategies using modern MLOps tools.
Build Efficient, Scalable & High-Accuracy Models
- Optimize models for accuracy, latency, memory footprint, and infrastructure cost.
- Implement model compression, distillation, and quantization when required to meet SLAs.
- Ensure solutions perform reliably across heterogeneous real-world datasets and regions.
Implement End-to-End ML Pipelines
- Architect and implement automated ML pipelines covering structured, unstructured, and document-based data.
- Build feature engineering, model training, validation, and retraining workflows.
- Implement CI/CD for ML, model versioning, and automated retraining strategies.
- Work closely with engineering teams to operationalize ML using MLOps best practices.
Expertise in a Wide Range of AI Techniques
- Hands-on experience with classical ML models including tree-based models, linear models, clustering, and anomaly detection.
- Experience with deep learning architectures such as CNNs, RNNs, and Transformers.
- Strong background in NLP and LLM-based solutions for extraction, summarization, classification, and claim interpretation.
- Experience building OCR pipelines for document parsing, form extraction, and image understanding.
- Experience applying generative AI for reasoning, rule extraction, and claim scenario understanding.
- Ability to evaluate and select the most appropriate technique for each problem.
Work Independently on High-Scale Business Use Cases
- Own ML modules deployed across multiple geographies, regulations, and insurance ecosystems.
- Ensure scalability and robustness for high-volume claims processing workloads.
- Collaborate with product, engineering, and operations teams to translate business requirements into ML solutions.
Strong Technical Acumen
- Deep understanding of data structures, machine learning algorithms, and modern AI architectures.
- Proficiency in Python, ML frameworks such as PyTorch and TensorFlow, and cloud platforms including AWS, GCP, or Azure.
- Familiarity with distributed systems, microservices, APIs, and containerized deployments.
- Ability to conduct architecture reviews and guide engineering teams on ML integration.
- Experience building scalable data pipelines and feature stores.
- Define data quality standards, metadata tracking, and experiment management practices.
- Lead by example with strong individual contributions on critical projects.
- Write high-quality, production-ready Python code using frameworks such as PyTorch, Hugging Face, LangChain, or Ollama.
- Conduct rigorous model validation, interpretability analysis, and bias detection.
Team Leadership, Mentoring & Collaboration
- Lead, mentor, and inspire data scientists, ML engineers, and analysts.
- Foster a culture of ownership, experimentation, innovation, and continuous learning.
- Collaborate cross-functionally with product, engineering, quality, and operations teams.
- Demonstrate empathy, flexibility, and leadership in fast-paced environments.
Required Qualifications
- An engineering degree is mandatory.
- 8+ years of experience in data science or machine learning, with 3-5 years in a leadership role.
- Proven experience building and deploying ML models in production at scale.
- Strong foundation in statistics, machine learning fundamentals, optimization, and deep learning.
- Expertise in NLP, transformers, LLM fine-tuning, embeddings, computer vision, OCR, time-series modeling, and predictive modeling.
- Advanced proficiency in Python, SQL, ML frameworks, and cloud platforms.
- Demonstrated success leading teams and delivering enterprise-scale AI systems.
Preferred Qualifications
- Experience in health-insurance or health-claims processing ecosystems.
- Understanding of regulatory and compliance constraints in healthcare data.
- Knowledge of healthcare data standards such as HL7, FHIR, ICD, CPT, and SNOMED.
- Experience with MLOps tools including MLflow, Kubeflow, Airflow, Docker, and CI/CD pipelines.
