Data Scientist Intern

Artha Group

2 - 5 years

Mumbai

Posted: 21/03/2026

Job Description

About Artha

Artha Group is a performance-first investment house managing 2,300 crores across domestic and international investment vehicles, including Category I & II AIFs, LLPs, and Private Limited companies. Its portfolio includes active investments in 130+ startups, 32+ successful exits, and 10+ renewable energy projects. We operate at the convergence of capital precision and operational depth.


Our Technology Division is building the Unified Intelligence Platform (UIP), an AI-first portfolio intelligence system powered by multi-agent orchestration, knowledge graphs, and large language models.


Location: Mumbai / Onsite

Employment Type: Internship (6 months)

Reporting To: CTO, Artha Group

Team: Technology Division, AI & Data Science

Experience Level: Final-year students or recent graduates (0-1 year)


Role Overview

This is a hands-on data science internship focused on fine-tuning language models, building financial data pipelines, and supporting AI workflows for a production-grade intelligence platform. You will work directly with the CTO and the AI team, gaining exposure to real VC data, deal intelligence, and advanced ML systems.

This is not a research-only role. You will be expected to ship working components, handle messy real-world data, and contribute to production workflows.


You will

  • Fine-tune small language models (SLMs) on proprietary VC and portfolio datasets
  • Build and clean structured/unstructured financial data pipelines
  • Develop embeddings for semantic search on deal memos and financials
  • Support multi-agent AI workflows with ML components
  • Design evaluation frameworks for LLM outputs in financial contexts
  • Perform exploratory data analysis (EDA) on portfolio metrics and market trends
  • Enrich knowledge graphs with ML-derived signals


Key Responsibilities

  • Implement LoRA/QLoRA fine-tuning workflows on HuggingFace
  • Work with SLMs (Phi-3, Mistral, Gemma, LLaMA) and understand tokenization and context windows
  • Handle financial datasets: P&L, balance sheets, MIS reports, time-series metrics
  • Build and maintain Python-based ML pipelines (NumPy, Pandas, Scikit-learn, PyTorch/TensorFlow)
  • Integrate vector databases (ChromaDB, Qdrant) for semantic search
  • Contribute to evaluation and monitoring of model performance


What Success Looks Like in 6 Months

  • Delivered at least one fine-tuned model integrated into UIP workflows
  • Built robust data pipelines for financial datasets
  • Demonstrated ability to work independently on assigned ML tasks
  • Produced clear documentation and reproducible experiments
  • Received positive feedback from CTO and AI team on ownership and execution


Candidate Profile

  • Education: Final-year or recent graduate in CS, ECE, Statistics, Data Science, or MBA with strong quant background
  • Experience: 0-1 year; prior projects in NLP, ML, or financial data preferred
  • Mindset: Ownership-driven, curious, comfortable with ambiguity, strong execution discipline
  • Portfolio: GitHub repos, Kaggle notebooks, fine-tuning experiments, or research papers are a strong plus


Required Skills

  • Strong foundations in statistics, probability, and ML theory
  • Hands-on experience with fine-tuning language models (LoRA, PEFT)
  • Proficiency in Python and ML stack (NumPy, Pandas, Scikit-learn, PyTorch/TensorFlow)
  • Familiarity with vector databases and semantic search
  • Understanding of transformer architectures and attention mechanisms


Good to Have:

  • Exposure to VC/FinTech datasets
  • Experience with LangChain/LangGraph, Neo4j, or MLOps tools
  • Knowledge of RAG pipelines and LLM evaluation frameworks


Compensation Structure

  • Stipend: 25,000 per month, with the possibility of converting to a full-time position
  • Duration: 6 months
  • Start Date: Immediate
  • PPO: High performers will be considered for a full-time role


What This Role Is NOT

  • This is not a pure research internship; you will work on production-grade systems
  • This is not a remote-only role; full-time presence in Mumbai is expected
  • This is not a short-term project; a full 6-month commitment is required
