AI Engineer Production Operations

MSCI

2 - 5 years

Mumbai

Posted: 26/10/2025

Job Description

Your Team Responsibilities

Our mission is to embed AI-driven automation, telemetry, and observability into MSCI’s production environments, enabling the Quality Center of Excellence team to deliver on its objectives of operational excellence, risk reduction, and quality at scale. We serve as the engineering backbone of production governance, ensuring that systems are reliable, efficient, and continuously improving through data-driven insights.

Your Key Responsibilities

  • AI Tooling & Framework Development

    • Build AI-driven tools for anomaly detection, incident triage, and root-cause analysis in production systems.

    • Develop and deploy automation frameworks to support Level 1 / Level 2 support teams and streamline repetitive operational tasks.

    • Create self-healing and predictive monitoring capabilities using ML models.

  • Telemetry & Observability

    • Implement telemetry pipelines in GCP (e.g., Stackdriver/Cloud Monitoring, BigQuery, Pub/Sub) and Azure (e.g., Application Insights, Log Analytics, Monitor).

    • Build dashboards, automated alerting, and intelligent log/metric analysis frameworks across hybrid cloud environments.

    • Leverage distributed tracing and logging frameworks to ensure end-to-end visibility of systems.

  • Incident Management Automation

    • Design AI-assisted runbooks to support incident triage and resolution.

    • Automate classification and escalation of incidents using ML and rule-based systems.

    • Integrate AI-powered insights with existing incident management platforms (e.g., ServiceNow, PagerDuty, Opsgenie).

  • Collaboration

    • Work closely with production teams, SREs, system test engineers, and incident managers to integrate AI solutions into live environments.

    • Partner with cloud engineering teams to ensure solutions are scalable, secure, and compliant.

    • Provide technical knowledge transfer and training on AI-enabled tools to support engineers.

Your skills and experience that will help you excel

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.

  • 11+ years of hands-on engineering experience in production support, SRE, or AI/ML platform development.

  • Strong programming skills in Python (preferred) and experience with AI/ML frameworks (PyTorch, TensorFlow, Scikit-learn).

  • Hands-on expertise with GCP (BigQuery, Pub/Sub, Cloud Monitoring, Vertex AI) and Azure (Application Insights, Log Analytics, Azure ML, Azure Monitor).

  • Experience with observability tools (Prometheus, Grafana, ELK, Splunk, Datadog).

  • Proficiency with cloud-native infrastructure (Kubernetes, Docker, Terraform, CI/CD pipelines).

  • Strong understanding of incident management and ITIL practices.

  • Experience implementing AIOps solutions in hybrid cloud environments.

  • Knowledge of MLOps best practices (model deployment, monitoring, retraining).

About Company

MSCI Inc. is a leading global provider of critical decision-support tools and services for the investment community. The company is best known for its market indexes, such as the MSCI World and MSCI Emerging Markets Indexes, which are widely used as benchmarks by asset managers and institutional investors worldwide. In addition to indexes, MSCI offers portfolio risk and performance analytics, real estate data, and environmental, social, and governance (ESG) research to help clients make informed investment decisions. With a strong presence across major financial markets, MSCI plays a pivotal role in shaping investment strategies and facilitating transparency in global capital markets.
