Associate Director-SRE Lead

Birlasoft

14 - 16 years

Pune

Posted: 02/03/2026

Job Description

Area(s) of responsibility

Job description - SRE Lead (SRE & Platform Infra)

Summary

The SRE Lead is responsible for driving reliability, resiliency, and performance across Birlasoft’s Platform Engineering ecosystem, including microservices, cloud workloads, Cogito agentic operations, and enterprise applications. The role ensures high availability and predictable performance through SLO-driven engineering, observability, automation-first operations, and incident excellence. The SRE Lead partners closely with DevOps, Cloud, Architecture, Security, and Delivery teams to embed reliability into the design, build, and run phases.


Roles & Responsibilities

Reliability & Performance

  • Define & maintain SLOs, SLIs, and error budgets for platform services.
  • Lead capacity planning, performance tuning, autoscaling strategies, and resilience testing.
  • Drive reliability patterns such as graceful degradation, retry logic, and distributed failover.

Observability & Monitoring

  • Own the monitoring stack across Azure Monitor, App Insights, Log Analytics, OpenTelemetry, and AKS.
  • Design alerting standards covering noise reduction, correlation, routing, and escalation.
  • Build health, reliability, and risk dashboards for leadership.

Incident & Problem Management

  • Lead incident response, on-call processes, and blameless postmortems.
  • Drive MTTR reduction through automation, playbooks, and predictive analytics.
  • Establish proactive issue detection mechanisms using patterns, telemetry, and AIOps.

Automation & AIOps

  • Implement automation-first operations for remediation, self-healing, and repetitive tasks.
  • Integrate AI-driven agent workflows with Cogito for troubleshooting, optimization, and cost‑ops.
  • Increase operational maturity through runbooks, autopilot actions, and integrated CI/CD reliability checks.

Collaboration & Governance

  • Partner with Platform Engineering pods (Infra, Core, Integration, DevEx, Security) to embed reliability by design.
  • Influence architecture for scalability, observability, and fault tolerance.
  • Mentor SRE engineers and advance the maturity of SRE practices across accounts.

 

Technical Skills

Mandatory

  • Azure Monitor, App Insights, Log Analytics, KQL
  • Kubernetes (AKS), autoscaling, HPA/KEDA
  • Distributed tracing (OpenTelemetry)
  • CI/CD pipelines & release engineering (Azure DevOps/Jenkins/GitOps)
  • Incident management, root-cause analysis, and on‑call frameworks
  • Performance testing, load testing, and capacity planning
  • Infrastructure as Code (Terraform/ARM/Bicep)
  • Strong understanding of microservices & cloud-native architecture
  • Python/PowerShell/Go scripting for automation

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or related field
  • 8–14 years of experience in SRE, DevOps, Cloud, or Platform Engineering roles
  • Certifications preferred:
    • Azure Administrator / Azure DevOps Engineer
    • Kubernetes (CKA/CKAD)
    • SRE Foundation / SRE Practitioner
  • Demonstrated leadership experience managing SRE/DevOps teams, reliability initiatives, or mission‑critical platforms

 

 

About Company

Birlasoft is a global IT services and consulting company that is part of the CK Birla Group. It specializes in digital transformation, enterprise application services, and IT modernization for industries such as manufacturing, life sciences, BFSI, and energy. Birlasoft is known for its strong capabilities in SAP, Oracle, cloud, and analytics, helping clients drive innovation, reduce costs, and improve agility.
