Site Reliability Engineer

PwC India

2 - 5 years

Bengaluru

Posted: 04/04/2026

Job Description

Opportunity

We are looking for SREs who want to define what reliability means for the next generation of industrial software. The role involves defining SLIs/SLOs, building observability platforms, and establishing incident management processes.


Responsibilities

  • Define and implement SLI/SLO frameworks for complex engineering systems across manufacturing and industrial clients
  • Design and deploy observability platforms using Prometheus, Grafana, and Datadog
  • Establish incident management processes and lead blameless post-mortems
  • Implement chaos engineering practices to proactively identify system weaknesses
  • Drive toil elimination through automation and platform improvements
  • Build reliability engineering capabilities within the practice and client organisations


Essential Skills

  • SLI/SLO definition and implementation at enterprise scale
  • Observability: Prometheus, Grafana, Datadog, New Relic
  • Incident management and post-mortem facilitation
  • Chaos engineering: Gremlin, Chaos Monkey, Litmus
  • Python testing for reliability validation and automated runbooks
  • Automation and scripting: Python, Go, Bash
  • Cloud platforms: AWS, Azure, GCP


Experience

5-10 years in SRE or Production Engineering roles, with experience in enterprise or industrial environments
