Site Reliability Engineer

Concentrix

2 - 5 years

Bengaluru

Posted: 12/02/2026

Job Description

  • 5+ years in observability, monitoring, or reliability engineering roles.
  • Hands-on experience with observability tools such as Prometheus, Grafana, Splunk, and Coralogix, and with external monitoring tools (e.g., Catchpoint, ThousandEyes).
  • Strong scripting skills in Python, plus Bash or PowerShell for automation.
  • Experience with Terraform and Ansible for infrastructure automation.
  • Solid understanding of SLIs, SLOs, error budgets, and reliability engineering principles.
  • Familiarity with Linux environments and distributed systems.

Responsibilities

  • Design and implement a Universal Dashboard in Grafana for leadership and engineering visibility.
  • Ensure a consistent look and feel across all observability views.
  • Define and implement SLIs, SLOs, and error budgets for critical services.
  • Establish alerting thresholds and escalation workflows aligned with reliability goals.
  • Integrate anomaly detection and AI-assisted insights into the observability platform.
  • Contribute to self-healing workflows and automated remediation strategies.
  • Partner with engineering teams to instrument services with metrics, logs, and traces.
  • Provide documentation and best practices for observability adoption across teams.
