Reliability Engineering Lead
Zyoin Group
18 - 20 years
Hyderabad
Posted: 20/02/2026
Job Description
Reliability Engineering Lead
Location: Hyderabad
Work Mode: Hybrid (3 Days Office)
Experience: 12-18 Years
Notice Period: Immediate to 30 Days
Role Overview
We are looking for a seasoned Reliability Engineering Lead to drive reliability strategy, incident excellence, automation maturity, and observability across enterprise digital platforms. This role blends deep technical expertise with governance leadership and is ideal for someone who can translate reliability engineering into measurable business outcomes such as revenue impact, operational efficiency, and user safety.
You will act as the process owner for reliability frameworks, ensuring systems remain resilient, compliant, scalable, and optimized while enabling engineering velocity.
Key Responsibilities
1. Service Reliability & SLO Framework
- Define and implement SLIs/SLOs aligned with business impact and operational requirements.
- Drive SLO-based decision making for releases, prioritization, and incident response.
- Establish error budget frameworks balancing feature velocity and system reliability.
- Build reliability governance aligned with regulatory frameworks (GxP, SOX, etc.).
- Translate technical metrics into business-level insights and executive reporting.
2. Incident Management & Learning Culture
- Lead structured incident command processes for critical outages.
- Facilitate blameless postmortems to improve systems and foster psychological safety.
- Build and maintain incident learning repositories for organizational knowledge sharing.
- Implement proactive monitoring systems to detect issues before user impact.
3. Automation & Toil Reduction
- Keep operational toil below 50% of team workload through automation initiatives.
- Identify and eliminate repetitive tasks using cost-benefit prioritization.
- Deliver engineering improvements that enhance performance and reliability quarterly.
- Develop self-service documentation, runbooks, and automation tooling.
4. Platform Engineering & AI Reliability
- Design reliability frameworks for AI/ML workloads and data pipelines.
- Partner with platform teams to embed reliability into internal developer platforms (IDPs).
- Support enterprise-scale agentic systems with reliability and compliance alignment.
- Improve CI/CD reliability and infrastructure-as-code practices.
5. Observability & Performance Engineering
- Implement full-stack observability across metrics, logs, traces, and business KPIs.
- Conduct performance engineering, capacity planning, and bottleneck analysis.
- Deploy intelligent monitoring systems with predictive alerting and root cause insights.
- Enable cross-system monitoring across cloud, on-prem, and legacy environments.
6. Security & Compliance Alignment
- Integrate reliability practices with DevSecOps and compliance frameworks.
- Automate compliance checks, audit trails, and reporting.
- Perform reliability impact assessments for regulated systems.
- Design and validate disaster recovery strategies aligned with business and regulatory requirements.
Mandatory Qualifications
- 12-18 years of experience in SRE, platform engineering, or reliability engineering.
- Proven experience designing enterprise-scale reliability frameworks.
- Strong expertise in:
  - SLO/SLI design
  - Observability platforms
  - Incident management
  - Automation strategies
- Hands-on knowledge of distributed systems, cloud platforms, and infrastructure reliability.
- Experience working within regulated environments or compliance-driven systems.
- Strong stakeholder communication and leadership capabilities.
Why This Role
- Strategic leadership opportunity with organization-wide impact.
- Ownership of reliability strategy for mission-critical platforms.
- High visibility with senior leadership and cross-functional teams.
- Ability to influence platform architecture, delivery velocity, and engineering culture.
