Site Reliability Engineering Manager
Snapmint
8 - 12 years
Gurugram
Posted: 20/02/2026
Job Description
About
Founded by serial entrepreneurs from IIT Bombay, Snapmint is challenging the way banking is done by building the banking experience from the ground up. Our first product provides purchase financing at 0% interest to 300 million banked consumers in India who do not have credit cards, using instant credit scoring and advanced underwriting systems. We look at hundreds of variables, well beyond traditional credit models. With real-time credit approval, seamless digital loan servicing, and repayments technology, we are revolutionizing the way banking is done for today's smartphone-wielding Indian.
Website: https://snapmint.com
LinkedIn: https://www.linkedin.com/company/snapmintfinserv/
Role Overview
We are seeking an experienced Engineering Manager (SRE) to lead our Site Reliability Engineering team. This role combines technical leadership, people management, and operational excellence to ensure our systems are reliable, scalable, secure, and highly available. You will drive reliability strategy, improve operational processes, and build a high-performing SRE team.
Title: EM - SRE
Experience: 8-12 Years
Work Location: Gurgaon (Unitech Cyber Park, Sector 39)
Working Arrangement: 5 days a week, work from office (WFO)
Key Responsibilities
Leadership & Team Management
Build, mentor, and manage a high-performing SRE team.
Set clear goals, conduct performance reviews, and support career growth.
Foster a culture of reliability, automation, and continuous improvement.
Collaborate cross-functionally with Engineering, Product, Security, and DevOps teams.
Reliability & Operations
Define and manage SLIs, SLOs, and error budgets.
Ensure system reliability, performance, scalability, and availability.
Lead incident management, root cause analysis (RCA), and postmortems.
Drive improvements in observability, monitoring, and alerting systems.
Infrastructure & Automation
Oversee cloud infrastructure (AWS/GCP/Azure) and on-prem environments.
Promote infrastructure-as-code (Terraform, CloudFormation, etc.).
Drive automation to reduce toil and improve system efficiency.
Improve CI/CD pipelines and deployment reliability.
Strategy & Execution
Develop and execute an SRE roadmap aligned with business objectives.
Improve system resilience and disaster recovery processes.
Ensure compliance with security and regulatory requirements.
Track and report reliability metrics to leadership.
Required Qualifications
8+ years of experience in software engineering, DevOps, or SRE.
2+ years of engineering management experience.
Strong expertise in cloud platforms (AWS/GCP/Azure).
Deep understanding of distributed systems and system architecture.
Experience with monitoring tools (Datadog, Prometheus, Grafana, New Relic, etc.).
Proficiency in at least one programming language (Python, Go, Java, etc.).
Experience with containerization and orchestration (Docker, Kubernetes).
Strong incident management and production operations experience.
