Site Reliability Engineer
greytHR
1 - 2 years
Bengaluru
Posted: 09/04/2026
Job Description
About the Role
We are looking for a passionate and detail-oriented Site Reliability Engineer (SRE) to join our engineering team. As an SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services. You'll work closely with development and QA teams to build, maintain, and scale production systems while implementing best practices for monitoring, automation, and incident management.
This position is ideal for engineers who thrive in complex distributed environments, are strong in Kubernetes, and enjoy improving system reliability through automation and modern tooling.
Key Responsibilities
- Infrastructure Reliability & Performance
  - Maintain, monitor, and improve uptime and performance of production systems.
  - Design and implement scalable, reliable, and secure infrastructure on cloud platforms (AWS / GCP).
- Kubernetes & Containerization
  - Deploy, manage, and optimize containerized workloads using Kubernetes and Helm.
  - Troubleshoot Kubernetes clusters, pods, and networking issues.
  - Manage CI/CD pipelines integrated with Kubernetes-based deployments.
- Monitoring & Incident Response
  - Participate in on-call rotations for production support and incident response.
  - Conduct post-incident reviews and drive preventive improvements.
- Security & Compliance
  - Implement and enforce security best practices in infrastructure and application deployments.
  - Manage access controls, secrets, and network policies in production environments.
- Collaboration & Continuous Improvement
  - Work with development teams to design systems with reliability and scalability in mind.
  - Drive automation and self-healing capabilities for common operational tasks.
  - Contribute to SRE playbooks, runbooks, and documentation.
Required Skills & Qualifications
- Education: Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Experience: 1-2 years of experience in an SRE / DevOps role.
- Core Skills:
  - Strong experience with Kubernetes, Docker, and container orchestration.
  - Proficiency in Linux system administration and shell scripting.
  - Good knowledge of cloud platforms (AWS / GCP / Azure) and related services.
  - Basic understanding of networking concepts (DNS, Load Balancing, Firewalls, etc.).
  - Programming experience in Python, Go, or Bash for automation.
Good to Have
- Experience in multi-cloud or hybrid cloud environments.
- Certified Kubernetes Administrator (CKA) certification.
- Experience in cost optimization and capacity planning.
- Understanding of SLOs, SLIs, and SLAs within an SRE framework.
- Contribution to open-source projects or active participation in the SRE community.
