Site Reliability Engineer

HappieHire

2 - 5 years

Bengaluru

Posted: 26/02/2026

Job Description

Role: Site Reliability Engineer (SRE)

Location: Hyderabad Marriott Office

Work Mode: 5 Days Work From Office

Experience Required: 7+ Years

Notice Period: Immediate to 15 Days

Budget: Up to 25 LPA


About the Role

We are looking for a highly skilled Site Reliability Engineer (SRE) to manage and enhance the reliability, scalability, and performance of cloud-based production systems. The ideal candidate will have strong experience with AWS, automation, infrastructure as code, and monitoring tools, and will use them to keep systems highly available and resilient.


Key Responsibilities:

  • Design, implement, and maintain scalable and highly available infrastructure on AWS.
  • Automate infrastructure provisioning and configuration using Terraform and Ansible.
  • Develop automation scripts using Python and Bash for operational efficiency.
  • Deploy, manage, and optimize containerized workloads using Kubernetes.
  • Design, implement, and maintain robust CI/CD pipelines for reliable deployments.
  • Monitor system health, performance, and availability using tools like Dynatrace, Prometheus, Grafana, and the ELK stack.
  • Perform incident management and root cause analysis, and implement preventive solutions.
  • Collaborate with development and engineering teams to improve system reliability and performance.
  • Ensure adherence to cloud security, reliability, and operational best practices.


Required Skills:

  • 7+ years of experience in Site Reliability Engineering, DevOps, or related roles.
  • Strong hands-on expertise in AWS services and scalable architecture design.
  • Proficiency in Python and Bash scripting for automation.
  • Hands-on experience with Terraform, Kubernetes, and Ansible.
  • Strong experience in CI/CD pipeline design and release engineering.
  • Experience with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, ELK, or similar platforms.
  • Strong troubleshooting, analytical, and problem-solving skills.


Preferred Qualifications:

  • Prior experience working as a Site Reliability Engineer (SRE).
  • Experience managing production environments with high availability.
  • Strong understanding of cloud security and reliability best practices.
