Site Reliability Engineer
Datum Technologies Group
2 - 5 years
Chennai
Posted: 10/12/2025
Job Description
Job Title: Site Reliability Engineer (SRE), AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog
Job Summary:
We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and large-scale distributed systems.
Responsibilities:
Manage and optimize cloud infrastructure using AWS IaaS.
Implement SRE practices to enhance reliability, performance, and SDLC efficiency.
Build and maintain CI/CD pipelines (Jenkins, GitLab, Terraform).
Work with containers and orchestration (Docker, ECS, Kubernetes).
Troubleshoot performance, networking, and distributed system issues.
Drive DevOps and QA best practices across teams.
Implement observability: SLIs/SLOs, error budgets, monitoring, logging, tracing, and alerting.
Lead incident resolution and perform root cause analysis (RCA).
Automate tasks using Python/Bash/PowerShell.
Work with minimal supervision and collaborate effectively with cross-functional teams.
Qualifications:
Strong AWS cloud experience
Proven DevOps & SRE implementation skills
Good understanding of Linux, networking, and distributed systems
Hands-on experience with observability tools
Strong scripting and automation expertise
Excellent communication and teamwork skills
