Site Reliability Engineer
HiroJet
3 - 5 years
Bengaluru
Posted: 27/04/2026
Job Description
Role - Site Reliability Engineer
Location - In Office (Bengaluru, India)
About Company - This is an AI voice automation platform that enables businesses to streamline high-volume, repetitive communication across customer support, sales, and recruitment functions. Our voice agents are designed to efficiently handle tasks such as query resolution, order confirmation, delivery scheduling, lead qualification, follow-ups, candidate screening, and interview coordination.
ROLE SUMMARY-
Own the reliability and infrastructure layer that keeps our real-time voice platform running. We are moving to Kubernetes as a core part of our infrastructure evolution, and this person will be central to making that happen. This role is hands-on, high ownership, and built for someone who is comfortable being the person others lean on when things go wrong at 2 am.
RESPONSIBILITIES-
Infrastructure and Kubernetes Migration
Lead and execute the migration of existing services to Kubernetes
Design and maintain cluster architecture, workload configurations, namespaces, and resource policies
Manage infrastructure as code using Terraform across Azure and GCP environments
Keep environments consistent across development, staging, and production
CI/CD and Developer Experience
Build and maintain CI/CD pipelines that let the engineering team ship fast without breaking things
Work closely with backend and full-stack engineers to make deployments smooth, repeatable, and safe
Improve deployment reliability through better rollout strategies, rollback mechanisms, and environment parity
Monitoring and Alerting
Own the observability stack using Grafana and Prometheus
Build dashboards and alerts that give the team real visibility into system health, not just vanity metrics
Track and improve p95 and p99 latency across critical services
Proactively spot degradation before it becomes an incident
Incident Response
Be the first line of defence when production issues come up
Own incident response from detection through to resolution and post-mortem
Build runbooks and processes that reduce mean time to recovery over time
Make sure the same incident does not happen twice
MUST HAVE-
1 to 3 years of experience in an SRE, DevOps, or infrastructure engineering role
Hands-on experience with Kubernetes: you have deployed and managed workloads in production, not just in tutorials
Working knowledge of Terraform or similar IaC tools
Experience building or maintaining CI/CD pipelines
Familiarity with Grafana and Prometheus for monitoring and alerting
Strong debugging instincts: you can trace a production problem across logs, metrics, and traces and get to the root cause
Ownership mindset: You treat uptime like it is your personal responsibility
NICE TO HAVE-
Experience running infrastructure on Azure and GCP
Exposure to real-time or low-latency systems, voice AI, or streaming workloads
Familiarity with service mesh tools like Istio or Linkerd
Experience writing automation scripts in Python or Bash to reduce toil
THE AI-FIRST REQUIREMENT-
We expect every engineer to actively use AI tools in their daily work. For this role, that means using AI to write infrastructure code faster, debug incidents more effectively, generate runbook drafts, and build internal tooling without always needing to loop in another engineer. If AI-assisted engineering is already part of how you work, you will fit right in.
WHAT SUCCESS LOOKS LIKE IN 6 MONTHS-
Kubernetes migration complete and stable in production
CI/CD pipelines running reliably with fast, safe deployments across all services
Grafana dashboards and Prometheus alerts giving the team clear, actionable visibility
Incident response times down and post-mortems happening consistently after every major issue
Runbooks in place so the team is not dependent on one person knowing everything
