Site Reliability Engineer (SRE) – Core IT Infrastructure

TECEZE

2 - 5 years

Hyderabad

Posted: 29/01/2026

Job Description

Role: Site Reliability Engineer (SRE) – Core IT Infrastructure

Location: Hyderabad

Work mode: On-site (full-time)

Experience: 6+ years


Key Responsibilities


Infrastructure Reliability & Operations

Design, implement, and maintain highly available and fault-tolerant infrastructure

Ensure reliability, performance, scalability, and security of core IT systems

Monitor system health, capacity, and performance using proactive observability practices

Lead incident response, root cause analysis (RCA), and post-incident reviews


Automation & SRE Development

Develop and maintain automation tools, scripts, and frameworks to reduce manual operations

Apply Infrastructure as Code (IaC) principles using tools such as Terraform, Ansible, or CloudFormation

Build self-healing systems and automate repetitive operational tasks

Improve deployment pipelines and operational workflows through engineering solutions


DevOps & Platform Engineering

Collaborate with DevOps, development, and security teams to support CI/CD pipelines

Enable seamless application deployments with minimal downtime

Support container and orchestration platforms (Docker, Kubernetes, OpenShift)

Implement best practices for configuration management and environment consistency


Monitoring, Observability & Performance

Design and maintain monitoring, logging, and alerting systems

Define and track SLIs, SLOs, and SLAs

Optimize system performance, capacity planning, and cost efficiency

Enhance observability using tools such as Prometheus, Grafana, ELK, Datadog, or similar


Security & Compliance

Implement infrastructure security best practices

Collaborate with security teams on vulnerability management and compliance requirements

Ensure secure access, identity management, and audit readiness



Required Skills & Qualifications


Technical Skills

Strong experience in Linux/Unix system administration

Proficiency in programming/scripting (Python, Go, Bash or other shell scripting, or similar)

Experience with cloud platforms (AWS, Azure, or GCP)

Hands-on experience with containerization and orchestration

Knowledge of networking concepts (DNS, TCP/IP, load balancing, firewalls)

Experience with monitoring, logging, and alerting tools
