Sr. Site Reliability Engineer

Crest Data

2 - 5 years

Ahmedabad

Posted: 12/04/2026

Job Description

Job Summary:

Experienced Systems Administrator with a strong foundation in Linux, infrastructure management, and incident response, skilled in monitoring, troubleshooting, and maintaining reliable systems across virtualized and cloud-based environments.


Job Responsibilities

  • Manage and optimize Linux systems with a focus on performance, reliability, and troubleshooting
  • Troubleshoot network issues, including latency, packet drops, and connectivity failures
  • Work on cloud platforms (AWS/GCP/Azure) for deployment and scaling
  • Deploy and manage applications using Docker and Kubernetes (cluster troubleshooting & scaling)
  • Build and maintain monitoring systems using Prometheus, Grafana, and ELK
  • Create dashboards, alerts, and PromQL queries
  • Automate tasks using Python/Bash scripting
  • Manage CI/CD pipelines (Jenkins/GitLab CI)
  • Handle P1/P2 incidents, lead bridges, and perform RCA


Key Skills

  • Strong Linux fundamentals
  • Good understanding of networking (TCP/IP, DNS, HTTP/HTTPS, load balancing)
  • Hands-on experience with Docker & Kubernetes (must-have)
  • Experience with cloud platforms (AWS/GCP/Azure)
  • Knowledge of monitoring tools (Prometheus, Grafana, ELK)
  • Proficiency in Python or Bash scripting
  • Experience in CI/CD tools (Jenkins/GitLab CI)
  • Strong incident management and troubleshooting skills


Good to Have:

  • Exposure to Terraform or Ansible


Qualifications:

  • Bachelor's degree in Computer Science or Engineering (BE/B.Tech), or an MCA or M.Sc (IT).
