Sr Engineer, Site Reliability
TMUS Global Solutions
2 - 5 years
Hyderabad
Posted: 31/01/2026
Job Description
What You'll Do:
- Own day-to-day reliability and operational support for assigned cybersecurity platforms and services.
- Design and implement automation scripts and workflows to eliminate manual operations and prevent recurring issues.
- Monitor service health, analyze alerts, and maintain operational KPIs and dashboards.
- Participate in incident response, troubleshooting, and root cause analysis, driving fixes for assigned issues.
- Contribute to service resilience, performance tuning, and capacity planning.
- Build and maintain CI/CD pipelines and support reliable, repeatable deployments.
- Operate and troubleshoot Docker and Kubernetes-based workloads.
- Support cloud-native services on AWS and Azure, including configuration, performance, and cost awareness.
- Maintain and enhance Power BI dashboards for reliability, incident, and automation metrics.
- Fix production bugs and remediate security vulnerabilities and configuration gaps.
- Apply and help evolve SRE practices such as SLIs, SLOs, error budgets, and automation-first operations.
- Collaborate with software engineering and cybersecurity teams to improve operational readiness and security posture.
- Perform additional duties and projects as needed.
What You'll Bring:
- 5+ years of experience in Site Reliability Engineering, DevOps, platform engineering, or operations-focused engineering roles.
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
- Experience supporting production, security-critical systems.
- Hands-on experience with AWS and/or Azure.
- Strong scripting or programming skills using Python, Bash, PowerShell, Go, or Java.
- Experience with CI/CD pipelines, DevOps tooling, and automated deployments.
- Practical experience with Docker and working knowledge of Kubernetes.
- Experience with monitoring, logging, alerting, and operational troubleshooting.
- Familiarity with relational and/or NoSQL databases from an operational perspective.
- Understanding of secure operations, IAM concepts, and vulnerability remediation.
- Ability to work independently while collaborating effectively across teams.
Must Have Skills:
- 5+ years supporting production systems
- Automation and scripting (Python, Bash, PowerShell)
- Hands-on experience with AWS
- CI/CD and DevOps practices
- Docker and container-based operations
- Monitoring, alerting, and incident response
- Secure operations and configuration management
- Day-to-day operational ownership of hosted applications
Nice To Have:
- Power BI dashboard creation and maintenance
- Kubernetes troubleshooting experience
- Experience with SLIs, SLOs, and error budgets
- Infrastructure-as-code (Terraform, ARM, Bicep)
- Experience supporting cybersecurity platforms
- Exposure to AIOps or anomaly detection
- Experience building security automation or SOAR solutions
