Software Platform Engineer
Zyoin Group
2 - 5 years
Bengaluru
Posted: 17/02/2026
Job Description
Software Platform Engineer
About the Team
We are part of a Software Platform Engineering team responsible for building and operating a cloud-native infrastructure platform that powers large-scale HPC and AI workloads across AWS environments.
The team develops internal platform tools and services, including CLI tools, CI/CD automation, Kubernetes operators, and observability systems, to enable engineering teams to deploy, manage, and monitor complex distributed systems efficiently.
We own the entire infrastructure lifecycle, from design to production operations.
Role Overview
As a Software Platform Engineer, you will design and build scalable platform solutions that accelerate AI and HPC workloads. You'll work deeply across cloud infrastructure, Kubernetes, CI/CD, and automation, collaborating with global teams to deliver reliable, secure, and high-performance systems.
Key Responsibilities
- Design and develop scalable, cloud-native solutions for HPC and AI workloads
- Build and maintain internal platform tools and automation frameworks
- Design, implement, and optimize CI/CD pipelines for containerized environments
- Work extensively with Kubernetes, Docker, and Infrastructure as Code (IaC)
- Automate infrastructure provisioning and configuration using GitOps and Ansible
- Implement and enhance observability using OpenTelemetry, Prometheus, and Grafana
- Collaborate across engineering teams to support distributed systems at scale
- Ensure platform reliability, security, and performance across environments
Required Skills & Experience
- 3+ years of hands-on engineering experience
- Proficiency in Python, Go, or Rust
- Strong experience in Linux environments
- Hands-on expertise with Kubernetes and Ansible (mandatory)
- Experience with Docker, GitOps, and Infrastructure as Code
- Cloud experience with AWS, Azure, or GCP
- Strong experience with CI/CD workflows (e.g., GitHub Actions)
- Experience with observability tools: OpenTelemetry, Prometheus, Grafana
- Solid understanding of distributed systems
Nice to Have
- Experience building Kubernetes operators/controllers
- Exposure to HPC or AI infrastructure
- Knowledge of microservices and event-driven architectures
Qualifications
- BSc in Computer Science or an equivalent degree
- Proven experience designing, developing, debugging, and maintaining complex distributed systems
- Excellent communication skills and ability to collaborate across teams and geographies
- Self-driven, adaptable, and eager to learn new technologies
Why Join
- Work on cutting-edge AI and HPC infrastructure
- Own platform systems used at scale
- Collaborate with global engineering teams
- High-impact role with deep technical ownership
