Platform Engineer (SRE)
D Square Consulting Services Pvt Ltd
2 - 5 years
Bengaluru
Posted: 09/05/2026
Job Description
JD Platform Engineer (SRE)
Experience:10+ years in SRE / DevOps / PlatformEngineering
Work Mode:Remote
PositionType: Contract
Job Summary
We areseeking a highly experienced Platform Engineer Site Reliability Engineer (SRE) to design, build, and operate secure, scalable, and highly available cloud-native platforms on Microsoft Azure. The ideal candidate will have deep hands-on expertise in Azure Kubernetes Service (AKS), Infrastructure as Code, automation, and observability, with a strong focus on reliability and operational excellence. This role combines software engineering and infrastructure skills to ensure resilient platforms, frictionless deployments, and a strong developer experience.
Must-Have Skills
- 10+ years of experience in SRE / DevOps / Platform Engineering roles with production ownership.
- Strong hands-on experience with Microsoft Azure services (networking,compute, storage, security, identity).
- Expertisein Azure Kubernetes Service (AKS) cluster design, administration, troubleshooting, and scaling.
- Strong experience with Docker and Kubernetes ecosystem tools (ingress controllers, Helm, etc.).
- Proficiencyin Infrastructure as Code using Terraform (modules, state management, reusable patterns).
- Experience building andmaintainingCI/CD pipelines (Azure DevOps preferred; GitHub Actions/Jenkins acceptable).
- Strong scripting/programming skills in Python, Go, and/or Bash for automation and tooling.
- Solid understanding of Linux fundamentals, networking concepts, and distributed systems.
- Proven experience in incident management, root cause analysis, and leading production support.
- Hands-on experience implementing observability (logs, metrics, traces, alerts) for platforms and services.
Good to Have Skills
- Experience designing andoperatingmulti-region,highly availablearchitecturesonAzure.
- Knowledge of service mesh, ingress, and traffic management patterns in Kubernetes (e.g., Istio, NGINX, API gateways).
- Experience withGitOpspractices and tools (e.g., Argo CD, Flux).
- Familiarity with security and compliance frameworks in enterprise environments.
- Azure certifications (e.g., AZ-104, AZ-400, AZ-305, or Kubernetes certifications like CKA/CKAD).
- Experience improving developer experience through self-service platforms, templates, and golden paths.
Roles & Responsibilities
Platform Engineering & Azure Infrastructure
- Design, build, andmaintainsecure, scalable, cloud-native platforms on Microsoft Azure for production workloads.
- Architect, provision, and manage Azure Kubernetes Service (AKS) clusters, including cluster upgrades, scaling, and capacity planning.
- Implement Infrastructure as Code (IaC) using Terraform to provision and manage Azure resources consistently and repeatably.
- Design and implement multi-region,highly availablearchitecturesleveragingnative Azure services and best practices.
Kubernetes & Container Orchestration (AKS Focus)
- Deploy, manage, andoptimizeAKS clusters, ensuring reliability, performance, and cost efficiency.
- Enforce Kubernetes best practices including RBAC, network policies, pod security, resource limits/requests, and node management.
- Manage containerized workloads using Docker, including image build, optimization, and registry management.
- Troubleshoot cluster performance, networking, and workload issues across pods, services, and ingress/egress paths.
Reliability & Operations
- Lead incident response for platform-related issues, including on-call participation, triage, mitigation, and communication.
- Conduct post-incident reviews and drive corrective and preventive actions to improve platform reliability.
- Implement and continuously improve observability for platform components (metrics, logs, traces, dashboards, alerts).
- Ensure high availability, minimal downtime, and scalability for platform services through proactive capacity and reliability planning.
Automation & DevOps
- Build andmaintainCI/CD pipelines for infrastructure and application deployments (preferably using Azure DevOps).
- Automate infrastructure provisioning, configuration, deployments, and operational runbooks to reduce manual effort and errors.
- Implement and evolve self-service platform capabilities for development teams (templates, pipelines, standardized environments).
- Work closely with development and security teams to integrate DevOps and SRE best practices across the lifecycle.
Security & Governance
- Implement Azure security best practices, including Managed Identities, Key Vault, RBAC, and network security controls.
- Secure AKS clusters using network policies, pod identity,secretsmanagement, and least-privilege access controls.
- Ensure compliance with enterprise security, governance, and audit requirements across the platform.
- Collaborate with security and compliance teams to address vulnerabilities, harden configurations, andmaintainsecure baselines.
Services you might be interested in
Improve Your Resume Today
Boost your chances with professional resume services!
Get expert-reviewed, ATS-optimized resumes tailored for your experience level. Start your journey now.
