Site Reliability Engineer

ITMC Systems, Inc

2 - 5 years

Pune

Posted: 18/03/2026

Job Description

Open Role || SRE with OpenShift || Hyderabad/Pune


Role: OpenShift & Site Reliability Engineering (SRE)


Job Description:

Provide consultation to clients, based on their requirements, on setting up the OpenShift platform (on-premises, cloud, or PaaS), including resource sizing.

Configure OCP after it has been deployed by the client or the DevOps team.

Ensure the OCP cluster is resilient to single points of failure (SPoF).

Perform Day 2 configuration, such as installing the CSI Blob driver on an ARO cluster to consume Azure Blob Storage.

Deploy ODF and customize or create Storage Classes based on reclaim policy and volume binding mode requirements.

Integrate VMware with OCP so that the cluster can leverage the underlying hardware for cluster autoscaling and storage consumption.

Deploy Thanos to retain cluster and workload metrics for longer durations, as ARO monitoring has limitations.

Configure monitoring rules and Alertmanager to send notifications of cluster and application failures via email or ticketing tools.

Configure the OADP backup tool and test application backup/restore to meet RTO and RPO targets.

Ensure clusters comply with security standards and are free from vulnerabilities.

Perform cluster upgrades at regular intervals based on OCP EOL dates and application compatibility.

Automate OCP Day 2 configurations using Ansible playbooks.


Maximo Application:

Support IBM MAS deployment, configuration, troubleshooting, and backup and restore validation.

Configure TLS certificates

Amend network policies on the application namespace to allow communication with the Kubernetes API, so that app services can be stopped and started using oc commands via a CronJob.

Troubleshoot issues during Tekton pipeline execution for IBM MAS application install/upgrade activities.


Azure Platform:

Azure storage management: create and manage Blob/File storage based on application and high-availability requirements (LRS or ZRS), enable cleanup policies to enforce data retention for logs/metrics on Blob storage, and enable Azure Backup for file storage.

Integrate ARO with Azure Arc to leverage Azure Monitor and Log Analytics for alerting.

Rotate Azure service principal credentials for ARO clusters before they expire.


Responsibilities:

Deploy and manage OpenShift (RHOCP 4.x) environments from scratch on bare metal using Ansible scripts.

Update inventories and deployment methods in Ansible playbooks as per the deployment environment.

Build and manage Tanzu Kubernetes Grid on the bare-metal VMware platform.

Multi-cluster administration: a management cluster for FCAPS components and a resource cluster for application components.

OpenShift cluster backup/restore using Trilio.

Fix OCP compliance issues based on findings from the Compliance Operator and Prisma security tools.

Prepare a mirror repository for OCP 4 deployment on restricted networks.

Onboard 4G and 5G CU applications using Helm charts on both RHOCP and VMware TKG.

Cluster tuning for application onboarding, such as ODF/OCS, Performance Addon, Multus, SR-IOV, NMState, KubeVirt, Quay, and GitLab installation.

FCAPS installation for logging (Elasticsearch-Fluentd-Kibana), certificate management, and authentication (RH IdM).

Work with AWS services: EC2, S3, Route 53, EBS, IAM, ELB, CloudWatch, Auto Scaling, and VPC.

Prepare HLDs for OCP/TKG single-node and HA setups.
