Principal Consultant – Cloud Site Reliability Engineer (SRE)
Genpact
5 - 10 years
Hyderabad
Posted: 9/3/2024
Job Description
Responsibilities:
1. Reliability and Availability:
• Implement best practices for high availability and disaster recovery across cloud environments.
• Monitor system performance, availability, and incident response to ensure minimal downtime.
• Create and maintain robust monitoring and alerting systems.
2. Automation and Infrastructure as Code (IaC):
• Develop and maintain automation scripts and Infrastructure as Code (IaC) templates for provisioning and managing cloud resources.
• Automate routine tasks to increase operational efficiency and reduce manual interventions.
3. Scalability and Performance Optimization:
• Collaborate with development teams to design and implement scalable and performant cloud architectures.
• Conduct performance analysis and tuning to optimize system response times and resource utilization.
4. Incident Response and Troubleshooting:
• Participate in incident response activities, including root cause analysis, resolution, and post-incident reviews.
• Troubleshoot complex issues across the cloud stack and coordinate with relevant teams for resolution.
5. Security and Compliance:
• Implement security best practices and compliance measures in cloud environments.
• Collaborate with security teams to ensure data protection and compliance with industry standards.
6. Capacity Planning:
• Monitor resource utilization and forecast capacity requirements to support business growth.
• Implement scaling strategies to accommodate changing workloads.
7. Documentation and Knowledge Sharing:
• Maintain comprehensive documentation of cloud configurations, processes, and procedures.
• Share knowledge and best practices with team members and contribute to a culture of continuous learning.
Minimum Qualifications / Skills
• Bachelor’s degree in Computer Science, Information Technology, or a related field.
• 5 to 10 years of experience in cloud operations, SRE, or a related role.
• Proficiency in cloud platforms such as AWS, Azure, or Google Cloud.
Desired Characteristics:
• Certification in cloud platforms (e.g., AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, Azure DevOps Engineer Expert).
• Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
• Knowledge of infrastructure monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
• Strong scripting and programming skills (e.g., Python, Bash, Go).
• Familiarity with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI/CD).
• Excellent problem-solving and communication skills.
• Ability to work collaboratively in a cross-functional and fast-paced environment.
Preferred Qualifications / Skills
• As a Cloud Site Reliability Engineer, you will be at the forefront of ensuring the reliability, scalability, and security of our cloud infrastructure. Your technical expertise, automation skills, and commitment to best practices will contribute to the success of our cloud operations, enabling us to deliver high-performance and highly available services to our customers. Join us in this critical role and be part of a dynamic team dedicated to excellence and innovation.
• Strong written, verbal, and presentation skills, with experience in a customer-facing role. In-depth requirements-analysis skills, with good analytical and problem-solving ability, interpersonal effectiveness, and a positive attitude.
• Knowledge of the ITIL service management framework.
About Company
Genpact is a global professional services firm delivering digital transformation by putting digital and data to work to create competitive advantage.