Staff Observability Engineer: Cloud Platform Engineering
Calix
2 - 5 years
Bengaluru
Posted: 27/02/2025
Job Description
The Cloud Platform Engineering team is responsible for the Platforms, Tools, and CI/CD pipelines at Calix. Our mission is to enable Calix engineers to accelerate the delivery of world-class products while ensuring high availability.
As a Staff Cloud Observability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our cloud-based systems. You will be responsible for designing, implementing, and maintaining observability solutions that provide deep insights into our infrastructure and applications. Your expertise in GCP, OpenTelemetry (OTel), Prometheus, Jaeger/Zipkin, and Grafana will be instrumental in driving our observability strategy and keeping our systems running at their best.
Key Responsibilities:
- Tooling and Integration: Design, configure, and maintain observability tools and platforms, including Prometheus, Grafana, Jaeger/Zipkin, and OpenTelemetry (OTel), to monitor and analyze system performance, availability, and reliability.
- Cloud Expertise: Leverage your deep knowledge of Google Cloud Platform (GCP) or AWS to optimize cloud infrastructure for observability, ensuring seamless integration with monitoring tools.
- Incident Management: Collaborate with cross-functional teams to identify, troubleshoot, and resolve complex issues, ensuring minimal impact on business operations.
- Automation: Develop and implement automation scripts and tools to streamline observability processes, reduce manual intervention, and improve efficiency.
- Performance Optimization: Continuously monitor and analyze system performance, identifying areas for improvement and implementing solutions to enhance scalability and reliability.
- Documentation: Create and maintain detailed documentation of observability processes, tools, and best practices to ensure knowledge sharing and continuity.
- Mentorship: Provide guidance and mentorship to junior engineers, fostering a culture of learning and growth within the team.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degree preferred.
- Minimum of 5 years of experience in cloud observability, monitoring, and performance engineering.
- Expertise in designing, implementing, and supporting the operational and reliability aspects of a large-scale observability and telemetry collection platform, with a focus on performance at scale, real-time monitoring, logging, and alerting.
- Proficiency in Google Cloud Platform (GCP) or AWS, and in the platform's observability tools (e.g., Cloud Monitoring, Cloud Logging, Cloud Trace).
- Strong experience with OpenTelemetry (OTel) for distributed tracing and metrics collection.
- Expertise in Prometheus for monitoring and alerting.
- Advanced knowledge of Grafana for visualization and dashboard creation.
- Experience with a time-series database such as Prometheus, Cortex, or Grafana Mimir.
- Experience with infrastructure as code (IaC) tools such as Terraform or Ansible.
- Familiarity with container orchestration platforms like Kubernetes.
- Strong scripting skills (e.g., Python, Bash) for automation and tooling.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
- Ability to work effectively in a fast-paced, dynamic environment.
- Leadership skills with a proven ability to mentor and guide team members.
Preferred Qualifications:
- Certifications in GCP (e.g., Google Cloud Professional Cloud Architect, Google Cloud Professional Data Engineer).
- Experience with other observability tools such as ELK Stack (Elasticsearch, Logstash, Kibana), or Splunk.
- Knowledge of DevOps practices and CI/CD pipelines.
Location:
- India (flexible hybrid work model: work from the Bangalore office for 20 days per quarter)
About Company
Calix, Inc. is a cloud and software platform company headquartered in San Jose, California. It specializes in providing cloud-based software, systems, and services that enable broadband service providers to simplify operations, deliver exceptional subscriber experiences, and grow their businesses. Calix’s solutions focus on empowering communication service providers to optimize their networks, leverage advanced analytics, and create personalized customer experiences. Known for its innovation in broadband technology, Calix helps its clients transition to next-generation networks, ensuring scalability, efficiency, and improved customer satisfaction.