
Staff Engineer - Observability & Cloud Infrastructure

ThoughtSpot

2 - 5 years

Bengaluru

Posted: 08/01/2026

Job Description

About the role:


ThoughtSpot is seeking an experienced Staff Engineer to lead the architecture and evolution of our Cloud Infrastructure and Observability control plane. You will lead the design of a multi-cloud control plane (AWS, GCP, Azure) that powers our Business Intelligence (BI) application, ensuring it is resilient, cost-efficient, and deeply observable. This role is ideal for a distributed systems expert who wants to solve complex challenges like Multi-Cloud Disaster Recovery, AI-Driven Operations, and FinOps-as-Code, while enabling engineering velocity through self-service platforms.


What you will do:


Architect the Next-Gen Observability Stack


  • Build the "Single Pane of Glass": Design and operationalise a cutting-edge observability pipeline (Logs, Metrics, Traces) using Prometheus, ELK/EFK, Kafka, and OpenTelemetry.
  • AI-Powered Operations: Lead the development of a customer-facing Operations Portal that incorporates AI agents and analytics to provide real-time health insights, automated root cause analysis, and QoS visibility to our customers.
  • No-Touch Operations: Drive the platform toward "no-touch/low-touch" operations by implementing self-healing mechanisms and symptom-based alerting.
  • Control Plane Engineering: Architect scalable microservices that orchestrate tenancy, feature flags, and configuration across AWS, GCP, and Azure.


Multi-Cloud & Hybrid Cloud Strategy


  • Drive the architecture and implementation of multi-cloud disaster recovery (DR) frameworks for both multi-tenant and single-tenant SaaS offerings.
  • Create SDLC frameworks that allow for seamless deployment across multiple clouds without requiring redundant testing.
  • Develop an app modernisation framework to migrate applications from legacy infrastructure to modern Kubernetes-based platforms.


Automation & Infrastructure as a Service


  • Implement Infrastructure-as-Code (IaC) solutions using tools such as Terraform, Ansible, and CloudFormation to automate provisioning and deployments.
  • Provide automation and tools for both customer workflows and internal software development lifecycle (SDLC) processes.
  • Integrate open-source technologies and custom-developed modules to build a state-of-the-art infrastructure stack.


Customer-Obsessed Engineering


  • Ensure our observability isn't just watching servers, but watching the Customer Experience. You will instrument key user journeys (Login, Search, Checkout) to detect customer pain before customers file a ticket.


Leadership & Collaboration


  • Provide technical leadership to a team of developers by conducting architecture and code reviews and sharing best practices in cloud-native software development.
  • Lead cross-functional collaborations to ensure infrastructure is built for scalability, performance, and security.
  • Mentor and develop team members, driving a culture of technical excellence.


What you'll bring:


  • Experience: 10+ years of engineering experience, with at least 3 years in a Staff/Principal role scaling enterprise SaaS platforms.
  • Cloud Native Mastery: Deep hands-on expertise with Kubernetes and Docker. You have built and operated large-scale infrastructure on AWS, GCP, or Azure.
  • Coding Proficiency: Expert-level skills in Go (Golang) (preferred), Java, or Python. You can write production-grade microservices and K8s operators.
  • Observability Deep Dive: You understand the internals of monitoring frameworks. You have scaled Prometheus federation, tuned Elasticsearch/Kafka for massive log ingestion, and implemented distributed tracing.
  • IaC Expert: You treat infrastructure as software. Advanced proficiency with Terraform and Ansible is required.
  • Distributed Systems Knowledge: You have a strong grasp of CAP theorem, consensus algorithms (Raft/Paxos), distributed storage, and networking fundamentals.
  • Strategic Thinking: Experience building "Single Pane of Glass" solutions and managing the trade-offs between speed, cost, and reliability in a multi-cloud environment.
