Member of Technical Staff
Pure Storage
2 - 5 years
Bengaluru
Posted: 12/02/2026
Job Description
The Testbed Health team is the backbone of our engineering velocity. We own the reliability and availability of the physical and virtual infrastructure that our developers use to test code every day. Our mission is simple: Eliminate "Environment Issues" as a reason for test failure. We build the automation, diagnostics ("doctors"), and self-healing systems that keep thousands of devices ready for action.
The Role
We need a Staff Engineer to act as the technical architect for our Testbed Infrastructure. While you will still write code, your primary focus will be designing the future state of our lab automation. You will solve systemic reliability problems and define how we scale from hundreds of testbeds to thousands. You will be the technical partner to the Engineering Manager and a mentor to the wider team.
What You Will Do
- System Architecture: Design the orchestration layer that manages state, scheduling, and resource allocation across our physical and virtual fleets.
- Solve "The Hard Problems": Tackle systemic issues like resource contention, distributed locking, and race conditions in our provisioning workflows.
- Technical Strategy: Move us from "reactive fixing" to "proactive prevention." Define the roadmap for how we handle next-generation hardware and operating systems.
- Cross-Team Leadership: Partner with QA, SRE, and Feature Development teams to standardize how environments are requested and consumed across the company.
- Mentorship: Set the bar for code quality and design within the team. Mentor senior engineers on distributed systems concepts and troubleshooting techniques.
What We Need From You
- Architectural Experience: You have designed and built tools that manage large-scale infrastructure. You understand the trade-offs between consistency and availability in distributed systems.
- Expert Linux & Systems Knowledge: You have a deep understanding of the Linux kernel, boot processes, and virtualization technologies (KVM/QEMU/Docker).
- Strong Programming: Expert-level Python or Go. You can design APIs and services that are robust and scalable.
- Debugging Authority: You are the person others come to when a problem seems "impossible." You can root cause failures that span hardware, network, and software layers.
- Opinionated but Pragmatic: You have strong views on how infrastructure should be built, but you know how to adapt to business needs and legacy constraints.
