At Ascendion, we are committed to delivering highly reliable, scalable, and intelligent application-managed services for our customers. We are seeking an Application Support SME - SRE Leader who will function as both a Subject Matter Expert (SME) and a techno-managerial leader.

This role requires someone who blends deep technical expertise with strong leadership instincts and who can understand complex application architectures while driving operational excellence across teams, customers, and delivery functions. The ideal candidate has hands-on knowledge of microservices, application lifecycle management, logging, tracing, observability, and telemetry, and has led large, complex 24x7 operations.

You will act as the strategic and technical authority for your domain, collaborating with customers, engineering teams, SRE/DevOps groups, and Delivery Leaders to design, implement, and mature reliable application operations.

Personal Leadership

The successful candidate:

Demonstrates strong techno-managerial leadership, balancing engineering depth with delivery oversight
Acts as the SME for application operations, observability, SRE practices, and resilience engineering
Influences cross-functional teams and drives alignment in matrixed environments
Effectively communicates with executives, architects, engineers, and customer stakeholders
Shows a strong commitment to quality, operational excellence, and customer success

Key Responsibilities

Techno-Managerial Leadership & SME Responsibilities

Serve as the primary SME for application-managed services, microservices operations, observability, and reliability engineering.
Guide teams on best practices for logging, distributed tracing, metrics, alerting, and telemetry implementation.
Provide technical direction during critical incidents, major outages, and complex problem investigations.
Mentor engineering teams and uplift their capabilities in application operations, SRE, automation, and monitoring strategies.
Drive design reviews, operational readiness assessments, and reliability improvements across applications.

Service Delivery & Operational Excellence

Own 24x7 service delivery for application operations, ensuring reliability, scalability, and performance.
Lead the establishment and enforcement of SLOs, SLAs, error budgets, runbooks, and operational frameworks.
Ensure systems, methodologies, and monitoring strategies are mature, automated, and scalable.
Oversee incident response, problem management, and root-cause analysis with a focus on continuous improvement.

Customer, Stakeholder, and Cross-Functional Leadership

Act as a trusted advisor to customer leadership, guiding them on application health, modernization, and operational maturity.
Address escalations, communicate status updates, and set clear expectations for service delivery.
Build strong relationships across engineering, DevOps, architecture, and business teams.
Provide SME-level insight during pre-sales discussions, solution design, and proposal reviews.

Project, Change & Lifecycle Management

Oversee application lifecycle governance across the deploy, operate, measure, and enhance phases.
Ensure changes to application environments are controlled, tested, and aligned to reliability standards.
Maintain comprehensive documentation, governance artifacts, and operational deliverables.

Performance & Quality Management

Define and monitor critical operational metrics (SLA, SLO, MTTR, MTTD, throughput, capacity).
Drive continuous improvement initiatives including automation, shift-left, cost optimization, and self-healing capabilities.
Produce accurate and actionable reporting for leadership and customers.
Lead performance management, recruitment, mentoring, and training of operational teams.

Technical Skill Requirements

Deep knowledge of microservices, APIs, cloud-native architectures, and containerized environments.
Proven experience running large-scale, highly available, complex application ecosystems in 24x7 environments.
Strong grounding in DevOps and SRE principles: CI/CD, automation, resiliency patterns, deployment strategies.
Strong hands-on understanding of application observability: distributed tracing, centralized logging systems, metrics instrumentation, telemetry correlation, and APM platforms (Datadog, Dynatrace, New Relic, AppDynamics, etc.).
Experience with cloud platforms (Azure, AWS, GCP) and hybrid architectures.
Strong ITIL process proficiency (Incident, Problem, Change).
Familiarity with infrastructure components supporting apps (networking, compute, databases, messaging, caching).

Preferred Qualifications

ITIL certification
Good understanding of the SDLC, application architecture, and microservices
SRE, DevOps, or cloud certification (Azure/AWS/GCP)
Experience operating or supporting production microservices at scale
Exposure to chaos engineering, resiliency testing, and reliability automation
Prior experience handling large enterprise customers and multi-vendor delivery models
Must be proficient and certified in ITIL and have excellent practical ITSM knowledge.
Ability to advise on the enhancement/evaluation of changes in the customer production environment.

