Agent Engineer
SuprSend
2 - 5 years
Bengaluru
Posted: 12/06/2026
Job Description
Own the AI agents that let teams run SuprSend in plain English: in the dashboard, in Slack, and in their editor.
About SuprSend
SuprSend is notification infrastructure for product and engineering teams. We give companies one platform to design, route, and deliver notifications across email, SMS, push, Slack, WhatsApp, and in-app inbox, with workflows, templates, user preferences, and delivery analytics built in. Developers integrate once and stop maintaining notification plumbing forever.
Why this role
Over the past year we've shipped an AI layer on top of that infrastructure:
- Agent: a chat surface inside the SuprSend dashboard that queries a workspace, builds workflows, templates, and schemas, pulls analytics, and debugs delivery.
- Slack Agent: the same capabilities inside Slack, with a Privacy Filter that keeps PII out of shared channels.
- Agent Plugin & SDK: an MCP server plus agent skills that drop SuprSend into Claude Code and GitHub / VS Code Copilot, so developers manage notifications without leaving their editor.
Next, we're building deeper capabilities so that any AI client can work with SuprSend securely. We're also teaching the agents to work from documents, images, and voice, so a customer can drop in a PRD and get the workflows, segments, templates, and translations to match.
You'll be our first dedicated AI engineer, owning these products day to day and shaping where they go. You'll work directly with the founders and the core engineering team, ship to real customers in production, and have unusually high ownership for the level. This is a builder role: you write the code, run the evals, watch the dashboards, and decide what ships.
What you'll do
- Develop, maintain, and improve our agent products (Agent, Slack Agent, and the Agent Plugin) and the MCP servers and apps behind them.
- Build multimodal capabilities into the agents: turn uploaded documents, images, and (over time) voice into SuprSend assets, so that a dropped-in PRD becomes the workflows, segments, templates, and translations to match.
- Design tool-calling agents that act reliably against live customer workspaces: clean tool design, grounding, error handling, least-privilege tool access, server-side permission enforcement, and human approval on state-mutating actions.
- Own context engineering for the agents (retrieval, memory, and context compaction) so they stay accurate and affordable as conversations and workspaces grow.
- Build and own the evaluation harness across three tiers (fast tool-call checks in CI, nightly LLM-as-judge regression suites, and production drift monitoring), measuring trajectory as well as final output, so changes to prompts, models, and tools ship on evidence instead of vibes.
- Run agents in production: tracing and observability, failure handling, prompt and model version management, and keeping quality, latency, and cost in a healthy place.
- Build and operate MCP servers and apps, including the auth and tooling that let external AI clients use SuprSend safely.
- Partner with product and engineering to turn customer problems into agent capabilities, then close the loop using real usage and eval data.
What you'll bring
- ~2-4 years building software, with meaningful recent time spent shipping LLM-powered features or agents to production, not just prototypes or demos.
- Strong Python. You're comfortable owning a production Python service end to end.
- Hands-on experience with LLM APIs (across providers like OpenAI and Anthropic) and at least one agent framework (LangGraph, LlamaIndex, or comparable): tool / function calling, structured outputs, and multi-step flows.
- Context engineering: you treat the context window as a budget, not an afterthought, covering retrieval, memory, compaction, and curating which tools and data are in scope per request.
- Real experience running agents or LLM features in production: you understand non-determinism, latency and token cost, version management, and what breaks at scale.
- Practical experience with evals: you've built or maintained eval pipelines (offline and online), used LLM-as-judge or rule-based scoring, evaluated the agent's trajectory and not just its final answer, and relied on them to catch regressions and gate releases.
- Security-aware by default: you understand prompt injection and tool poisoning, apply least privilege to tool access, and reach for human approval on anything that mutates customer data.
- Solid engineering fundamentals: APIs, data modeling, testing, observability, and code other people can maintain.
- Product sense and customer empathy. You can decide what's worth building and what "good enough" looks like for a real user.
Bonus points
- Multimodal models: document and image understanding; voice / speech-to-text a plus.
- Model-agnostic design: model routing, handling model upgrades and deprecations, prompt caching, and using small / cheap models for routing and classification.
- Hosted / remote MCP: Streamable HTTP transport, OAuth 2.1 and scoped, short-lived tokens; experience building or consuming MCP servers and clients.
- Memory systems: short- and long-term memory and persistence across threads.
- Agent-interop awareness: A2A and the broader protocol landscape (the Agentic AI Foundation standards).
- Domain and surfaces: the developer-tools or notification / messaging space; Slack apps, IDE / editor extensions, or CLI tooling.
How we work
- In-office in Bengaluru, collaborating closely with a small, senior team that ships fast.
- High autonomy and direct access to founders: your work reaches customers in days, not quarters.
- Pragmatic and developer-first: we care about reliability, good DX, and honest engineering over hype.
- The space moves fast. We value people who evaluate new tools and models on their merits and adopt the right ones without chasing every launch.
