Cloud Engineer
Insight Global
2 - 5 years
Hyderabad
Posted: 12/02/2026
Job Description
Cloud Engineer - Evergreen AI
Evergreen.AI's Agentic Engineering team builds enterprise-grade agentic applications that transform how organizations operate. As a Cloud Engineer (DevOps), you will own the cloud platform, automation, reliability, security, and cost posture that power these agentic systems at scale. You will enable agent orchestration by providing robust runtime, networking, data/vector infrastructure, and secure integrations so product teams can ship production-ready agent experiences (not just POCs) quickly and safely.
About This Role
- Platform & Runtime Engineering for Agentic Systems
  - Design, provision, and operate Azure landing zones, subscriptions, and AKS clusters for multi-tenant agentic workloads.
  - Implement secure networking: VNets, private endpoints, Azure Firewall, service mesh (Istio/Linkerd), and API gateways for tool/agent ingress.
- Infrastructure as Code & Environment Automation
  - Build and maintain Terraform/Bicep modules; enforce environment parity across dev/test/stage/prod.
  - Implement GitOps/progressive delivery (blue/green, canary) and environment drift detection.
- CI/CD for Applications, Tools, and Agents
  - Create reusable pipelines in GitHub Actions/Azure DevOps for microservices, adapters/tools, vector indexers, and data prep jobs.
  - Manage artifact registries, container build hardening, SBOM publication, and rollout/rollback orchestration.
- Reliability & Observability (SRE)
  - Define SLIs/SLOs and error budgets for agentic services; plan capacity and configure autoscaling (HPA/KEDA).
  - Implement end-to-end telemetry: Azure Monitor, Prometheus, Grafana, OpenTelemetry logs/metrics/traces; build on-call and incident runbooks.
- Security, Compliance & Supply Chain
  - Enforce Entra ID (AAD) RBAC, Managed Identity, Key Vault secrets, network isolation, and zero-trust patterns.
  - Apply policy-as-code (OPA/Conftest), container/image signing (cosign), vulnerability scanning (Trivy), and audit trails supporting SOC2/ISO/GDPR/HIPAA.
- Data & Vector Infrastructure Enablement
  - Operate Azure AI Search or managed vector databases (Pinecone/Weaviate) and manage Kafka/Event Hubs for tool and agent events.
  - Provide durable execution backplanes (Azure Functions/Durable Functions/Queues) and secure connector credentials with rotation.
- Cost, Performance & Resilience
  - Optimize performance and cost (rightsizing, reserved/spot usage); engineer HA/DR patterns and conduct chaos testing.
- Collaboration
  - Partner with AI/Agent engineers to productize agents, toolchains, and orchestration on a compliant, reliable platform; contribute reference blueprints and internal docs.
What You'll Do
- Bachelor's or Master's degree in Business, Engineering, Computer Science, Information Systems, or a related field.
- 5+ years in Cloud/DevOps/SRE, with strong Azure experience.
- Production experience with Kubernetes (AKS), containerization, networking (ingress, DNS, TLS, private link), and Linux.
- IaC (Terraform/Bicep), CI/CD (GitHub Actions/Azure DevOps), and Git/GitOps best practices.
- Observability stack (Azure Monitor, Prometheus, Grafana, OpenTelemetry); incident response and SRE practices.
- Security depth: Entra ID, Key Vault, secrets management, RBAC, policy-as-code, and supply-chain hardening.
- Strong collaboration skills with product, security, and data/AI teams.
Nice-to-Have
- Exposure to agentic frameworks (LangChain, CrewAI, Haystack) and Azure OpenAI.
- Experience with vector databases (Azure AI Search, Pinecone, Weaviate), Kafka/Event Hubs, and service mesh.
- Familiarity with compliance in regulated environments.
