Azure/AWS Agentic AI Architect

Genisys Group

2-5 years

Bengaluru

Posted: 05/02/2026

Job Description

Role Summary:

Lead and deliver high-impact initiatives aligned to the AI & Data hiring plan. Own execution excellence with measurable business value, technical depth, and governance.

Key Outcomes (12-18 months)

Ship production-grade solutions with clear ROI, reliability (SLOs), and security.

Establish engineering standards, pipelines, and observability for repeatable delivery.

Mentor talent; uplift team capability through reviews, playbooks, and hands-on guidance.

Responsibilities

Translate business problems into well-posed technical specifications and architectures.

Lead design reviews, prototype quickly, and harden solutions for scale (1M+ users / high QPS).

Build automated pipelines (CI/CD) and model/data governance across environments.

Define & track KPIs: accuracy/latency/cost, adoption, and compliance readiness.

Partner with Product, Security, Compliance, and Ops to land safe-by-default systems.

Technical Skills

Data architecture: Lakehouse + medallion (Bronze/Silver/Gold) on Delta Lake

Pipelines: Spark/Databricks, Airflow/Dagster, dbt; streaming with Kafka/Kinesis/PubSub

Storage/Warehouses: ADLS/S3/GCS, Snowflake/BigQuery/Redshift, partitioning/Z-ordering, time travel

Quality & governance: Great Expectations, Deequ, Unity Catalog/Glue; lineage and metadata

Performance: file compaction, AQE in Spark, cost-aware design, caching

Security: IAM, fine-grained access, data masking/row-level security
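The quality & governance line above mentions expectation-style validation frameworks such as Great Expectations and Deequ. The idea can be sketched in plain Python; note that every function name here (`check_not_null`, `check_in_range`, `validate`) is illustrative for this sketch and not part of any real library's API.

```python
# Minimal expectation-style data-quality checks, loosely modeled on how
# frameworks like Great Expectations gate a batch of records.
# All names below are illustrative, not a real library API.

def check_not_null(rows, column):
    """Fail if any row has a null/missing value in `column`."""
    failures = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"check": f"not_null({column})", "passed": not failures, "failures": failures}

def check_in_range(rows, column, lo, hi):
    """Fail if any non-null value falls outside [lo, hi]."""
    failures = [i for i, r in enumerate(rows)
                if r.get(column) is not None and not (lo <= r[column] <= hi)]
    return {"check": f"in_range({column})", "passed": not failures, "failures": failures}

def validate(rows, checks):
    """Run all checks; accept the batch only if every check passes."""
    results = [check(rows) for check in checks]
    return all(r["passed"] for r in results), results

rows = [{"amount": 12.5}, {"amount": None}, {"amount": 99.0}]
ok, results = validate(rows, [
    lambda r: check_not_null(r, "amount"),
    lambda r: check_in_range(r, "amount", 0, 100),
])
# ok is False: row 1 has a null amount, so the batch is rejected
```

In a real pipeline these checks would typically run at layer boundaries, with failing rows quarantined rather than silently dropped.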

Architecture & Tooling Stack

Source control & workflow: Git, branching standards, PR reviews, trunk-based delivery.

Containers & orchestration: Docker, Kubernetes, Helm; secrets, configs, RBAC.

Observability: logs, metrics, traces; dashboards with alerting & on-call runbooks.

Data/Model registries: metadata, lineage, versioning; staged promotions.
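The registry line above mentions versioning with staged promotions. A toy in-memory sketch of that lifecycle follows; the stage names (`dev`/`staging`/`production`) and the single-production-version rule are assumptions for illustration, not any specific registry product's behavior.

```python
# Toy model registry showing staged promotion: dev -> staging -> production,
# with at most one production version per model. Illustrative only.

STAGES = ["dev", "staging", "production"]

class Registry:
    def __init__(self):
        self.models = {}  # model name -> list of {"version", "stage"}

    def register(self, name):
        """Register a new version of a model, starting in the dev stage."""
        versions = self.models.setdefault(name, [])
        versions.append({"version": len(versions) + 1, "stage": "dev"})
        return versions[-1]["version"]

    def promote(self, name, version):
        """Move a version one stage forward; demote any prior production version."""
        entry = next(v for v in self.models[name] if v["version"] == version)
        idx = STAGES.index(entry["stage"])
        if idx == len(STAGES) - 1:
            raise ValueError("already in production")
        if STAGES[idx + 1] == "production":
            for v in self.models[name]:
                if v["stage"] == "production":
                    v["stage"] = "staging"  # keep exactly one production version
        entry["stage"] = STAGES[idx + 1]
        return entry["stage"]

reg = Registry()
v1 = reg.register("fraud-model")
reg.promote("fraud-model", v1)   # dev -> staging
reg.promote("fraud-model", v1)   # staging -> production
v2 = reg.register("fraud-model")
reg.promote("fraud-model", v2)   # dev -> staging
reg.promote("fraud-model", v2)   # v2 takes over production; v1 is demoted
```

Production systems would add audit metadata (who promoted, when, against which evaluation) at each transition.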

Performance & Reliability

Define SLAs/SLOs for accuracy, tail latency (p99), throughput, and availability.

Capacity planning with autoscaling; load tests; cache design; graceful degradation.

Cost controls: instance sizing, spot/reserved strategies, storage tiering.
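The SLO bullet above can be made concrete with a small check. This sketch computes a nearest-rank p99 from latency samples and evaluates it against targets; the threshold values are illustrative, not taken from the posting.

```python
# Sketch of a p99 latency / availability SLO check.
# Target values (250 ms, 99.9%) are illustrative assumptions.
import math

def percentile(samples, p):
    """Nearest-rank percentile (p in [0, 100]) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(0, rank - 1)]

def slo_check(latencies_ms, p99_target_ms=250.0, availability_target=0.999, errors=0):
    """Report whether p99 latency and availability meet their targets."""
    p99 = percentile(latencies_ms, 99)
    availability = len(latencies_ms) / (len(latencies_ms) + errors)
    return {
        "p99_ms": p99,
        "p99_ok": p99 <= p99_target_ms,
        "availability": availability,
        "availability_ok": availability >= availability_target,
    }

latencies = [10.0] * 980 + [400.0] * 20  # 2% of requests are slow
report = slo_check(latencies, p99_target_ms=250.0)
# the p99 sample lands on a slow request, so the latency SLO fails
```

A check like this would normally run against windowed metrics (e.g. from the observability stack above) and feed alerting and error-budget tracking.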

Security & Compliance

IAM, network isolation, encryption (KMS), secret rotation.

Threat modeling, dependency scanning, SBOM, supply-chain security.

Domain-regulatory controls (PCI DSS, HIPAA) where applicable; audit readiness.

Qualifications

Bachelor's/Master's in CS/CE/EE/Data Science or equivalent practical experience.

Strong applied programming in Python; familiarity with modern data/ML ecosystems.

Proven track record of shipping and operating systems in production.

Interview Focus Areas

Systems design & trade-offs; failure modes and mitigation.

Hands-on debugging & performance tuning; data quality management.

Security/compliance considerations; stakeholder communication.

Sample Screening Questions

1) Design a low-latency inference service for a fraud model; outline caching, batching, and rollback.

2) Define a strategy to detect drift and trigger retraining with safe deployment.

3) Describe medallion architecture and where data validation should live across layers.
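One common answer shape for question 3 can be sketched in plain Python: Bronze lands raw records untouched, validation gates the Bronze-to-Silver promotion, and Gold holds business aggregates. The function names and the quarantine convention here are assumptions for this sketch, not a prescribed answer.

```python
# Illustrative medallion flow: validation lives at the Bronze -> Silver boundary,
# so raw data is preserved while only quality-checked rows feed Gold aggregates.
# All names and the quarantine convention are assumptions for this sketch.

def to_bronze(raw_records):
    """Bronze: land everything as-is, adding only ingestion metadata."""
    return [{"raw": r, "source": "events"} for r in raw_records]

def to_silver(bronze):
    """Silver: enforce schema and basic quality; quarantine what fails."""
    silver, quarantine = [], []
    for rec in bronze:
        r = rec["raw"]
        if isinstance(r.get("amount"), (int, float)) and r.get("user_id"):
            silver.append({"user_id": r["user_id"], "amount": float(r["amount"])})
        else:
            quarantine.append(rec)
    return silver, quarantine

def to_gold(silver):
    """Gold: business-level aggregate (total spend per user)."""
    totals = {}
    for r in silver:
        totals[r["user_id"]] = totals.get(r["user_id"], 0.0) + r["amount"]
    return totals

raw = [{"user_id": "u1", "amount": 10}, {"user_id": "u1", "amount": "bad"},
       {"user_id": None, "amount": 5}, {"user_id": "u2", "amount": 7.5}]
silver, quarantine = to_silver(to_bronze(raw))
gold = to_gold(silver)
# gold == {"u1": 10.0, "u2": 7.5}; the two malformed records are quarantined
```

The design point the question probes: validating before Silver keeps Bronze replayable, while lighter business-rule checks can still run at the Silver-to-Gold boundary.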
