Azure/AWS Agentic AI Architect
Genisys Group
2 - 5 years
Bengaluru
Posted: 05/02/2026
Job Description
Role Summary:
Lead and deliver high-impact initiatives aligned to the AI & Data hiring plan. Own execution excellence with measurable business value, technical depth, and governance.
Key Outcomes (12-18 months)
Ship production-grade solutions with clear ROI, reliability (SLOs), and security.
Establish engineering standards, pipelines, and observability for repeatable delivery.
Mentor talent; uplift team capability through reviews, playbooks, and hands-on guidance.
Responsibilities
Translate business problems into well-posed technical specifications and architectures.
Lead design reviews, prototype quickly, and harden solutions for scale (1M+ users / high QPS).
Build automated pipelines (CI/CD) and model/data governance across environments.
Define & track KPIs: accuracy/latency/cost, adoption, and compliance readiness.
Partner with Product, Security, Compliance, and Ops to land safe-by-default systems.
Technical Skills
Data architecture: Lakehouse + medallion (Bronze/Silver/Gold) on Delta Lake
Pipelines: Spark/Databricks, Airflow/Dagster, dbt; streaming with Kafka/Kinesis/PubSub
Storage/Warehouses: ADLS/S3/GCS, Snowflake/BigQuery/Redshift, partitioning/Z-ordering, time travel
Quality & governance: Great Expectations, Deequ, Unity Catalog/Glue; lineage and metadata
Performance: file compaction, AQE in Spark, cost-aware design, caching
Security: IAM, fine-grained access, data masking/row-level security
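Tools like Great Expectations and Deequ, named above, are built around declarative "expectations" run against each data batch. As a minimal illustration of that pattern only (not the Great Expectations API), a hand-rolled Python sketch with hypothetical function and field names:

```python
# Minimal data-quality "expectation" sketch in plain Python, illustrating the
# pattern that tools like Great Expectations/Deequ formalize. All function and
# field names here are hypothetical, for illustration only.

def expect_no_nulls(rows, column):
    """Fail if any row has None in the given column; report failing row indexes."""
    bad = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"expectation": f"no_nulls({column})", "success": not bad, "failed_rows": bad}

def expect_values_between(rows, column, low, high):
    """Fail if any non-null value falls outside [low, high]."""
    bad = [i for i, r in enumerate(rows)
           if r.get(column) is not None and not (low <= r[column] <= high)]
    return {"expectation": f"between({column})", "success": not bad, "failed_rows": bad}

def validate(rows, checks):
    """Run all checks; the batch passes only if every expectation passes."""
    results = [check(rows) for check in checks]
    return all(r["success"] for r in results), results

rows = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": None},   # fails the null check
    {"order_id": 3, "amount": -5.0},   # fails the range check
]
ok, results = validate(rows, [
    lambda r: expect_no_nulls(r, "amount"),
    lambda r: expect_values_between(r, "amount", 0, 10_000),
])
```

In a medallion layout, checks like these typically gate promotion from one layer to the next, quarantining failing batches rather than silently dropping rows.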
Architecture & Tooling Stack
Source control & workflow: Git, branching standards, PR reviews, trunk-based delivery.
Containers & orchestration: Docker, Kubernetes, Helm; secrets, configs, RBAC.
Observability: logs, metrics, traces; dashboards with alerting & on-call runbooks.
Data/Model registries: metadata, lineage, versioning; staged promotions.
Performance & Reliability
Define SLAs/SLOs for accuracy, tail latency (p99), throughput, and availability.
Capacity planning with autoscaling; load tests; cache design; graceful degradation.
Cost controls: instance sizing, spot/reserved strategies, storage tiering.
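A p99 SLO check reduces to computing a tail percentile over observed latencies and comparing it to a target. A small self-contained sketch (the 250 ms target and the sample values are made up for illustration, not figures from this role):

```python
# Sketch: check a tail-latency SLO (p99) from raw request latencies.
# The 250 ms target and the sample data are illustrative assumptions.

def percentile(samples, q):
    """Nearest-rank percentile: smallest value with >= q% of samples at or below it."""
    s = sorted(samples)
    k = -(-len(s) * q // 100) - 1  # ceil(n * q / 100) - 1
    return s[max(0, int(k))]

latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 90, 400]  # fabricated samples
p99 = percentile(latencies_ms, 99)   # dominated by the single 400 ms outlier
slo_ok = p99 <= 250
```

The point of tracking p99 rather than the mean is exactly what this example shows: one slow outlier blows the tail budget even when the average looks healthy.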
Security & Compliance
IAM, network isolation, encryption (KMS), secret rotation.
Threat modeling, dependency scanning, SBOM, supply-chain security.
Domain-regulatory controls (PCI DSS, HIPAA) where applicable; audit readiness.
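Data masking (listed under Security above) usually means redacting or tokenizing PII columns before data leaves a restricted zone. A minimal sketch, with illustrative column names and rules (redact card numbers outright, replace emails with a stable hash so joins still work):

```python
# Sketch: simple column masking for PII before data leaves a restricted zone.
# Column names and masking rules are illustrative assumptions, not a standard.
import hashlib

def mask_row(row, mask_columns, hash_columns):
    """Return a copy with some columns redacted and others replaced by a stable hash."""
    out = dict(row)
    for col in mask_columns:
        if out.get(col) is not None:
            out[col] = "***"  # full redaction
    for col in hash_columns:
        if out.get(col) is not None:
            # stable pseudonym: same input always maps to the same token
            out[col] = hashlib.sha256(str(out[col]).encode()).hexdigest()[:12]
    return out

row = {"user_id": 42, "email": "a@example.com", "card_number": "4111111111111111"}
masked = mask_row(row, mask_columns=["card_number"], hash_columns=["email"])
```

In practice this logic lives in a governed view or policy layer (e.g. the fine-grained access controls noted above) rather than application code, so masking cannot be bypassed.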
Qualifications
Bachelor's/Master's in CS/CE/EE/Data Science or equivalent practical experience.
Strong applied programming in Python; familiarity with modern data/ML ecosystems.
Proven track record of shipping and operating systems in production.
Interview Focus Areas
Systems design & trade-offs; failure modes and mitigation.
Hands-on debugging & performance tuning; data quality management.
Security/compliance considerations; stakeholder communication.
Sample Screening Questions
1) Design a low-latency inference service for a fraud model; outline caching, batching, and rollback.
2) Define a strategy to detect drift and trigger retraining with safe deployment.
3) Describe medallion architecture and where data validation should live across layers.
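For screening question 2, one common drift signal a candidate might sketch is the Population Stability Index (PSI) between the training-time feature distribution and recent production data, computed over fixed bins. A minimal version (the 0.2 retraining threshold is a common rule of thumb, not a universal standard, and the histograms are fabricated):

```python
# Sketch of one drift signal: Population Stability Index (PSI) between a
# reference (training) histogram and a production histogram over the same bins.
# The 0.2 threshold is a common rule of thumb; all data here is fabricated.
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """PSI = sum over bins of (a% - e%) * ln(a% / e%), with smoothing for empty bins."""
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total

train_bins = [100, 200, 400, 200, 100]       # reference histogram
prod_bins_stable = [98, 205, 395, 202, 100]  # near-identical shape
prod_bins_shifted = [300, 300, 200, 150, 50] # mass moved into low bins

needs_retrain = psi(train_bins, prod_bins_shifted) > 0.2
```

A full answer would pair a signal like this with safe deployment: retrain, shadow or canary the new model, and promote only if online KPIs hold.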
