Senior Data Architect
Deloitte
5 - 10 years
Bengaluru
Posted: 12/04/2026
Job Description
About the Role
The Senior Data Architect is responsible for defining and delivering enterprise-grade data architectures and reference implementations across complex, regulated environments.
You will design scalable, secure, and cost-efficient data platforms and products spanning batch and real-time processing, lakehouse/warehouse, governance, lineage, and master/reference data. You will also enable GenAI/ML outcomes on governed enterprise data (RAG, semantic search, AI-driven insights, and agentic automation), partnering with Engineering, Security, Infrastructure, BI, and Data Science teams to translate business needs into target-state architectures, standards, and execution roadmaps.
Responsibilities
- Architecture, Standards & Roadmaps
  - Define target-state enterprise data architecture and multi-year modernization roadmaps (cloud/hybrid), including migration waves and operating model impacts.
  - Produce high-quality architecture artifacts: logical/physical models, integration patterns, data flows, domain/data product designs, and non-functional requirements.
  - Own the architecture blueprint end-to-end (ingestion → storage → transformation → serving) for analytics and AI.
  - Set engineering standards and define reusable reference architectures, frameworks, templates, and accelerators:
    - Medallion patterns (Bronze/Silver/Gold) or equivalent layered curation
    - Data products, domain boundaries/ownership, and contract-first delivery
    - Data contracts, schema evolution/versioning, and deprecation/backward compatibility
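To illustrate the contract-first and backward-compatibility expectations above, here is a minimal sketch of a compatibility check between two data-contract versions. The contract shape (field name mapped to type and required flag) is an illustrative assumption, not a prescribed standard:

```python
# Minimal sketch: backward-compatibility check between two versions of a
# data contract. A change is treated as backward compatible here only if
# it never removes or retypes an existing field and adds new fields as
# optional. The contract shape (name -> {"type": ..., "required": ...})
# is an illustrative assumption.

def is_backward_compatible(old: dict, new: dict) -> tuple[bool, list[str]]:
    """Return (ok, violations) comparing old and new field maps."""
    violations = []
    for field, spec in old.items():
        if field not in new:
            violations.append(f"removed field: {field}")
        elif new[field]["type"] != spec["type"]:
            violations.append(f"retyped field: {field}")
    for field, spec in new.items():
        if field not in old and spec.get("required", False):
            violations.append(f"new required field: {field}")
    return (not violations, violations)


v1 = {"order_id": {"type": "string", "required": True}}
v2 = {
    "order_id": {"type": "string", "required": True},
    "discount": {"type": "double", "required": False},  # additive, optional
}
v3 = {"order_id": {"type": "int", "required": True}}    # retyped -> breaking

print(is_backward_compatible(v1, v2))  # (True, [])
print(is_backward_compatible(v1, v3))  # (False, ['retyped field: order_id'])
```

A gate like this typically runs in CI so that a producer cannot publish a breaking schema change without an explicit deprecation/versioning step.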
- Ingestion, Integration, CDC & Streaming
  - Standardize ingestion/integration patterns across API, batch, file, and CDC (where applicable).
  - Architect streaming/event-driven platforms at scale: Kafka/Event Hubs/Kinesis with Flink/Spark Structured Streaming/Kafka Streams (or equivalents).
  - Define reliability patterns: idempotency, de-duplication, watermarking/late-data handling, replay/backfill, and operational runbooks.
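Two of the reliability patterns named above (idempotent de-duplication and watermark-based late-data handling) can be sketched as follows. State is kept in memory purely for illustration; a real pipeline would hold it in the stream processor's managed state:

```python
# Minimal sketch of two streaming reliability patterns: idempotent
# de-duplication keyed on an event ID, and a watermark that flags
# late-arriving events for a separate late-data/backfill path.
# In-memory state is an illustrative simplification.

ALLOWED_LATENESS_S = 60  # assumed lateness budget, in seconds

class DedupWatermark:
    def __init__(self):
        self.seen_ids = set()
        self.watermark = 0  # max event time seen minus allowed lateness

    def process(self, event_id: str, event_time_s: int) -> str:
        if event_id in self.seen_ids:
            return "duplicate"           # idempotent: safe under replay
        self.seen_ids.add(event_id)
        if event_time_s < self.watermark:
            return "late"                # route to late-data handling
        self.watermark = max(self.watermark, event_time_s - ALLOWED_LATENESS_S)
        return "accepted"


p = DedupWatermark()
print(p.process("a", 100))   # accepted
print(p.process("a", 100))   # duplicate (replay is a no-op)
print(p.process("b", 200))   # accepted (watermark advances to 140)
print(p.process("c", 120))   # late (event time below watermark)
```

Engines such as Flink and Spark Structured Streaming provide managed equivalents of both mechanisms; the point of the sketch is the contract, not the implementation.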
- Modeling, Curation & Serving (BI + AI)
  - Model and curate data for consumption: curated datasets, dimensional marts, and semantic alignment for BI; fit-for-purpose datasets for ML/GenAI.
  - Define serving patterns: governed SQL, semantic layers, APIs, and activation/reverse ETL (as needed).
  - Establish performance/cost standards: partitioning/clustering, compaction, file sizing, workload isolation, and unit-cost/FinOps guardrails.
  - Standardize storage/table technologies: Parquet/ORC file formats and Delta/Iceberg/Hudi table formats.
- GenAI, RAG & Agentic Solutioning
  - Solution agentic workflows for enterprise automation using Google ADK and/or AutoGen:
    - Tool/function calling, planning vs. execution, memory patterns, and human-in-the-loop approvals
    - Guardrails: prompt-injection defenses, least-privilege tool access, grounding/provenance patterns, and safe fallbacks
  - Define scalable patterns for RAG/semantic search:
    - Document ingestion (normalize/classify/dedupe), chunking strategy, and metadata enrichment/ACL tagging
    - Embeddings pipelines, vector index/retrieval, and secure context delivery
  - Leverage platform-native AI where appropriate:
    - Databricks (e.g., Mosaic AI, model serving; agent accelerators such as Agent Bricks where applicable)
    - Snowflake (Cortex, Snowpark integration, governance-aligned AI consumption)
    - Gemini Enterprise / Google AI suites (where applicable to client standards)
  - Implement knowledge graph / semantic modeling (entity resolution, taxonomy/ontology alignment, hybrid graph + vector retrieval).
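The RAG steps named above (chunking, ACL tagging, embedding, access-controlled retrieval) can be sketched end-to-end as follows. The bag-of-words "embedding" is a toy stand-in for a managed embedding model, and the document names, ACL groups, and chunk size are illustrative assumptions:

```python
# Minimal RAG sketch: chunk documents, attach ACL metadata, embed, and
# retrieve with an access filter applied before ranking. The embedding is
# a toy bag-of-words vector; a real pipeline would call a managed
# embedding model and a vector index.

import math
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingest: each chunk carries provenance and an ACL tag (metadata enrichment).
index = []
for doc_id, text, acl in [
    ("policy-1", "refunds are issued within 30 days", {"finance"}),
    ("policy-2", "vpn access requires security approval", {"it"}),
]:
    for c in chunk(text):
        index.append({"doc": doc_id, "text": c, "acl": acl, "vec": embed(c)})

def retrieve(query: str, groups: set, k: int = 1) -> list[dict]:
    # Access-controlled retrieval: filter by ACL *before* ranking, so a
    # caller can never see context they are not entitled to.
    allowed = [e for e in index if e["acl"] & groups]
    q = embed(query)
    return sorted(allowed, key=lambda e: cosine(q, e["vec"]), reverse=True)[:k]

print(retrieve("refund policy days", {"finance"})[0]["doc"])  # policy-1
```

Filtering on ACLs before similarity ranking is the essential pattern here: it keeps retrieval auditable and prevents entitlement leaks into the LLM context.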
- MLOps / LLMOps / AgentOps Enablement
  - Establish production practices: reproducible training/serving datasets, model registry integration, CI/CD for data and ML/GenAI pipelines, and promotion across environments.
  - Define evaluation and release gating: groundedness/quality metrics, safety checks, regression tests, monitoring/drift signals, and cost/performance baselines.
  - Support model hosting/inference patterns (batch and real-time) and operational monitoring.
- Governance, Security, Compliance & Run-Ready Ops
  - Define and enforce standards for modeling, quality SLAs/SLOs (freshness/latency/completeness), metadata/catalog/lineage, and auditability.
  - Implement compliance-by-design controls: RBAC/ABAC, row/column-level security, masking/tokenization, encryption, retention, and private connectivity patterns.
  - Extend governance to AI usage: approved datasets for AI, access-controlled retrieval, and auditable context usage (per policy).
  - Improve run readiness: observability (freshness/latency/failures), alerting, incident response, DR/backup/restore, and RPO/RTO awareness and testing.
  - Architect and guide implementation of governance platforms (catalog, lineage, stewardship workflows) and ensure alignment with regulations (GDPR, CCPA, HIPAA, as applicable).
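A freshness SLO check of the kind referenced above reduces to a simple comparison per dataset; the sketch below shows the shape of such a check, with dataset names and targets as illustrative assumptions:

```python
# Minimal sketch of a freshness SLO check: compare each dataset's time
# since last successful load against its agreed freshness target and
# emit breaches for alerting. Dataset names and SLO values are
# illustrative assumptions.

FRESHNESS_SLO_MIN = {"orders_gold": 60, "customers_gold": 240}

def freshness_breaches(last_load_min_ago: dict) -> list[str]:
    # A dataset with no recorded load at all is treated as a breach.
    return [name for name, slo in FRESHNESS_SLO_MIN.items()
            if last_load_min_ago.get(name, float("inf")) > slo]

status = {"orders_gold": 45, "customers_gold": 300}
print(freshness_breaches(status))  # ['customers_gold']
```

In practice the same comparison is wired into the observability stack so that SLO breaches page the owning team rather than being discovered by consumers.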
- Master & Reference Data (MDM)
  - Architect MDM and reference data solutions, including domain ownership, golden-record strategies, survivorship rules, and integration patterns.
  - Define how master/reference data is published and consumed across analytical and operational systems.
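The golden-record and survivorship concepts above can be made concrete with a small sketch: given candidate records for the same master entity, each attribute survives from the most trusted source, with recency as the tie-breaker. The source ranking and record shape are illustrative assumptions:

```python
# Minimal sketch of a golden-record survivorship rule: per attribute,
# prefer the most trusted source; break ties by most recent update.
# The source ranking and record shape are illustrative assumptions.

SOURCE_RANK = {"crm": 0, "erp": 1, "web": 2}  # lower = more trusted

def golden_record(candidates: list[dict]) -> dict:
    golden = {}
    attrs = {k for c in candidates for k in c if k not in ("source", "updated")}
    for attr in attrs:
        having = [c for c in candidates if c.get(attr)]
        # Survivorship: best-ranked source wins; ties broken by recency.
        best = min(having, key=lambda c: (SOURCE_RANK[c["source"]], -c["updated"]))
        golden[attr] = best[attr]
    return golden


recs = [
    {"source": "web", "updated": 3, "email": "a@new.example", "phone": "111"},
    {"source": "crm", "updated": 1, "email": "a@old.example"},
]
g = golden_record(recs)
print(g["email"], g["phone"])  # a@old.example 111
```

Note the attribute-level granularity: the CRM wins the email field it carries, while the phone number survives from the only source that has one, which is how golden records typically blend sources rather than picking a single winning record.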
- Stakeholder Partnership & Technical Leadership
  - Act as a primary technical advisor; lead architecture reviews, mentor senior engineers, and drive cross-team alignment and decision-making.
  - Partner with Security, Platform, BI, and Product teams to ensure coherent enterprise-wide solutions.