
Senior Data Architect

Deloitte

5 - 10 years

Bengaluru

Posted: 12/04/2026


Job Description

About the Company



The Senior Data Architect is responsible for defining and delivering enterprise-grade data architectures and reference implementations across complex, regulated environments.



About the Role



You will design scalable, secure, and cost-efficient data platforms and products spanning batch and real-time processing, lakehouse/warehouse, governance, lineage, and master/reference data. You will also enable GenAI/ML outcomes on governed enterprise data (RAG, semantic search, AI-driven insights, and agentic automation), partnering with Engineering, Security, Infrastructure, BI, and Data Science teams to translate business needs into target-state architectures, standards, and execution roadmaps.



Responsibilities


Architecture, Standards & Roadmaps

  • Define target-state enterprise data architecture and multi-year modernization roadmaps (cloud/hybrid), including migration waves and operating-model impacts.
  • Produce high-quality architecture artifacts: logical/physical models, integration patterns, data flows, domain/data product designs, and non-functional requirements.
  • Own the architecture blueprint end-to-end (ingestion → storage → transformation → serving) for analytics and AI.
  • Set engineering standards and define reusable reference architectures, frameworks, templates, and accelerators:
    • Medallion patterns (Bronze/Silver/Gold) or equivalent layered curation
    • Data products, domain boundaries/ownership, and contract-first delivery
    • Data contracts, schema evolution/versioning, and deprecation/backward compatibility

Ingestion, Integration, CDC & Streaming

  • Standardize ingestion/integration patterns across API, batch, file, and CDC (where applicable).
  • Architect streaming/event-driven platforms at scale: Kafka/Event Hubs/Kinesis + Flink/Spark Structured Streaming/Kafka Streams (or equivalents).
  • Define reliability patterns: idempotency, deduplication, watermarking/late-data handling, replay/backfill, and operational runbooks.

Modeling, Curation & Serving (BI + AI)

  • Model and curate data for consumption: curated datasets, dimensional marts, and semantic alignment for BI; fit-for-purpose datasets for ML/GenAI.
  • Define serving patterns: governed SQL, semantic layers, APIs, and activation/reverse ETL (as needed).
  • Establish performance/cost standards: partitioning/clustering, compaction, file sizing, workload isolation, and unit-cost/FinOps guardrails.
  • Standardize storage/table technologies: Parquet/ORC and Delta/Iceberg/Hudi.

GenAI, RAG & Agentic Solutioning

  • Solution agentic workflows for enterprise automation using Google ADK and/or AutoGen:
    • Tool/function calling, planning vs. execution, memory patterns, and human-in-the-loop approvals
    • Guardrails: prompt-injection defenses, least-privilege tool access, grounding/provenance patterns, safe fallbacks
  • Define scalable patterns for RAG/semantic search:
    • Document ingestion (normalize/classify/dedupe), chunking strategy, metadata enrichment/ACL tagging
    • Embeddings pipelines, vector index/retrieval, and secure context delivery
  • Leverage platform-native AI where appropriate:
    • Databricks (e.g., Mosaic AI, model serving; agent accelerators such as Agent Bricks where applicable)
    • Snowflake (Cortex, Snowpark integration, governance-aligned AI consumption)
    • Gemini Enterprise / Google AI suites (where applicable to client standards)
  • Implement knowledge-graph / semantic modeling (entity resolution, taxonomy/ontology alignment; hybrid graph + vector retrieval).
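To illustrate the chunking strategy named in the RAG items above, here is a minimal sketch of a fixed-size chunker with overlap. The function name, default sizes, and overlap rule are illustrative assumptions, not part of the role description; production pipelines tune these per document type and embedding model.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    chunk_size and overlap are hypothetical defaults for illustration;
    the overlap keeps context that straddles a chunk boundary retrievable.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks
```

The overlap means the tail of each chunk is repeated at the head of the next, so a sentence split across a boundary still appears whole in at least one chunk.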


MLOps / LLMOps / AgentOps Enablement

  • Establish production practices: reproducible training/serving datasets, registry integration, CI/CD for data and ML/GenAI pipelines, and promotion across environments.
  • Define evaluation and release gating: groundedness/quality metrics, safety checks, regression tests, monitoring/drift signals, and cost/performance baselines.
  • Support model hosting/inference patterns (batch and real-time) and operational monitoring.

Governance, Security, Compliance & Run-Ready Ops

  • Define and enforce standards for modeling, quality SLAs/SLOs (freshness/latency/completeness), metadata/catalog/lineage, and auditability.
  • Implement compliance-by-design controls: RBAC/ABAC concepts, row/column security, masking/tokenization, encryption, retention, and private-connectivity patterns.
  • Extend governance to AI usage: approved datasets for AI, access-controlled retrieval, and auditable context usage (per policy).
  • Improve run readiness: observability (freshness/latency/failures), alerting, incident response, DR/backup/restore, and an RPO/RTO awareness and testing approach.
  • Architect and guide implementation of governance platforms (catalog, lineage, stewardship workflows) and ensure alignment with regulations (GDPR, CCPA, and HIPAA, as applicable).

Master & Reference Data (MDM)

  • Architect MDM and reference-data solutions, including domain ownership, golden-record strategies, survivorship rules, and integration patterns.
  • Define how master/reference data is published and consumed across analytical and operational systems.

Stakeholder Partnership & Technical Leadership

  • Act as a primary technical advisor; lead architecture reviews, mentor senior engineers, and drive cross-team alignment and decision-making.
  • Partner with Security/Platform/BI/Product teams to ensure coherent enterprise-wide solutions.
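As a sketch of the golden-record survivorship rules mentioned under MDM, one common policy is "most recent non-null value wins" per attribute. The function and field names below are hypothetical; real MDM engines support per-attribute survivorship policies, source trust scores, and stewardship overrides.

```python
from datetime import date


def merge_golden_record(records: list[dict]) -> dict:
    """Build a golden record from source records of the same entity.

    Illustrative rule only: for each attribute, keep the non-null value
    from the most recently updated source. Each record is assumed to
    carry an 'updated_at' date (a hypothetical field for this sketch).
    """
    golden: dict = {}
    # Sort oldest first so newer records overwrite older values.
    for rec in sorted(records, key=lambda r: r["updated_at"]):
        for field, value in rec.items():
            if field != "updated_at" and value is not None:
                golden[field] = value
    return golden
```

Note how a null in a newer record does not clobber an older non-null value: the newer source wins only where it actually has data.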
