Cloud Architect

PwC India

2 - 5 years

Bengaluru

Posted: 17/12/2025

Job Description

Key responsibilities


1. Architecture and roadmap

Define reference architectures for lakehouse and medallion patterns using Delta Lake, OneLake, and Synapse/Fabric Lakehouse for scalable analytics and AI.

Create domain-driven data models, canonical schemas, and patterns for batch and streaming integration (bronze/silver/gold).
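The bronze/silver/gold flow above can be sketched in miniature. This is a purely illustrative toy, assuming plain Python lists of dicts in place of Delta tables; function names like `to_silver` are made up for this sketch, not part of any Databricks or Fabric API.

```python
def to_bronze(raw_records):
    """Bronze: land records as-is, tagging each with ingestion metadata."""
    return [{**r, "_ingested": True} for r in raw_records]

def to_silver(bronze):
    """Silver: cleanse and conform — drop rows missing the business key,
    normalise the customer name to a canonical form."""
    return [
        {**r, "customer": r["customer"].strip().title()}
        for r in bronze
        if r.get("customer")
    ]

def to_gold(silver):
    """Gold: aggregate to a consumption-ready shape (total per customer)."""
    totals = {}
    for r in silver:
        totals[r["customer"]] = totals.get(r["customer"], 0) + r.get("amount", 0)
    return totals

raw = [
    {"customer": "  acme corp ", "amount": 100},
    {"customer": "acme corp", "amount": 50},
    {"customer": None, "amount": 10},  # rejected at the silver layer
]
gold = to_gold(to_silver(to_bronze(raw)))
print(gold)  # {'Acme Corp': 150}
```

Each layer only reads from the layer below, which is what makes the pattern auditable: bronze preserves the raw record, silver enforces the canonical schema, gold serves analytics.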

2. Platform design and build

Design ingestion frameworks for batch (ADF/Fabric Pipelines) and streaming (Event Hubs, Kafka, IoT Hub) into ADLS/OneLake with Delta and Change Data Capture.

Architect Databricks workloads (PySpark/Scala/SQL) for ETL/ELT, feature engineering, and ML data prep with robust job orchestration and scheduling.

3. Real-time streaming

Lead Structured Streaming architectures in Databricks with exactly-once semantics, watermarking, and stateful aggregations; design Kappa/Lambda where appropriate.

Implement low-latency serving layers and materialized views for near-real-time analytics and operational reporting.
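Watermarking is the core idea behind bounded state in these streaming designs. Below is a hedged, dependency-free sketch of the mechanism that Structured Streaming's `withWatermark` provides: events that arrive later than the watermark behind the maximum observed event time are dropped rather than updating closed windows. The class and its fields are hypothetical, for illustration only.

```python
from collections import defaultdict

class WindowedCounter:
    """Toy tumbling-window counter with a watermark for late events."""

    def __init__(self, window_sec, watermark_sec):
        self.window = window_sec
        self.watermark = watermark_sec
        self.max_event_time = 0
        self.counts = defaultdict(int)  # window start -> event count
        self.dropped_late = 0

    def ingest(self, event_time):
        self.max_event_time = max(self.max_event_time, event_time)
        # Anything older than (max event time - watermark) is too late.
        if event_time < self.max_event_time - self.watermark:
            self.dropped_late += 1
            return
        window_start = event_time - (event_time % self.window)
        self.counts[window_start] += 1

c = WindowedCounter(window_sec=60, watermark_sec=30)
for t in [5, 10, 65, 100, 20]:  # 20 arrives after max=100, older than 100-30
    c.ingest(t)
print(dict(c.counts), c.dropped_late)  # {0: 2, 60: 2} 1
```

The watermark is the knob that trades completeness against state size: a larger watermark admits more stragglers but keeps windows open (and state in memory) longer.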

4. Microsoft Fabric implementation

Establish Fabric workspaces, Lakehouse, Pipelines, Dataflows Gen2, Shortcuts to ADLS/OneLake, and semantic model standards for governed self-service BI.

Define data product patterns integrating Fabric with Databricks and Power BI for governed, reusable datasets.

5. Data governance and security

Implement RBAC/ABAC, Unity Catalog, Purview (lineage, glossary, classifications), encryption, network isolation, and data masking/tokenization.

Define data quality SLAs, expectations, and contracts; embed quality checks, observability, and lineage in pipelines.
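Embedding quality checks in pipelines can be as simple as evaluating a list of expectations against a batch and failing the contract on any breach. The sketch below is in the spirit of Great Expectations or Deequ but has no real dependency; every function name here is invented for illustration.

```python
def expect_not_null(rows, column):
    """Expectation: no row may have a null value in `column`."""
    bad = [r for r in rows if r.get(column) is None]
    return {"expectation": f"{column} not null",
            "passed": not bad, "failures": len(bad)}

def expect_in_range(rows, column, lo, hi):
    """Expectation: every value in `column` falls within [lo, hi]."""
    bad = [r for r in rows if not (lo <= r.get(column, lo) <= hi)]
    return {"expectation": f"{column} in [{lo}, {hi}]",
            "passed": not bad, "failures": len(bad)}

def run_contract(rows, checks):
    """Run all checks; a contract breach is any failed expectation."""
    results = [check(rows) for check in checks]
    return results, all(r["passed"] for r in results)

rows = [{"id": 1, "age": 34}, {"id": 2, "age": 210}, {"id": None, "age": 40}]
results, ok = run_contract(rows, [
    lambda r: expect_not_null(r, "id"),
    lambda r: expect_in_range(r, "age", 0, 120),
])
print(ok)  # False: one null id, one out-of-range age
```

The results list doubles as the observability payload: each record names the expectation and counts its failures, which is what an SLA dashboard or lineage tool would ingest.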

6. DevOps and FinOps

Standardize CI/CD (Azure DevOps/GitHub), environment strategy, IaC (Bicep/Terraform), cluster policies, and workspace baselines.

Optimize cost via right-sized clusters, autoscaling, Photon, Delta optimization/Z-Order, and job scheduling.

7. Delivery leadership

Lead design reviews, threat modeling, performance testing, and production readiness; mentor engineers and partner with product/enterprise architects.

Translate business requirements into technical designs, estimates, and roadmaps; drive stakeholder communication and risk management.


Required skills and experience


8–12 years in data engineering/architecture with 4+ years on the Azure data stack; strong leadership in complex enterprise programs.


Deep expertise


Databricks: PySpark/SQL, Delta Lake, Structured Streaming, Jobs/Workflows, Unity Catalog, cluster policies, performance tuning.

Azure: ADLS Gen2, Event Hubs/Kafka, Azure Functions/Logic Apps, Key Vault, ADF, Synapse; VNETs, Private Endpoints, Managed Identity.

Fabric: Lakehouse, OneLake, Pipelines, Dataflows Gen2, Shortcuts, semantic models, governance integration with Purview and Power BI.


Architecture patterns


Lakehouse, medallion, Data Mesh/data products, CDC with Debezium/Fivetran/ADF mapping data flows, SCD handling, schema evolution.

Batch and streaming design, watermarking, state store management, idempotency, backfills, and late/duplicate data handling.
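SCD (slowly changing dimension) Type 2 handling and idempotency, both named above, fit in one small sketch: when a tracked attribute changes, close the current row and open a new one; re-applying an identical record is a no-op. This is a hypothetical plain-Python illustration, assuming an in-memory list in place of a Delta MERGE.

```python
from datetime import date

def scd2_apply(dim, key, new_attrs, as_of):
    """Apply one record to an SCD Type 2 dimension.

    `dim` holds dict rows with 'key', attribute columns,
    'valid_from', 'valid_to', and 'is_current'.
    """
    current = next((r for r in dim if r["key"] == key and r["is_current"]), None)
    if current and all(current.get(k) == v for k, v in new_attrs.items()):
        return dim  # no change: re-applying the same record is idempotent
    if current:
        current["valid_to"] = as_of       # close the old version
        current["is_current"] = False
    dim.append({"key": key, **new_attrs,  # open the new version
                "valid_from": as_of, "valid_to": None, "is_current": True})
    return dim

dim = []
scd2_apply(dim, "C1", {"city": "Pune"}, date(2024, 1, 1))
scd2_apply(dim, "C1", {"city": "Pune"}, date(2024, 2, 1))       # duplicate -> no-op
scd2_apply(dim, "C1", {"city": "Bengaluru"}, date(2024, 3, 1))  # change -> new row
print(len(dim), dim[-1]["city"], dim[0]["is_current"])  # 2 Bengaluru False
```

The duplicate-as-no-op branch is what makes backfills and replayed CDC batches safe: running the same input twice leaves the dimension unchanged.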


Data management


Dimensional and semantic modeling, Data Vault/Kimball, query performance, partitioning, Z-Order, OPTIMIZE/VACUUM, file sizing.

DQ frameworks (Great Expectations/Deequ), monitoring/observability (Log Analytics, Databricks metrics), SLA/SLO design.
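File sizing is mostly a bin-packing problem: Delta's OPTIMIZE compacts many small files into fewer files near a target size. The greedy planner below is a rough, hypothetical sketch of that idea (not how OPTIMIZE is actually implemented), with a made-up 128 MB default target.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily pack small files into bins no larger than target_mb
    (a single oversized file gets a bin of its own)."""
    bins, current, total = [], [], 0
    for size in sorted(file_sizes_mb):
        if total + size > target_mb and current:
            bins.append(current)          # close the full bin
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        bins.append(current)
    return bins

print(plan_compaction([10, 20, 30, 100, 5]))  # [[5, 10, 20, 30], [100]]
```

Fewer, right-sized files mean fewer tasks and less metadata to scan per query, which is why compaction is paired with partitioning and Z-Ordering in the tuning list above.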


Security and compliance


Purview lineage and classification, Unity Catalog governance, PII/PHI handling, encryption, tokenization; audit, SOC2/ISO, GDPR/DPDP familiarity.


DevOps/IaC and automation


Git-based development, branch strategies, CI/CD for notebooks/SQL/artifacts, IaC for data resources, automated testing.


Communication and leadership


Strong stakeholder engagement, technical writing, solution estimation, and mentoring.


Nice to have

Experience with data products and mesh operating models; product lifecycle and contracts between producer/consumer domains.


ML/feature store integration (Databricks Feature Store), MLOps awareness for data readiness.


Knowledge of dbt, Terraform, Airflow, Confluent, and enterprise SSO/SCIM provisioning with Databricks/Fabric.


Qualifications

Bachelor's/Master's degree in Computer Science, Engineering, or a related field.


Certifications: Azure Solutions Architect Expert, Azure Data Engineer Associate, Databricks Data Engineer Professional/Associate, Microsoft Fabric Data Engineer Associate.
