Senior Data Architect – AI & Cloud Infrastructure

Beverage Trade Network

5 - 10 years

Bengaluru

Posted: 05/02/2026

Job Description

Company: CPGBuyerAI (a venture by Beverage Trade Network)

Location: Remote (Bangalore)

Engagement Type: Consultant / Contract


About CPGBuyerAI


CPGBuyerAI is an AI-powered buyer discovery and sales intelligence platform built for wine, beer, spirits, and broader CPG brands.


The platform ingests large volumes of structured and unstructured data, enriches it using AI agents and deterministic rules, and delivers explainable, auditable buyer intelligence to customers.

At CPGBuyerAI, data architecture and AI-readiness are core product capabilities, not back-office functions.


Role Overview


We are hiring a hands-on technical consultant to design, build, and own CPGBuyerAI's data architecture and AI infrastructure. This role is responsible for how data is stored, processed, versioned, queried, and served across the platform, spanning raw ingestion, normalized datasets, enriched buyer intelligence, and AI-generated outputs.


You will define the end-to-end technical foundation that powers AI agents, buyer scoring, signal detection, and customer-facing workflows. This is a build-and-own role with real architectural authority.


Key Responsibilities


Cloud Architecture & Infrastructure


Design and manage cloud infrastructure (AWS, GCP, or Azure)

Define compute, storage, networking, and security architecture

Support containerized and serverless workloads

Ensure scalability, fault tolerance, and cost efficiency

Implement Infrastructure-as-Code (Terraform or equivalent)


Data Storage Architecture


Design storage layers for raw ingested data, cleaned and normalized datasets, enriched buyer and company records, and AI-generated outputs such as reasoning, scores, and signals

Select and manage relational databases (Postgres / MySQL), NoSQL databases, object storage (S3 / GCS), and vector databases (pgvector, Pinecone, Weaviate, FAISS); an illustrative pgvector sketch follows this list

Define data lifecycle policies including versioning, retention, and archival
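
As a rough illustration of the vector-storage option named above, the sketch below shows a pgvector-backed table for buyer embeddings queried from Python with psycopg2. The table and column names (buyer_embeddings, buyer_id, embedding), the 1536-dimension vectors, and the connection details are assumptions for illustration, not the platform's actual schema.

```python
import psycopg2

# Hypothetical connection; real credentials would come from configuration / a secrets manager.
conn = psycopg2.connect("dbname=cpgbuyerai user=app password=secret host=localhost")

# One-time setup: enable the vector extension and create an embeddings table.
# The 1536 dimension is an assumption (typical of common embedding models).
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS buyer_embeddings (
            buyer_id  TEXT PRIMARY KEY,
            embedding vector(1536)
        )
        """
    )


def _as_literal(embedding: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2,...]'."""
    return "[" + ",".join(f"{x:.6f}" for x in embedding) + "]"


def upsert_embedding(buyer_id: str, embedding: list[float]) -> None:
    """Insert or refresh a buyer's embedding."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO buyer_embeddings (buyer_id, embedding)
            VALUES (%s, %s::vector)
            ON CONFLICT (buyer_id) DO UPDATE SET embedding = EXCLUDED.embedding
            """,
            (buyer_id, _as_literal(embedding)),
        )


def nearest_buyers(query_embedding: list[float], k: int = 5) -> list[tuple[str, float]]:
    """Return the k buyers closest to the query embedding by L2 distance (pgvector's <-> operator)."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT buyer_id, embedding <-> %s::vector AS distance
            FROM buyer_embeddings
            ORDER BY distance
            LIMIT %s
            """,
            (_as_literal(query_embedding), k),
        )
        return cur.fetchall()
```

In practice the choice between pgvector and a managed service like Pinecone or Weaviate would hinge on query volume, index size, and how tightly embeddings need to live next to the relational buyer records.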


Data Pipelines & Processing


Build and own ETL / ELT pipelines for APIs, web crawling and scraping outputs, third-party data sources, and customer-uploaded datasets

Implement batch and near-real-time processing

Use orchestration tools such as Airflow, Dagster, or Prefect; an illustrative Airflow sketch follows this list

Ensure pipeline monitoring, retries, and backfills
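
A minimal sketch of how one ingestion pipeline could be orchestrated with Airflow (one of the tools named above), including the retry and backfill concerns from this list. The DAG id, task names, schedule, and retry policy are assumptions for illustration; the `schedule` argument assumes Airflow 2.4 or newer.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw records from a hypothetical source API or crawl output.
    print("extracting raw buyer records")


def transform(**context):
    # Placeholder: clean, normalize, and deduplicate the raw records.
    print("normalizing records")


def load(**context):
    # Placeholder: write the normalized records to the warehouse / enriched layer.
    print("loading to warehouse")


default_args = {
    "owner": "data-platform",
    "retries": 3,                          # automatic retries on transient failures
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="buyer_ingestion_daily",        # hypothetical DAG name
    default_args=default_args,
    start_date=datetime(2026, 1, 1),
    schedule="@daily",                     # batch cadence; near-real-time paths would differ
    catchup=True,                          # permits backfills for missed intervals
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    t1 >> t2 >> t3
```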


AI-Ready Data Design


Design intermediate data layers optimized for LLM prompts, feature extraction, buyer matching, and scoring

Ensure AI outputs are traceable to source data, versioned, reproducible, and auditable for QA and customer explanations; an illustrative lineage record follows this list

Support explainability, including why a buyer was matched or scored
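
One way to make AI outputs traceable and reproducible, as described above, is to persist lineage metadata alongside every generated score or signal. The record below is an illustrative shape only; the field names and defaults are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIOutputRecord:
    """Illustrative lineage envelope stored with every AI-generated output."""
    output_id: str                 # unique id for this generated score / signal
    buyer_id: str                  # entity the output refers to
    output_type: str               # e.g. "match_score" or "buying_signal"
    value: float                   # the score or signal strength itself
    reasoning: str                 # model-produced explanation shown to QA / customers
    source_record_ids: list[str] = field(default_factory=list)  # raw rows the output was derived from
    model_name: str = "unknown"    # which model produced it
    model_version: str = "unknown" # pinned model version for reproducibility
    prompt_version: str = "v0"     # versioned prompt template used
    dataset_snapshot: str = ""     # snapshot / version of the input data layer
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

With a record like this, a QA reviewer or a customer-facing explanation can walk from a score back to the exact source rows, prompt version, and model version that produced it.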


Data Quality, Governance & Security


Implement deduplication and entity resolution; a toy resolution sketch follows this list

Implement confidence scoring, freshness scoring, and source attribution

Manage schema evolution and backward compatibility

Design multi-tenant data isolation

Handle PII securely and design systems to be SOC2-ready
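
A toy sketch of the deduplication and entity-resolution step mentioned above, using only the standard library. Real entity resolution would use richer blocking keys, more attributes, and tuned thresholds; the 0.85 similarity cutoff and the record fields here are assumptions for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip non-alphanumeric noise, and collapse whitespace for comparison."""
    cleaned = "".join(c for c in name.lower() if c.isalnum() or c.isspace())
    return " ".join(cleaned.split())


def same_entity(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Treat two buyer records as one entity if domains match exactly
    or normalized names are sufficiently similar."""
    if a.get("domain") and a.get("domain") == b.get("domain"):
        return True
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


def resolve_entities(records: list[dict]) -> list[list[dict]]:
    """Greedy clustering: assign each record to the first cluster it matches."""
    clusters: list[list[dict]] = []
    for rec in records:
        for cluster in clusters:
            if same_entity(cluster[0], rec):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


if __name__ == "__main__":
    raw = [
        {"name": "Acme Wine Distributors, Inc.", "domain": "acmewine.com"},
        {"name": "ACME Wine Distributors", "domain": None},
        {"name": "Blue Bottle Imports", "domain": "bluebottleimports.com"},
    ]
    for group in resolve_entities(raw):
        print([r["name"] for r in group])
```

The same cluster structure is a natural place to attach the confidence scoring, freshness scoring, and source attribution called out above.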


APIs & Data Serving


Design APIs for internal AI agents, product UI, and external integrations; an illustrative serving endpoint follows this list

Optimize for query performance and low latency

Support large-scale, read-heavy workloads
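
A minimal sketch of a read-optimized serving endpoint of the kind described above, using FastAPI. The routes, response shape, and in-memory store are illustrative stand-ins for the real query layer and caching strategy, not an actual CPGBuyerAI API.

```python
from fastapi import FastAPI, HTTPException, Query
from pydantic import BaseModel

app = FastAPI(title="CPGBuyerAI serving sketch")


class BuyerScore(BaseModel):
    buyer_id: str
    score: float
    signals: list[str]
    explanation: str


# Illustrative in-memory store; the real system would hit a read-optimized
# database or cache tuned for large, read-heavy workloads.
FAKE_STORE: dict[str, BuyerScore] = {
    "b-001": BuyerScore(
        buyer_id="b-001",
        score=0.92,
        signals=["new_import_license", "expanding_wine_portfolio"],
        explanation="Matched on category focus and recent import activity.",
    )
}


@app.get("/buyers/{buyer_id}/score", response_model=BuyerScore)
def get_buyer_score(buyer_id: str):
    """Return the latest score and explanation for a single buyer."""
    record = FAKE_STORE.get(buyer_id)
    if record is None:
        raise HTTPException(status_code=404, detail="buyer not found")
    return record


@app.get("/buyers", response_model=list[BuyerScore])
def list_buyers(
    min_score: float = Query(0.0, ge=0.0, le=1.0),
    limit: int = Query(50, le=500),
):
    """Filtered, size-capped listing for read-heavy consumers such as the product UI."""
    matches = [b for b in FAKE_STORE.values() if b.score >= min_score]
    return matches[:limit]
```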


Reliability & Cost Control


Implement logging, metrics, and alerting

Design backup, recovery, and failover strategies

Monitor and optimize cloud costs, including storage tiers and compute scaling

Required Technical Skills


Cloud & Infrastructure


Deep experience with AWS, GCP, or Azure

Docker and Kubernetes

Infrastructure-as-Code, Terraform preferred


Data Systems


Strong SQL and data modeling skills

Experience with relational, NoSQL, and object storage systems

Production experience with vector databases or embedding storage


AI / ML Support


Experience supporting ML or LLM-based systems

Understanding of feature stores, prompt data, and model inputs and outputs

Strong Python-based data engineering experience


Pipelines & Orchestration


Experience with Airflow, Dagster, Prefect, or similar tools

Familiarity with streaming systems such as Kafka or Google Cloud Pub/Sub is a plus

Experience with observability and monitoring tools


Experience Profile


8 to 12+ years in data architecture, data engineering, or platform engineering

Experience building AI or data platforms in production

Comfortable making architectural trade-offs in fast-moving environments

Hands-on builder, not design-only


What Success Looks Like in the First 6 Months


Stable, scalable cloud and data architecture in production

Clear separation of raw, clean, enriched, and AI-output data layers

Reliable ingestion pipelines with monitoring and QA

AI-ready data systems supporting current and future models

Reduced data quality issues and faster issue resolution


Why Join CPGBuyerAI

Build the core AI data foundation from the ground up

High ownership and architectural authority

Real-world AI and data problems at scale

Backed by Beverage Trade Network's (BTN) global data, events, and media ecosystem

Flexible engagement with meaningful impact
