Senior Data Architect – AI & Cloud Infrastructure

Beverage Trade Network

5 - 10 years

Bengaluru

Posted: 05/02/2026

Job Description

Company: CPGBuyerAI (a venture by Beverage Trade Network)

Location: Remote (Bangalore)

Engagement Type: Consultant / Contract


About CPGBuyerAI


CPGBuyerAI is an AI-powered buyer discovery and sales intelligence platform built for wine, beer, spirits, and broader CPG brands.


The platform ingests large volumes of structured and unstructured data, enriches it using AI agents and deterministic rules, and delivers explainable, auditable buyer intelligence to customers.

At CPGBuyerAI, data architecture and AI-readiness are core product capabilities, not back-office functions.


Role Overview


We are hiring a hands-on technical consultant to design, build, and own CPGBuyerAI's data architecture and AI infrastructure. This role is responsible for how data is stored, processed, versioned, queried, and served across the platform, spanning raw ingestion, normalized datasets, enriched buyer intelligence, and AI-generated outputs.


You will define the end-to-end technical foundation that powers AI agents, buyer scoring, signal detection, and customer-facing workflows. This is a build-and-own role with real architectural authority.


Key Responsibilities


Cloud Architecture & Infrastructure


Design and manage cloud infrastructure (AWS, GCP, or Azure)

Define compute, storage, networking, and security architecture

Support containerized and serverless workloads

Ensure scalability, fault tolerance, and cost efficiency

Implement Infrastructure-as-Code (Terraform or equivalent)


Data Storage Architecture


Design storage layers for raw ingested data, cleaned and normalized datasets, enriched buyer and company records, and AI-generated outputs such as reasoning, scores, and signals

Select and manage relational databases (Postgres / MySQL), NoSQL databases, object storage (S3 / GCS), and vector databases (pgvector, Pinecone, Weaviate, FAISS); an illustrative pgvector sketch follows this list

Define data lifecycle policies including versioning, retention, and archival
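
As a rough illustration of the vector-storage option named above, the sketch below shows a pgvector-backed table for buyer embeddings queried from Python with psycopg2. The table and column names (buyer_embeddings, buyer_id, embedding), the 1536-dimension vectors, and the connection details are assumptions for illustration, not the platform's actual schema.

```python
import psycopg2

# Hypothetical connection; real credentials would come from configuration / a secrets manager.
conn = psycopg2.connect("dbname=cpgbuyerai user=app password=secret host=localhost")

# One-time setup: enable the vector extension and create an embeddings table.
# The 1536 dimension is an assumption (typical of common embedding models).
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS buyer_embeddings (
            buyer_id  TEXT PRIMARY KEY,
            embedding vector(1536)
        )
        """
    )


def _as_literal(embedding: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2,...]'."""
    return "[" + ",".join(f"{x:.6f}" for x in embedding) + "]"


def upsert_embedding(buyer_id: str, embedding: list[float]) -> None:
    """Insert or refresh a buyer's embedding."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO buyer_embeddings (buyer_id, embedding)
            VALUES (%s, %s::vector)
            ON CONFLICT (buyer_id) DO UPDATE SET embedding = EXCLUDED.embedding
            """,
            (buyer_id, _as_literal(embedding)),
        )


def nearest_buyers(query_embedding: list[float], k: int = 5) -> list[tuple[str, float]]:
    """Return the k buyers closest to the query embedding by L2 distance (pgvector's <-> operator)."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT buyer_id, embedding <-> %s::vector AS distance
            FROM buyer_embeddings
            ORDER BY distance
            LIMIT %s
            """,
            (_as_literal(query_embedding), k),
        )
        return cur.fetchall()
```

In practice the choice between pgvector and a managed service like Pinecone or Weaviate would hinge on query volume, index size, and how tightly embeddings need to live next to the relational buyer records.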


Data Pipelines & Processing


Build and own ETL / ELT pipelines for APIs, web crawling and scraping outputs, third-party data sources, and customer-uploaded datasets

Implement batch and near-real-time processing

Use orchestration tools such as Airflow, Dagster, or Prefect; an illustrative Airflow sketch follows this list

Ensure pipeline monitoring, retries, and backfills
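
A minimal sketch of how one ingestion pipeline could be orchestrated with Airflow (one of the tools named above), including the retry and backfill concerns from this list. The DAG id, task names, schedule, and retry policy are assumptions for illustration; the `schedule` argument assumes Airflow 2.4 or newer.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw records from a hypothetical source API or crawl output.
    print("extracting raw buyer records")


def transform(**context):
    # Placeholder: clean, normalize, and deduplicate the raw records.
    print("normalizing records")


def load(**context):
    # Placeholder: write the normalized records to the warehouse / enriched layer.
    print("loading to warehouse")


default_args = {
    "owner": "data-platform",
    "retries": 3,                          # automatic retries on transient failures
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="buyer_ingestion_daily",        # hypothetical DAG name
    default_args=default_args,
    start_date=datetime(2026, 1, 1),
    schedule="@daily",                     # batch cadence; near-real-time paths would differ
    catchup=True,                          # permits backfills for missed intervals
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    t1 >> t2 >> t3
```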


AI-Ready Data Design


Design intermediate data layers optimized for LLM prompts, feature extraction, buyer matching, and scoring

Ensure AI outputs are traceable to source data, versioned, reproducible, and auditable for QA and customer explanations; an illustrative lineage record follows this list

Support explainability, including why a buyer was matched or scored
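
One way to make AI outputs traceable and reproducible, as described above, is to persist lineage metadata alongside every generated score or signal. The record below is an illustrative shape only; the field names and defaults are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIOutputRecord:
    """Illustrative lineage envelope stored with every AI-generated output."""
    output_id: str                 # unique id for this generated score / signal
    buyer_id: str                  # entity the output refers to
    output_type: str               # e.g. "match_score" or "buying_signal"
    value: float                   # the score or signal strength itself
    reasoning: str                 # model-produced explanation shown to QA / customers
    source_record_ids: list[str] = field(default_factory=list)  # raw rows the output was derived from
    model_name: str = "unknown"    # which model produced it
    model_version: str = "unknown" # pinned model version for reproducibility
    prompt_version: str = "v0"     # versioned prompt template used
    dataset_snapshot: str = ""     # snapshot / version of the input data layer
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

With a record like this, a QA reviewer or a customer-facing explanation can walk from a score back to the exact source rows, prompt version, and model version that produced it.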


Data Quality, Governance & Security


Implement deduplication and entity resolution; a toy resolution sketch follows this list

Implement confidence scoring, freshness scoring, and source attribution

Manage schema evolution and backward compatibility

Design multi-tenant data isolation

Handle PII securely and design systems to be SOC2-ready
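
A toy sketch of the deduplication and entity-resolution step mentioned above, using only the standard library. Real entity resolution would use richer blocking keys, more attributes, and tuned thresholds; the 0.85 similarity cutoff and the record fields here are assumptions for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip non-alphanumeric noise, and collapse whitespace for comparison."""
    cleaned = "".join(c for c in name.lower() if c.isalnum() or c.isspace())
    return " ".join(cleaned.split())


def same_entity(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Treat two buyer records as one entity if domains match exactly
    or normalized names are sufficiently similar."""
    if a.get("domain") and a.get("domain") == b.get("domain"):
        return True
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


def resolve_entities(records: list[dict]) -> list[list[dict]]:
    """Greedy clustering: assign each record to the first cluster it matches."""
    clusters: list[list[dict]] = []
    for rec in records:
        for cluster in clusters:
            if same_entity(cluster[0], rec):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


if __name__ == "__main__":
    raw = [
        {"name": "Acme Wine Distributors, Inc.", "domain": "acmewine.com"},
        {"name": "ACME Wine Distributors", "domain": None},
        {"name": "Blue Bottle Imports", "domain": "bluebottleimports.com"},
    ]
    for group in resolve_entities(raw):
        print([r["name"] for r in group])
```

The same cluster structure is a natural place to attach the confidence scoring, freshness scoring, and source attribution called out above.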


APIs & Data Serving


Design APIs for internal AI agents, product UI, and external integrations; an illustrative serving endpoint follows this list

Optimize for query performance and low latency

Support large-scale, read-heavy workloads
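
A minimal sketch of a read-optimized serving endpoint of the kind described above, using FastAPI. The routes, response shape, and in-memory store are illustrative stand-ins for the real query layer and caching strategy, not an actual CPGBuyerAI API.

```python
from fastapi import FastAPI, HTTPException, Query
from pydantic import BaseModel

app = FastAPI(title="CPGBuyerAI serving sketch")


class BuyerScore(BaseModel):
    buyer_id: str
    score: float
    signals: list[str]
    explanation: str


# Illustrative in-memory store; the real system would hit a read-optimized
# database or cache tuned for large, read-heavy workloads.
FAKE_STORE: dict[str, BuyerScore] = {
    "b-001": BuyerScore(
        buyer_id="b-001",
        score=0.92,
        signals=["new_import_license", "expanding_wine_portfolio"],
        explanation="Matched on category focus and recent import activity.",
    )
}


@app.get("/buyers/{buyer_id}/score", response_model=BuyerScore)
def get_buyer_score(buyer_id: str):
    """Return the latest score and explanation for a single buyer."""
    record = FAKE_STORE.get(buyer_id)
    if record is None:
        raise HTTPException(status_code=404, detail="buyer not found")
    return record


@app.get("/buyers", response_model=list[BuyerScore])
def list_buyers(
    min_score: float = Query(0.0, ge=0.0, le=1.0),
    limit: int = Query(50, le=500),
):
    """Filtered, size-capped listing for read-heavy consumers such as the product UI."""
    matches = [b for b in FAKE_STORE.values() if b.score >= min_score]
    return matches[:limit]
```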


Reliability & Cost Control


Implement logging, metrics, and alerting

Design backup, recovery, and failover strategies

Monitor and optimize cloud costs, including storage tiers and compute scaling

Required Technical Skills


Cloud & Infrastructure


Deep experience with AWS, GCP, or Azure

Docker and Kubernetes

Infrastructure-as-Code, Terraform preferred


Data Systems


Strong SQL and data modeling skills

Experience with relational, NoSQL, and object storage systems

Production experience with vector databases or embedding storage


AI / ML Support


Experience supporting ML or LLM-based systems

Understanding of feature stores, prompt data, and model inputs and outputs

Strong Python-based data engineering experience


Pipelines & Orchestration


Experience with Airflow, Dagster, Prefect, or similar tools

Familiarity with streaming systems such as Kafka or Google Cloud Pub/Sub is a plus

Experience with observability and monitoring tools


Experience Profile


8 to 12+ years in data architecture, data engineering, or platform engineering

Experience building AI or data platforms in production

Comfortable making architectural trade-offs in fast-moving environments

Hands-on builder, not design-only


What Success Looks Like in the First 6 Months


Stable, scalable cloud and data architecture in production

Clear separation of raw, clean, enriched, and AI-output data layers

Reliable ingestion pipelines with monitoring and QA

AI-ready data systems supporting current and future models

Reduced data quality issues and faster issue resolution


Why Join CPGBuyerAI

Build the core AI data foundation from the ground up

High ownership and architectural authority

Real-world AI and data problems at scale

Backed by Beverage Trade Network's (BTN) global data, events, and media ecosystem

Flexible engagement with meaningful impact
