Senior Data Architect – AI & Cloud Infrastructure
Beverage Trade Network
5 - 10 years
Bengaluru
Posted: 05/02/2026
Job Description
Company: CPGBuyerAI (a venture by Beverage Trade Network)
Location: Remote (Bangalore)
Engagement Type: Consultant / Contract
About CPGBuyerAI
CPGBuyerAI is an AI-powered buyer discovery and sales intelligence platform built for wine, beer, spirits, and broader CPG brands.
The platform ingests large volumes of structured and unstructured data, enriches it using AI agents and deterministic rules, and delivers explainable, auditable buyer intelligence to customers.
At CPGBuyerAI, data architecture and AI-readiness are core product capabilities, not back-office functions.
Role Overview
We are hiring a hands-on technical consultant to design, build, and own CPGBuyerAI's data architecture and AI infrastructure. This role is responsible for how data is stored, processed, versioned, queried, and served across the platform, spanning raw ingestion, normalized datasets, enriched buyer intelligence, and AI-generated outputs.
You will define the end-to-end technical foundation that powers AI agents, buyer scoring, signal detection, and customer-facing workflows. This is a build-and-own role with real architectural authority.
Key Responsibilities
Cloud Architecture & Infrastructure
Design and manage cloud infrastructure (AWS, GCP, or Azure)
Define compute, storage, networking, and security architecture
Support containerized and serverless workloads
Ensure scalability, fault tolerance, and cost efficiency
Implement Infrastructure-as-Code (Terraform or equivalent)
Data Storage Architecture
Design storage layers for raw ingested data, cleaned and normalized datasets, enriched buyer and company records, and AI-generated outputs such as reasoning, scores, and signals
Select and manage relational databases (Postgres / MySQL), NoSQL databases, object storage (S3 / GCS), and vector databases (pgvector, Pinecone, Weaviate, FAISS)
Define data lifecycle policies including versioning, retention, and archival
Data Pipelines & Processing
Build and own ETL / ELT pipelines for APIs, web crawling and scraping outputs, third-party data sources, and customer-uploaded datasets
Implement batch and near-real-time processing
Use orchestration tools such as Airflow, Dagster, or Prefect
Ensure pipeline monitoring, retries, and backfills
AI-Ready Data Design
Design intermediate data layers optimized for LLM prompts, feature extraction, buyer matching, and scoring
Ensure AI outputs are traceable to source data, versioned, reproducible, and auditable for QA and customer explanations
Support explainability, including why a buyer was matched or scored
Data Quality, Governance & Security
Implement deduplication and entity resolution
Implement confidence scoring, freshness scoring, and source attribution
Manage schema evolution and backward compatibility
Design multi-tenant data isolation
Handle PII securely and design systems to be SOC 2-ready
APIs & Data Serving
Design APIs for internal AI agents, product UI, and external integrations
Optimize for query performance and low latency
Support large-scale, read-heavy workloads
Reliability & Cost Control
Implement logging, metrics, and alerting
Design backup, recovery, and failover strategies
Monitor and optimize cloud costs, including storage tiers and compute scaling
Required Technical Skills
Cloud & Infrastructure
Deep experience with AWS, GCP, or Azure
Docker and Kubernetes
Infrastructure-as-Code, Terraform preferred
Data Systems
Strong SQL and data modeling skills
Experience with relational, NoSQL, and object storage systems
Production experience with vector databases or embedding storage
AI / ML Support
Experience supporting ML or LLM-based systems
Understanding of feature stores, prompt data, and model inputs and outputs
Strong Python-based data engineering experience
Pipelines & Orchestration
Experience with Airflow, Dagster, Prefect, or similar tools
Familiarity with streaming technologies such as Kafka or Pub/Sub is a plus
Experience with observability and monitoring tools
Experience Profile
8 to 12+ years in data architecture, data engineering, or platform engineering
Experience building AI or data platforms in production
Comfortable making architectural trade-offs in fast-moving environments
Hands-on builder, not design-only
What Success Looks Like in the First 6 Months
Stable, scalable cloud and data architecture in production
Clear separation of raw, clean, enriched, and AI-output data layers
Reliable ingestion pipelines with monitoring and QA
AI-ready data systems supporting current and future models
Reduced data quality issues and faster issue resolution
Why Join CPGBuyerAI
Build the core AI data foundation from the ground up
High ownership and architectural authority
Real-world AI and data problems at scale
Backed by BTN's global data, events, and media ecosystem
Flexible engagement with meaningful impact