Core Responsibilities

Architecture & System Design: Own the end-to-end design of AI and GenAI systems, from data ingestion and vector indexing to model deployment and inference optimization. Architect scalable LLM/RAG pipelines, multi-agent workflows, and generative AI services that can be reused across client domains. Define enterprise standards for embeddings, prompt orchestration, caching layers, and evaluation pipelines.
MLOps & Production Deployment: Establish repeatable patterns for fine-tuning and deploying ML/LLM models in production. Drive automation through MLOps and AIOps pipelines using MLflow, Kubeflow, Airflow, and KServe. Architect for multi-cloud scalability across Azure, AWS, and GCP. Build strategic PoCs to validate model fit and translate business problems into working AI systems.
Governance, Security & Compliance: Define and enforce AI architecture principles, security policies, and responsible AI guardrails. Implement controls for PII/PHI protection, hallucination risk mitigation, audit logging, and model explainability. Apply zero-trust principles (private networking, API gateways, and identity management) to keep data within secure perimeters.
Collaboration & Technical Leadership: Partner with data engineering, cloud, security, and product teams for end-to-end architectural alignment. Lead build-vs-buy assessments for AI platforms, vector databases, and MLOps tooling. Mentor engineers, conduct architecture reviews, and track the evolving AI landscape to recommend timely adoption of emerging tools.

Technical & Professional Qualifications

AI Architecture Experience: 8+ years in software/AI engineering, platform engineering, or cloud architecture, with at least 4 years of hands-on work on production GenAI or LLM systems.
LLM & Agent Framework Expertise: Deep hands-on experience with LangChain, LlamaIndex, AutoGen, or OpenAI Agents API, with a proven ability to build and deploy multi-agent systems at scale.
MLOps & Engineering Depth: Strong command of MLflow, Kubeflow, Airflow, KServe, Docker, and Kubernetes. Solid Python skills and distributed systems design experience across Azure, AWS, and GCP.
Vector & Search Proficiency: Hands-on experience with vector databases (Pinecone, FAISS, Chroma DB, Weaviate, or Elasticsearch) and a strong understanding of RAG patterns and embedding strategies.
Analytical Thinking: Ability to evaluate foundation model trade-offs, define fine-tuning strategies, and translate complex business problems into scalable AI architectures.
Governance & Security Knowledge: Strong grasp of data governance, PII/PHI handling, OAuth 2.0, zero-trust architecture, and responsible AI frameworks applicable to enterprise environments.
Soft Skills: Exceptional ability to communicate architectural decisions to both technical teams and business stakeholders, with a focus on clarity, pragmatism, and long-term system thinking.

Good to Have

Experience delivering AI solutions in IT services or multi-client consulting environments.
Hands-on experience with enterprise platforms such as Salesforce, SAP, or ServiceNow.
Knowledge of LLMOps, model observability (Datadog, Grafana, OpenTelemetry), and GPU cost optimization.
Experience with Azure AI Foundry, AWS SageMaker, or GCP Vertex AI at production scale.

Tech Stack

LLM & Agentic Stack: LangChain/LangGraph, OpenAI/Anthropic APIs, Pydantic AI, Langfuse (observability), and a vector database (Pinecone/Chroma).
Deep Learning & Research Stack: Python, PyTorch, Hugging Face Transformers, NumPy.
Enterprise Production Stack: Python, TensorFlow/Keras, Docker, Kubernetes, AWS SageMaker/Vertex AI.
Data & Analytics Stack: Apache Spark (Databricks), Pandas, SQL, Kafka.

AI Architect

SaasAnt

Job Description