Data Engineer — LLM & Agentic Systems (RAG, Observability, Feedback Loops)
AIBound
2 - 5 years
Bengaluru
Posted: 12/02/2026
Job Description
Company Description
AIBound is revolutionizing AI security with the industry's first unified control plane for secure AI adoption. We discover, test, and protect each AI model, agent, and identity, catching AI risks before impact so enterprises can innovate safely and at speed. As AI adoption outpaces security across global organizations, AIBound eliminates the dangerous gap between innovation and protection.
Led by our CEO and founder, the former CISO at Palo Alto Networks and Workday, AIBound brings together a world-class team of cybersecurity veterans who have secured some of the world's most advanced enterprises. We're a fast-growing company backed by leading investors, positioned at the critical intersection of AI innovation and enterprise security, one of the most strategic technology frontiers of our generation.
Join us in building the future of AI security, where cutting-edge artificial intelligence meets battle-tested cybersecurity expertise.
Role
At AIBound, our AI security products depend on high-integrity, traceable, and continuously improving data flows, from enterprise content ingestion to retrieval, agent memory, and feedback signals. We are hiring a Data Engineer to build the data backbone for LLM and agentic workflows in production: reliable pipelines, searchable knowledge stores, and measurable quality loops.
You'll work closely with AI engineers and MLOps to ensure our data systems support safe, explainable, and scalable AI behavior.
Responsibilities
- Build and operate batch + streaming pipelines that power LLM and agentic systems
- Own end-to-end RAG data pipelines: ingestion, cleaning, chunking, embedding, indexing, refresh strategies
- Integrate and manage vector storage (e.g., Pinecone/Weaviate/Milvus/FAISS) and embedding workflows
- Implement data quality checks, schema controls, and automated validation for AI-facing datasets
- Add observability and lineage: trace data from source → transformations → retrieval results → user outcomes
- Maintain prompt data management: prompt/version history, evaluation datasets, and reproducible runs
- Build feedback loops (thumbs up/down, human review signals, agent tool success/failure) to improve retrieval and responses
- Support agentic workflow data needs: tool I/O logs, memory stores, orchestration metadata
Qualifications
- 1-2 years of hands-on, real-world data engineering experience (production pipelines)
- Strong Python + SQL skills
- Experience with orchestration and pipelines (e.g., Airflow, Dagster) and/or streaming (Kafka / Pub/Sub / Beam)
- Practical knowledge of RAG pipelines, embeddings, vector search, and metadata filtering
- Understanding of data quality, observability, and lineage (what to measure, how to alert, how to debug)
- Basic understanding of agentic workflows: tools, memory, orchestration, failure modes
Benefits & Culture
- Highly competitive salary and equity package
- Hybrid work environment (2 days onsite per week) and vacation policy
- Comprehensive health benefits
- Professional development budget, conference attendance, and access to AI research resources
- AIBound is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
