Myth of Data Readiness: Making AI Work with Messy Data

Join me, Anushut, as I explore the myth of data readiness and the limits of traditional AI approaches, and show how PromQL's agentic semantic layer can help you deploy reliable AI systems that learn your business language.

1. Anushut leads the applied research team at PromQL, a sponsor of the reliability track at the AI Engineer World's Fair.
2. The talk is about how data readiness for AI systems is a myth and how to make reliable AI systems work with messy data.
3. Data preparation takes up a lot of time, but achieving perfect data is an unrealistic goal.
4. Different systems have different ways of representing the same information (e.g., revenue in cents vs. dollars).
5. In 2019, many companies tried to standardize their data by moving everything to Snowflake or Databricks, but that did not fix all their problems.
6. By 2023, the rise of AI and agents had led to semantic layers that understand specific data domains, but these too require regular updates as businesses change their tables, schemas, and wor
7. In 2025, AI still requires perfect data to function properly, which it does not get, leading to losses for companies (e.g., Fortune 500 companies losing $250 million due to poor data quality).
8. Semantic layers and knowledge graphs can help, but they cannot capture every edge case or predefine all necessary information.
9. AI needs to understand a company's tribal knowledge and business language to work effectively with the company's data.
10. Traditionally, business users would consult analysts or engineers who knew the business and its data; these experts would apply their tribal knowledge to answer questions accurately.
11. The goal is to create an AI system that behaves like a day-zero smart analyst who learns from experience and context.
12. In PromQL, a foundation large language model (LLM) creates PromQL plans, written in a domain-specific language for data retrieval, computation, aggregation, and semantics.
13. The PromQL design decouples the LLM from actual execution to avoid errors; plans run against the actual data under the hood, so the system can detect issues in that data.
14. PromQL allows users to be in charge and correct the AI, nudging it towards better performance over time.
15. The learning process involves creating a company-specific business language (e.g., AcmeQL) that improves the semantic graph and allows for accurate answers to complex questions.
16. An agentic semantic layer can replace months of up-front work with an immediate start, allowing the AI to learn from interactions and improve its performance over time.
17. PromQL has helped Fortune 500 companies and high-growth fintech companies achieve reliable AI performance.
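
Point 4 (the same figure stored in different units) is easy to see in a few lines of Python. This is only a minimal sketch: the system names `billing` and `crm` and the `UNITS_PER_DOLLAR` table are hypothetical and were not shown in the talk.

```python
from decimal import Decimal

# Hypothetical mapping: how many stored units make one dollar in each system.
UNITS_PER_DOLLAR = {
    "billing": Decimal(100),  # assumed to store revenue as integer cents
    "crm": Decimal(1),        # assumed to store revenue as dollars
}


def revenue_in_dollars(system: str, raw_value) -> Decimal:
    """Convert a raw revenue value from a named system into dollars."""
    return Decimal(str(raw_value)) / UNITS_PER_DOLLAR[system]


# The same sale, as two systems might record it.
records = [("billing", 1250000), ("crm", "12500.00")]
normalized = [revenue_in_dollars(system, value) for system, value in records]
```

Without the conversion table, summing the two raw values would silently mix cents and dollars. That kind of rule usually lives in an analyst's head, which is the tribal knowledge the talk refers to.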
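
Points 12 to 15 describe a planner that is kept separate from execution and that picks up business vocabulary from user corrections. The toy Python sketch below illustrates that general pattern only. It is not PromQL's actual DSL or API: `Step`, `plan_for`, `execute`, `learn`, and the `orders` table are all hypothetical, and `plan_for` is a hard-coded stand-in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One instruction in a plan: the planner emits data, not results."""
    op: str  # "fetch", "filter", or "sum"
    args: dict = field(default_factory=dict)


# Toy table standing in for a real data source (hypothetical).
TABLES = {
    "orders": [
        {"region": "EU", "amount_cents": 12000, "status": "paid"},
        {"region": "EU", "amount_cents": 3000, "status": "refunded"},
        {"region": "US", "amount_cents": 5000, "status": "paid"},
    ]
}

# Learned business vocabulary: term -> extra plan steps.
glossary: dict[str, list[Step]] = {}


def learn(term: str, steps: list[Step]) -> None:
    """Record a user correction so later plans for this term include it."""
    glossary[term] = steps


def plan_for(terms: list[str]) -> list[Step]:
    """Stand-in for the LLM planner: builds a plan without touching data."""
    plan = [Step("fetch", {"table": "orders"})]
    for term in terms:
        plan.extend(glossary.get(term, []))
    plan.append(Step("sum", {"column": "amount_cents"}))
    return plan


def execute(plan: list[Step]) -> int:
    """Deterministic executor: every number comes from the data itself."""
    rows, result = [], 0
    for step in plan:
        if step.op == "fetch":
            rows = list(TABLES[step.args["table"]])
        elif step.op == "filter":
            rows = [r for r in rows if r[step.args["column"]] == step.args["equals"]]
        elif step.op == "sum":
            result = sum(r[step.args["column"]] for r in rows)
        else:
            raise ValueError(f"unknown op: {step.op}")
    return result


before = execute(plan_for(["revenue"]))  # sums every order, refunds included
# The user corrects the system: "revenue" should count paid orders only.
learn("revenue", [Step("filter", {"column": "status", "equals": "paid"})])
after = execute(plan_for(["revenue"]))  # the learned filter is now applied
```

Because the planner only emits steps, a wrong answer can be traced to a specific step and corrected once, and the correction persists in the glossary for later questions.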

Source: AI Engineer via YouTube
