Data Readiness: A Pipe Dream in Real-World AI Applications
As an applied research leader at PromQL, I'm here to argue that data readiness is a myth, and to introduce an approach to building AI systems that learn your business language instead of waiting for perfect data.
- 1. Anushut leads the applied research team at PromQL, a sponsor of the reliability track at the AI Engineer World's Fair.
- 2. The talk is about how data readiness for AI systems in production environments is a myth.
- 3. Perfect, clean, annotated, and well-named data is rare, and even where it exists, naming conventions and structures differ across systems.
- 4. In 2019, the solution was thought to be standardization and moving everything into platforms like Snowflake or Databricks, but according to the talk this has been only about 40% successful.
- 5. With the rise of AI and agents in 2023, semantic layers that understand specific data domains were proposed, but these struggle because data domains change every quarter.
- 6. In 2025, the expectation remains that AI needs perfect data to work, and that perfect data still does not exist.
- 7. Poor data quality costs Fortune 500 companies an average of $250 million annually.
- 8. Semantic layers and knowledge graphs have been tried but cannot capture all the intricacies and exceptions in data.
- 9. AI does not understand business language, jargon, or context-specific definitions out of the box; for example, "GM" could mean gross margin or general manager, depending on the domain.
- 10. Analysts and engineers with tribal knowledge have traditionally bridged this gap by understanding both the business and data systems to answer questions accurately.
- 11. The goal is to create an AI system that behaves like a day-zero smart analyst who learns from their mistakes, improves over time, and understands the business and domain context.
- 12. PromQL has been working on such a solution by decoupling language models (LMs) from answer generation: instead of producing the answer directly, the LM writes a plan in PromQL's language, and a deterministic runtime executes that plan.
- 13. This approach reduces hallucinations and allows for more accurate, distributed query execution across various data sources.
- 14. An example is provided where an AI system using PromQL can handle complex tasks like summarizing support tickets, extracting sentiment, and issuing credits based on the analysis.
- 15. The learning process in PromQL involves a prompt learning layer that improves the semantic graph and creates a company's business language (e.g., AcmeQL, GoogleQL).
- 16. Over time, the AI system becomes better at understanding context, naming conventions, relationships, and calculations across various systems and data sources.
- 17. PromQL aims to create an agentic semantic layer that replaces months of up-front modeling work with an immediate start, allowing AI to learn from interactions and improve itself through version-controlled semantic layers.
- 18. Customers have reported success with PromQL, including a Fortune 500 food chain company and a high-growth fintech company.
- 19. According to the talk, PromQL's agentic semantic layer helps AI reach 100% accuracy on complex tasks, which is the basis for its claim of reliable AI for businesses.
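
The plan-then-execute idea in points 12 to 14 can be illustrated with a minimal Python sketch. This is not PromQL's actual language, runtime, or API; every name here (`Step`, `OPS`, `run_plan`, `classify_sentiment`, the ticket data) is hypothetical and chosen only for illustration. The sketch assumes the model's output is a list of steps rather than an answer, and that a small deterministic interpreter runs those steps against the data. The sentiment step stands in for a narrow model call and is stubbed with a keyword rule so the example runs offline.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Step:
    """One instruction in a plan: run `op` with `args`, store the result in `out`."""
    op: str
    args: dict[str, Any]
    out: str


# Hypothetical in-memory data source standing in for a real ticketing system.
SOURCES = {
    "tickets": [
        {"id": 1, "customer": "acme", "text": "Checkout is broken again"},
        {"id": 2, "customer": "globex", "text": "Thanks for the quick fix"},
        {"id": 3, "customer": "acme", "text": "Need a refund after the outage"},
    ]
}


def classify_sentiment(text: str) -> str:
    """Stub for a narrow model call; a keyword rule keeps the sketch deterministic."""
    negative_words = {"broken", "refund", "outage"}
    return "negative" if set(text.lower().split()) & negative_words else "positive"


def op_fetch(env: dict, source: str) -> list:
    return list(SOURCES[source])


def op_classify(env: dict, rows: str) -> list:
    return [{**r, "sentiment": classify_sentiment(r["text"])} for r in env[rows]]


def op_filter(env: dict, rows: str, field: str, equals: Any) -> list:
    return [r for r in env[rows] if r[field] == equals]


def op_issue_credits(env: dict, rows: str, amount: int) -> dict:
    # Accumulate one credit per matching ticket, grouped by customer.
    credits: dict[str, int] = {}
    for r in env[rows]:
        credits[r["customer"]] = credits.get(r["customer"], 0) + amount
    return credits


# The runtime only knows these operations; a plan cannot do anything else.
OPS: dict[str, Callable[..., Any]] = {
    "fetch": op_fetch,
    "classify": op_classify,
    "filter": op_filter,
    "issue_credits": op_issue_credits,
}


def run_plan(plan: list[Step]) -> dict:
    """Execute steps in order; reject any operation that is not registered."""
    env: dict[str, Any] = {}
    for step in plan:
        if step.op not in OPS:
            raise ValueError(f"unknown operation: {step.op}")
        env[step.out] = OPS[step.op](env, **step.args)
    return env


# In this sketch the plan is written by hand; the assumption is that a model
# would emit an equivalent structure instead of writing the final answer.
PLAN = [
    Step("fetch", {"source": "tickets"}, "tickets"),
    Step("classify", {"rows": "tickets"}, "scored"),
    Step("filter", {"rows": "scored", "field": "sentiment", "equals": "negative"}, "unhappy"),
    Step("issue_credits", {"rows": "unhappy", "amount": 10}, "credits"),
]

result = run_plan(PLAN)
print(result["credits"])  # {'acme': 20}
```

Because the plan is plain data, it can be inspected, logged, or rerun, and the arithmetic for credits is done by code rather than generated as text. That is one way to read the claim in point 13 that separating planning from execution reduces hallucinations.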
Source: AI Engineer via YouTube