Exploring the Power of Knowledge Graphs and Large Language Models: Harnessing Ontologies for AI Applications with Graph RAG

Jesús Barrasa, Field CTO for AI at Neo4j, explores the combination of knowledge graphs and large language models for building AI applications, with a focus on ontologies and their practical use in Graph RAG.

* The speaker is Jesús Barrasa, Field CTO for AI at Neo4j.
* He discusses combining knowledge graphs and large language models to build AI applications, focusing on ontologies and their use in graph-based retrieval-augmented generation (Graph RAG).
* Graph RAG is a retrieval-augmented generation system where a knowledge graph is used instead of a vector database for retrieving potentially relevant information.
* The retrieved information is passed to a large language model (LLM) within the context window, resulting in more grounded responses based on trusted information rather than purely generated content.
* A knowledge graph offers a richer set of retrieval strategies than vector semantic search alone, such as contextualizing search results with connected data and generating structured queries.
* The graph implements the property graph model with nodes and relationships as its main primitives. Nodes represent entities (persons, objects, locations, etc.), while relationships are directed and
* When building a graph from unstructured data, it is often enriched with a lexical graph that describes the source documents and includes chunks or sequences of text from the document.
* Chunks in the lexical graph are connected to the domain graph by entity extraction, keeping track of where entities are mentioned for later retrieval purposes.
* Node properties can be numeric, string, or other data types, including vector embeddings that can be indexed and searched with a vector index.
* To build a Graph RAG, you must first create a graph by following different pipelines based on the source of your data (structured or unstructured).
* Structured data can come from tabular representations in databases or files, with a target schema defining the graph to be built. The process involves mapping records to node types and relationships
* Unstructured data processing with LangChain or LlamaIndex follows similar steps: document splitting, chunk embedding, and entity extraction. However, you need to inject your schema (type of entities and
* Ontologies are shared descriptions of a domain. They can serve as formal, implementation-agnostic schemas, which supports model-driven knowledge graph creation and dynamic retrieval strategies.
* An ontology defines the classes, properties, and relationships of a specific domain and can be serialized in various formats, such as XML-based or JSON-based syntaxes (for example RDF/XML or JSON-LD).
* Ontologies are useful for driving the creation of both structured and unstructured data graphs and provide a general and agnostic approach to representing schemas.
* An ontology-driven graph enhances retrieval strategies: the results of a vector similarity search can be contextualized with the data connected to them in the graph.
* Contextualization starts from the chunk nodes returned by the vector index, then navigates, enriches, filters, and aggregates connected data to add richness to the context passed to the LLM.
* The provided example demonstrates how a movie database can be searched using an ontology-driven retriever that dynamically determines relationships and navigation paths based on the schema defined i
* By storing the ontology in the graph, it is possible to create dynamic queries that drive the behavior of retrieval systems, allowing for flexible and adaptable AI applications.
* The two main takeaways are: (1) using ontologies as an implementation-agnostic data model for knowledge graph creation, and (2) storing the ontology in the graph to enable dynamic behavior in retrievers
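
The first takeaway, ontology-driven graph creation from structured data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ONTOLOGY` dictionary, the `Graph` class, the `load_rows` helper, and the column names are hypothetical, and plain in-memory Python structures stand in for a graph database. It is not the speaker's implementation.

```python
from dataclasses import dataclass, field

# Hypothetical, minimal ontology: classes with a key property and allowed
# properties, plus the relationships permitted between classes.
ONTOLOGY = {
    "classes": {
        "Movie": {"key": "title", "properties": ["title", "released"]},
        "Person": {"key": "name", "properties": ["name"]},
    },
    "relationships": [
        {"type": "ACTED_IN", "from": "Person", "to": "Movie"},
    ],
}


@dataclass
class Graph:
    """Toy property graph whose writes are validated against an ontology."""
    ontology: dict
    nodes: dict = field(default_factory=dict)  # (label, key value) -> properties
    rels: set = field(default_factory=set)     # (source id, type, target id)

    def merge_node(self, label, props):
        cls = self.ontology["classes"][label]  # KeyError for unknown classes
        node_id = (label, props[cls["key"]])
        # Keep only the properties the ontology declares for this class.
        allowed = {k: v for k, v in props.items() if k in cls["properties"]}
        self.nodes.setdefault(node_id, {}).update(allowed)
        return node_id

    def merge_rel(self, src, rel_type, dst):
        permitted = any(
            r["type"] == rel_type and r["from"] == src[0] and r["to"] == dst[0]
            for r in self.ontology["relationships"]
        )
        if not permitted:
            raise ValueError(f"{src[0]}-[{rel_type}]->{dst[0]} is not in the ontology")
        self.rels.add((src, rel_type, dst))


def load_rows(graph, rows):
    """Map tabular records to nodes and relationships (hypothetical column names)."""
    for row in rows:
        person = graph.merge_node("Person", {"name": row["actor"]})
        movie = graph.merge_node(
            "Movie", {"title": row["movie"], "released": row["year"]}
        )
        graph.merge_rel(person, "ACTED_IN", movie)


if __name__ == "__main__":
    g = Graph(ontology=ONTOLOGY)
    load_rows(g, [
        {"actor": "Keanu Reeves", "movie": "The Matrix", "year": 1999},
        {"actor": "Carrie-Anne Moss", "movie": "The Matrix", "year": 1999},
    ])
    print(len(g.nodes), "nodes,", len(g.rels), "relationships")
```

Because the loader consults the ontology rather than hard-coding the schema checks, changing the ontology changes what the graph accepts without touching `Graph` itself.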
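
The second takeaway, a retriever whose behavior is driven by an ontology stored with the data, might look roughly like the sketch below. It assumes toy two-dimensional embeddings, a hand-written cosine similarity, and hypothetical names (`CHUNKS`, `NODES`, `RELS`, `ONTOLOGY_RELS`, `retrieve`). A real system would use a graph database with a vector index; this only shows the shape of the idea: vector search finds chunks, then the ontology decides which relationships to follow from the entities those chunks mention.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Lexical graph: text chunks with toy embeddings and the entities they mention.
CHUNKS = {
    "c1": {"text": "Keanu Reeves stars in The Matrix.",
           "embedding": [1.0, 0.0], "mentions": ["Movie:The Matrix"]},
    "c2": {"text": "A recipe for tomato soup.",
           "embedding": [0.0, 1.0], "mentions": []},
}

# Domain graph: nodes keyed by id, relationships as (source, type, target).
NODES = {
    "Movie:The Matrix": {"label": "Movie", "title": "The Matrix"},
    "Person:Keanu Reeves": {"label": "Person", "name": "Keanu Reeves"},
}
RELS = [("Person:Keanu Reeves", "ACTED_IN", "Movie:The Matrix")]

# Ontology kept as data next to the graph (hypothetical structure).
ONTOLOGY_RELS = [{"type": "ACTED_IN", "from": "Person", "to": "Movie"}]


def retrieve(query_embedding, k=1):
    """Return the top-k chunks, each enriched with facts chosen via the ontology."""
    ranked = sorted(
        CHUNKS.values(),
        key=lambda c: cosine(query_embedding, c["embedding"]),
        reverse=True,
    )[:k]
    context = []
    for chunk in ranked:
        facts = []
        for entity_id in chunk["mentions"]:
            label = NODES[entity_id]["label"]
            # Ask the ontology which relationship types touch this class.
            for rel in ONTOLOGY_RELS:
                if label not in (rel["from"], rel["to"]):
                    continue
                for src, rel_type, dst in RELS:
                    if rel_type == rel["type"] and entity_id in (src, dst):
                        facts.append(f"{src} -{rel_type}-> {dst}")
        context.append({"text": chunk["text"], "facts": facts})
    return context


if __name__ == "__main__":
    for item in retrieve([0.9, 0.1], k=1):
        print(item["text"], item["facts"])
```

The returned text and facts would then be placed in the LLM's context window. Adding a relationship to `ONTOLOGY_RELS` changes what the retriever navigates without any change to `retrieve`, which is the dynamic behavior the talk attributes to storing the ontology in the graph.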

Source: AI Engineer via YouTube
