#4137 Hybrid RAG with Knowledge Graph and Embedded Documents
Given a knowledge graph with embedded documents, what is "hybrid" RAG?
In the world of Retrieval-Augmented Generation (RAG), "Hybrid" RAG is essentially the "best of both worlds" approach. When you have a knowledge graph (KG) integrated with embedded documents, you aren't just looking for text that looks similar; you’re looking for how concepts are connected.
Think of it as the difference between searching a library by keyword (Vector) versus having a librarian who knows exactly which authors influenced each other and which chapters belong to which broader theories (Graph).
How It Works: The Two Pillars
Hybrid RAG combines two distinct retrieval methods to feed the Large Language Model (LLM):
1. Vector-Based Retrieval (Unstructured)
The "What": This uses mathematical representations (embeddings) to find documents that are semantically similar to your query.
The Strength: Great at handling fuzzy language, synonyms, and general themes. If you ask about "feline health," it knows to look for "cat medicine."
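The "semantic neighbors" idea can be illustrated with cosine similarity over embedding vectors. The 3-dimensional vectors below are hand-picked toy values for illustration only; a real system would use a learned embedding model (e.g., OpenAI or sentence-transformers), but the similarity math is the same:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d "embeddings" (hand-picked; real systems use learned models)
feline_health = np.array([0.9, 0.8, 0.1])
cat_medicine  = np.array([0.8, 0.9, 0.2])
stock_market  = np.array([0.1, 0.0, 0.9])

# Related concepts score high, unrelated ones score low
print(cosine_similarity(feline_health, cat_medicine))  # high: same concept
print(cosine_similarity(feline_health, stock_market))  # low: unrelated
```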
2. Graph-Based Retrieval (Structured)
The "How": This traverses the Knowledge Graph to find explicit relationships between entities (e.g., Drug A → TREATS → Disease B).
The Strength: Exceptional at multi-hop reasoning. It can answer: "Which researchers at University X published papers on the side effects of the drug we use for Disease B?"
Why Use It?
By embedding documents directly into or alongside a Knowledge Graph, you solve the "context fragmentation" problem:
| Feature | Vector RAG | Graph RAG | Hybrid RAG |
| --- | --- | --- | --- |
| Logic | Similarity | Relationships | Contextual logic |
| Accuracy | High for simple facts | High for complex paths | Superior overall |
| Noise | Can grab irrelevant text | Can be too rigid | Filters noise via structure |
The Workflow in Practice
The Query: A user asks a complex question.
Vector Search: The system finds the most relevant document chunks (the "blobs" of text).
Graph Traversal: The system identifies key entities in the query and finds their neighbors in the graph (the "facts").
Fusion: The "facts" from the graph provide a map for the "text" from the vectors.
Generation: The LLM receives a prompt enriched with both raw text and structured data, leading to a much more grounded and less "hallucinatory" answer.
Pro Tip: In this setup, document chunks are often treated as "Nodes" themselves, linked to the entities they mention. This allows the system to jump from a structured fact directly into the nuanced paragraph that explains it.
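The chunk-as-node idea can be sketched without any graph database. In this minimal sketch the graph is a plain Python dict, and hypothetical MENTIONED_IN edges let us hop from a structured fact into the paragraph that supports it (all node names and relations here are illustrative):

```python
# Minimal chunk-as-node sketch: entities and text chunks share one graph,
# so a structured fact can be followed into the prose that supports it.
graph = {
    # entity -> list of (relation, target) edges
    "Metformin": [("TREATS", "Type 2 Diabetes"), ("MENTIONED_IN", "chunk_1")],
    "Type 2 Diabetes": [("MENTIONED_IN", "chunk_1")],
}

chunks = {
    "chunk_1": "Metformin is a first-line medication for Type 2 Diabetes, "
               "typically taken orally with meals.",
}

def explain_fact(entity, relation):
    """Find a fact edge, then jump to the chunk(s) that mention the entity."""
    targets = [t for r, t in graph.get(entity, []) if r == relation]
    chunk_ids = [t for r, t in graph.get(entity, []) if r == "MENTIONED_IN"]
    evidence = [chunks[c] for c in chunk_ids]
    return targets, evidence

facts, evidence = explain_fact("Metformin", "TREATS")
print(facts)     # the structured fact
print(evidence)  # the nuanced paragraph backing it
```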
Would you like me to walk through a specific architecture example, such as using LangChain or LlamaIndex to implement this?
How is this workflow performed in Python?
Implementing a Hybrid Graph-Vector RAG in Python usually involves a "tri-force" of tools: a Graph Database (like Neo4j), a Vector Store (like Pinecone or Milvus), and an Orchestration Framework (like LlamaIndex or LangChain).
Here is how the workflow is typically structured in code.
1. The Setup: Connecting the Engines
Most modern Python implementations use LlamaIndex because it has a dedicated PropertyGraphIndex that handles the heavy lifting of linking text chunks to graph nodes.
```python
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

# Initialize your storage context (e.g., Neo4j holding both the graph and the vectors)
graph_store = Neo4jPropertyGraphStore(
    username="neo4j", password="password", url="bolt://localhost:7687"
)

index = PropertyGraphIndex.from_documents(
    documents,  # your loaded Document objects
    property_graph_store=graph_store,
    embed_model=OpenAIEmbedding(),
)
```
2. The Retrieval Strategy (The "Hybrid" Part)
The "magic" happens in the retriever. In Python, you define a sub-retriever for the Vector search and a sub-retriever for the Graph traversal, then combine their outputs.
A. Vector Retrieval
Finds the text chunks based on semantic similarity.
B. Graph Retrieval (Synonym/Relationship Expansion)
Uses Cypher (for Neo4j) to find related nodes:
```cypher
MATCH (e:Entity {name: "Diabetes"})-[:TREATS]->(m:Medication)
RETURN m.name
```
3. The Combined Workflow
In a production-ready Python script, the logic usually follows this flow:
Extract Entities: Use an LLM to pull entities from the user query.
Parallel Search:
```python
vector_results = vector_index.search(query_embedding)
graph_results = graph_store.get_rel_map(entities)
```
Reranking/Fusion: Use a library like RankGPT or a simple Reciprocal Rank Fusion (RRF) to merge the results.
Synthesis: Pass the combined context to the LLM.
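Reciprocal Rank Fusion itself is only a few lines of Python. Here is a self-contained sketch; the document IDs are hypothetical placeholders for whatever your two sub-retrievers return:

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Merge several ranked lists of document IDs using RRF.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant suggested in the original RRF paper.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical IDs from the two sub-retrievers
vector_results = ["doc_a", "doc_b", "doc_c"]
graph_results  = ["doc_b", "doc_d"]

print(reciprocal_rank_fusion([vector_results, graph_results]))
# doc_b ranks first: it appears in both lists
```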
Example logic:
```python
from llama_index.core.indices.property_graph import (
    VectorContextRetriever,  # similar text chunks
    LLMSynonymRetriever,     # graph traversal via keyword/synonym expansion
)

# Create a hybrid retriever from the two sub-retrievers
retriever = index.as_retriever(
    sub_retrievers=[
        VectorContextRetriever(index.property_graph_store, embed_model=OpenAIEmbedding()),
        LLMSynonymRetriever(index.property_graph_store, llm=OpenAI()),
    ]
)

# Querying the hybrid system
nodes = retriever.retrieve("What are the side effects of medications for Type 2 Diabetes?")

# Synthesis: let a query engine fuse the retrieved context into an answer
query_engine = index.as_query_engine()
response = query_engine.query("What are the side effects of medications for Type 2 Diabetes?")
```
4. Key Libraries to Use
| Component | Popular Python Choices |
| --- | --- |
| Orchestration | llama-index, langchain, haystack |
| Graph Database | neo4j (via neo4j driver), falkordb |
| Vector Search | faiss-cpu, qdrant-client, pinecone-client |
| Schema/Modeling | pydantic (essential for structuring graph nodes) |
Why Python makes this easier
Python’s ecosystem allows for Multi-Modal Graphing. You can use NetworkX for local graph analysis before pushing it to a heavy-duty database like Neo4j. This allows you to test your "Hybrid" logic locally without a complex cloud setup.
Would you like me to provide a complete code snippet showing how to build a simple local hybrid retriever using NetworkX and FAISS?
FYI, I think the relation arrow is backwards for
"MATCH (e:Entity {name: "Diabetes"})-[:TREATS]->(m:Medication)
RETURN m.name"
Good catch! You’re absolutely right. Unless we’ve discovered some sentient, medical-grade sugar, Diabetes definitely doesn't treat Medication.
In Cypher (and logic), the relationship should flow from the agent to the object. Here is the corrected query:
```cypher
MATCH (m:Medication)-[:TREATS]->(e:Entity {name: "Diabetes"})
RETURN m.name
```
Why this matters in Hybrid RAG
In a Python implementation (especially when using LLMs to auto-generate Cypher queries), "directionality" is one of the most common points of failure. If the LLM generates the arrow backwards:
Vector RAG would still find documents about Diabetes medications because the words match.
Graph RAG would return an empty set (or a "hallucinated" error) because no node in your database has an outgoing [:TREATS] relationship starting from a disease.
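This failure mode is easy to reproduce with a small directed graph. In the NetworkX sketch below, querying in the correct direction finds the medication, while the backwards query (the equivalent of the reversed Cypher arrow) silently returns nothing:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("Metformin", "Type 2 Diabetes", rel="TREATS")  # Medication -TREATS-> Disease

# Correct direction: incoming TREATS edges on the disease node
treatments = list(G.predecessors("Type 2 Diabetes"))
print(treatments)  # the medication is found

# Backwards query: outgoing edges FROM the disease, as the reversed arrow implied
backwards = list(G.successors("Type 2 Diabetes"))
print(backwards)   # empty list -- silently wrong, not an error
```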
Implementing a Local Hybrid Retriever (NetworkX + FAISS)
To see this in action without a full Neo4j setup, you can use NetworkX for the graph and FAISS for the vectors. This script demonstrates the "Hybrid" fusion:
```python
import networkx as nx
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# 1. The Model
model = SentenceTransformer('all-MiniLM-L6-v2')

# 2. The Vector Store (simulated with FAISS)
documents = [
    "Metformin is a first-line medication for Type 2 Diabetes.",
    "Insulin helps regulate blood sugar levels.",
    "Regular exercise improves cardiovascular health."
]
embeddings = model.encode(documents)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(np.array(embeddings))

# 3. The Knowledge Graph (NetworkX)
G = nx.DiGraph()
# Corrected directionality: Medication -> TREATS -> Condition
G.add_edge("Metformin", "Type 2 Diabetes", rel="TREATS")
G.add_edge("Insulin", "Type 2 Diabetes", rel="TREATS")

def hybrid_query(query_text):
    # --- Step A: Vector Search ---
    query_vec = model.encode([query_text])
    D, I = index.search(np.array(query_vec), k=1)
    vector_context = documents[I[0][0]]

    # --- Step B: Graph Search (finding neighbors) ---
    # Simple logic: if a graph entity appears in the query, find its 'TREATS' neighbors
    graph_context = ""
    if "Type 2 Diabetes" in query_text:
        # Incoming TREATS edges: what treats Type 2 Diabetes?
        predecessors = list(G.predecessors("Type 2 Diabetes"))
        graph_context = f"Graph identifies these treatments: {', '.join(predecessors)}"

    return f"Vector Result: {vector_context}\nGraph Result: {graph_context}"

print(hybrid_query("Tell me about Type 2 Diabetes treatments."))
```
What’s Next?
In a real-world scenario, the hardest part is Entity Linking (mapping the text "Diabetes" in a document to the specific "Diabetes" node in your graph).
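A bare-bones version of entity linking is an alias table that maps surface forms in the text to canonical graph node IDs. Production systems combine NER models with fuzzy or embedding-based matching; the aliases and node names below are purely illustrative:

```python
# Toy entity linker: map surface text to canonical graph node IDs.
# Real systems use NER plus fuzzy/embedding matching; these names are illustrative.
ALIASES = {
    "diabetes": "Type 2 Diabetes",
    "type 2 diabetes": "Type 2 Diabetes",
    "t2d": "Type 2 Diabetes",
    "metformin": "Metformin",
}

def link_entities(text):
    """Return canonical graph nodes for every alias found in the text."""
    lowered = text.lower()
    found = []
    # Longest aliases first, so "type 2 diabetes" wins over plain "diabetes"
    for alias in sorted(ALIASES, key=len, reverse=True):
        if alias in lowered and ALIASES[alias] not in found:
            found.append(ALIASES[alias])
    return found

print(link_entities("Does metformin help with T2D?"))
# both mentions resolve to canonical graph nodes
```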
Would you like to see how to use an LLM (via LangChain or LlamaIndex) to automatically extract these entities and query the graph correctly?
From <https://gemini.google.com/app/> Google Gemini 3.0 Pro (Fast)