#4232 Knowledge Graph Documents and Doc_Chunks
Tell me about Neo4j AuraDB
Neo4j AuraDB is a fully managed, cloud-native graph database-as-a-service (DBaaS).
As of early 2026, it is a primary choice for developers building Knowledge Graphs and GraphRAG (Retrieval-Augmented Generation) systems because it handles complex relationships much faster than traditional relational databases.
## Core Features
Zero Administration: Fully automated updates, patches, and management.
Multi-Cloud: Available on AWS, Google Cloud (GCP), and Microsoft Azure.
Cypher Query Language: Uses Neo4j’s intuitive "ASCII-art" style query language.
Native Graph Engine: Stores data in a graph format (nodes and relationships) rather than tables, eliminating the need for expensive JOIN operations.
ACID Compliant: Ensures data integrity for mission-critical transactional workloads.
Neo4j Aura Agent: A recently released (Feb 2026) tool that helps automate the creation of AI agents grounded in your knowledge graph.
## Pricing & Tiers
AuraDB uses a consumption-based model, primarily billed based on the memory (RAM) allocated to your cluster.
| Tier | Price (est.) | Best For | Key Limits |
| --- | --- | --- | --- |
| Free | $0 | Learning & Prototyping | 200k nodes, 400k relationships |
| Professional | ~$65/GB/mo | Production Apps | Up to 128GB RAM, daily backups |
| Business Critical | ~$146/GB/mo | Enterprise Scale | 99.95% SLA, multi-zone, 512GB RAM |
| Virtual Dedicated Cloud | Custom | High Isolation | Dedicated infrastructure, VPC peering |
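Since billing is driven by allocated RAM, a back-of-the-envelope monthly estimate is just a multiplication; a minimal sketch using the estimated rates above (actual pricing varies by region and may change):

```python
def estimate_monthly_cost(ram_gb: float, rate_per_gb: float) -> float:
    """Rough AuraDB monthly estimate: consumption billing is per GB of allocated RAM."""
    return ram_gb * rate_per_gb

# An 8 GB Professional instance at the ~$65/GB/mo estimate above:
print(estimate_monthly_cost(8, 65))   # 520
# A 16 GB Business Critical instance at ~$146/GB/mo:
print(estimate_monthly_cost(16, 146))  # 2336
```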
## AuraDB vs. Self-Hosted
AuraDB: Best if you want to get started instantly.
It includes features like the Query Log Analyzer and Neo4j Bloom (visualization) out of the box.
Self-Hosted: Best if you have strict data residency requirements or want to use the Community Edition (which is free for local use but lacks high availability and granular security).
## Why Use It for AI?
AuraDB is increasingly used to improve LLM accuracy. By storing data in a graph, you can perform multi-hop reasoning (connecting A to B to C), which helps AI models understand context better than flat vector databases alone.
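Multi-hop reasoning can be illustrated without a database at all: a vector search finds nodes similar to a query, but only a graph can follow the A → B → C chain. A toy sketch in plain Python (the company names and edges are purely illustrative):

```python
# Toy supply-chain graph. In practice these edges would be relationships in Neo4j.
edges = {
    "Tesla": ["Panasonic"],     # hypothetical: Tesla -> battery supplier
    "Panasonic": ["Sumitomo"],  # hypothetical: supplier -> materials vendor
}

def two_hop(start: str) -> list[str]:
    """Entities reachable in exactly two hops from `start` (A -> B -> C)."""
    return [c for b in edges.get(start, []) for c in edges.get(b, [])]

print(two_hop("Tesla"))  # ['Sumitomo']
```

A flat vector store has no way to answer "who supplies Tesla's suppliers?"; the graph traversal makes it a one-liner.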
Would you like me to show you a basic Cypher query to see how it differs from SQL?
If I have documents associated with entities in a knowledge graph, should they be embedded as an attribute of the entity, or embedded in a separate document entity that has relationship with the original entity?
In the world of GraphRAG and Knowledge Graphs, the "where" and "how" of your embeddings can make or break your retrieval performance. While both approaches are technically possible, the industry standard for 2026 is a hybrid approach that favors separating documents into their own nodes.
Here is the breakdown of why one usually outperforms the other.
## Option 1: Separate Document Entities (Recommended)
In this model, your "Business" or "Person" entity is connected to one or many "Document" or "Chunk" nodes via a relationship like HAS_SOURCE or DESCRIBED_BY.
Granularity: You can split a long document into multiple chunks, each with its own embedding.
Attaching a single embedding to a main entity often "dilutes" the vector, making it less accurate for specific queries.
Many-to-Many Relationships: A single document (like a news article) might mention five different entities. By keeping the document separate, you avoid duplicating the heavy vector data across five different nodes.
Metadata Richness: You can store document-specific metadata (upload date, URI, author) on the Document node without cluttering your core business entity.
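The chunking step itself is simple; a minimal sketch (window size and overlap are arbitrary choices you would tune for your embedding model):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows; each window gets its own embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 500-character document with 200-char windows and 50-char overlap -> 3 chunks
print(len(chunk_text("a" * 500)))  # 3
```

Production pipelines usually split on sentence or token boundaries instead of raw characters, but the shape of the output (many small, overlapping snippets) is the same.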
## Option 2: Attribute Embedding (Simple/Small Scale)
In this model, the embedding lives directly on the entity (e.g., (p:Person {embedding: [...]})).
When to use: Only if the text is very short and stable (e.g., a 2-sentence bio or a product description) and there is a strictly 1:1 relationship between the text and the entity.
The Downside: If you ever update the document, you have to re-index the entire entity. It also makes graph traversals "heavier" if your database has to skip over large vector properties while performing simple relationship hops.
## Comparison at a Glance
| Feature | Separate Document Node | Attribute on Entity |
| --- | --- | --- |
| Search Accuracy | High (chunking allows for precise hits) | Lower (vectors get "noisy" with long text) |
| Scalability | High (handles multiple sources per entity) | Low (hard to manage multiple docs) |
| Graph Traversal | Faster (core entities stay "lean") | Slower (nodes carry heavy data payloads) |
| Complexity | Slightly more Cypher logic | Very simple |
## The "Best Practice" Schema
If you are using Neo4j AuraDB, the most efficient schema for an AI-ready graph looks like this:
Entity: Contains the structured data (Name, ID, Type).
Chunk: Contains the Embedding and the text snippet.
Document: Contains the source metadata (URL, Filename).
This allows you to perform Vector Search to find the right chunk, and then immediately "hop" to the associated entity to get the structured context that an LLM needs to be accurate.
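Before any Cypher is involved, the three-node schema can be sketched as plain data structures. The field names below are assumptions that mirror the description above, not a fixed API:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    entity_type: str  # structured data (Name, ID, Type) lives here

@dataclass
class Chunk:
    chunk_id: str
    text: str
    embedding: list[float]  # the vector lives on the chunk, not the entity
    mentions: list[str] = field(default_factory=list)  # names of linked Entity nodes

@dataclass
class Document:
    doc_id: str
    url: str
    chunks: list[Chunk] = field(default_factory=list)  # Chunk -[:PART_OF]-> Document

doc = Document("doc_001", "https://example.com/report.pdf",
               [Chunk("c1", "Tesla expanded in Texas.", [0.1, 0.2], ["Tesla"])])
print(doc.chunks[0].mentions)  # ['Tesla']
```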
Would you like me to generate a Cypher script to set up a vector index for this specific "Separate Node" structure?
Let's go with the document plus doc_chunk method. How would I add that document with its doc_chunks to the knowledge graph?
To implement the Document → Chunk → Entity model in Neo4j, you’ll want to follow a "Parent-Child" pattern. This ensures your graph stays organized and your vector searches stay fast.
Here is the standard workflow using Cypher (Neo4j’s query language).
## 1. The Schema Strategy
We will create three distinct node types to keep the data clean:
:Document: The source file metadata (e.g., filename, url).
:Chunk: The actual text segments with their embeddings.
:Entity: The real-world thing the text is about (e.g., Person, Company).
## 2. The Cypher Implementation
You can run this as a single transaction or split it. This example assumes you have already generated your embeddings (vectors) in your application code (Python, Node.js, etc.).
```cypher
// 1. Create the Parent Document
MERGE (d:Document {id: "doc_001"})
SET d.name = "Annual_Report_2025.pdf",
    d.url = "https://s3.bucket/report.pdf"

// 2. Create the Chunks and link them to the Document
WITH d
UNWIND $chunk_data AS map
CREATE (c:Chunk {id: map.chunk_id})
SET c.text = map.text,
    c.embedding = map.embedding // This is the vector array
MERGE (c)-[:PART_OF]->(d)

// 3. Link Chunks to existing Entities mentioned in the text
WITH c, map
MATCH (e:Entity {name: map.mentioned_entity})
MERGE (e)-[:MENTIONED_IN]->(c)
```
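The `$chunk_data` parameter is supplied from application code as a list of maps. A minimal payload sketch in Python (the keys must match what the Cypher reads: `chunk_id`, `text`, `embedding`, `mentioned_entity`; the values here are made-up examples):

```python
chunk_data = [
    {
        "chunk_id": "doc_001_c0",
        "text": "Revenue grew 12% year over year.",
        "embedding": [0.12, -0.03, 0.44],  # truncated; real vectors are e.g. 1536-dim
        "mentioned_entity": "Acme Corp",   # hypothetical entity name
    },
]

# Passed as a query parameter, e.g. session.run(query, chunk_data=chunk_data)
print(sorted(chunk_data[0]))  # ['chunk_id', 'embedding', 'mentioned_entity', 'text']
```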
## 3. Setting Up the Vector Index
Once the data is in, Neo4j needs a Vector Index to make searching efficient. You only need to run this command once per database.
Assuming your embedding dimension is 1536 (standard for OpenAI text-embedding-3-small or similar models):
```cypher
CREATE VECTOR INDEX chunk_index IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {indexConfig: {
  `vector.dimensions`: 1536,
  `vector.similarity_function`: 'cosine'
}}
```
## 4. How to Query It (The "GraphRAG" Magic)
This is where the separate node structure shines. When a user asks a question, you search for the Chunk, then hop to the Entity to get the structured context for your LLM.
```cypher
// Find the most relevant chunk, then get the related Entity context
CALL db.index.vector.queryNodes('chunk_index', 5, $query_vector)
YIELD node AS chunk, score
MATCH (entity:Entity)-[:MENTIONED_IN]->(chunk)
RETURN chunk.text, entity.name, score
```
## Why this is the "Gold Standard"
Performance: You aren't searching through the entire Entity node (which might have hundreds of properties); you are only hitting the lean Chunk nodes.
Context: Because the Chunk is linked to an Entity, you can tell your LLM: "Here is a text snippet, and by the way, it's talking about the entity 'Tesla Inc', which has a 'Ticker' property of 'TSLA'."
Would you like me to provide a Python snippet using the neo4j driver to automate this upload process?
Yes
In 2026, the best way to handle this is using the Neo4j Python Driver (v6+) which includes native support for the Vector data type. This is more efficient and provides better data integrity than the older method of storing vectors as simple lists.
The following script implements the Document → Chunk → Entity model.
## Python Implementation
```python
from neo4j import GraphDatabase, Vector

# 1. Connection Details
URI = "neo4j+s://your-aura-id.databases.neo4j.io"
AUTH = ("neo4j", "your-password")

def upload_document_to_graph(driver, doc_id, filename, chunks):
    """
    chunks should be a list of dicts:
    [{'id': 'c1', 'text': '...', 'embedding': [0.1, 0.2...], 'entity': 'Tesla'}]
    """
    # Query uses UNWIND for high-performance batching
    query = """
    MERGE (d:Document {id: $doc_id})
    SET d.name = $filename
    WITH d
    UNWIND $chunk_data AS chunk
    CREATE (c:Chunk {id: chunk.id})
    SET c.text = chunk.text,
        c.embedding = chunk.vector // Using the new Vector type
    MERGE (c)-[:PART_OF]->(d)
    WITH c, chunk
    MATCH (e:Entity {name: chunk.entity})
    MERGE (e)-[:MENTIONED_IN]->(c)
    """
    # Convert raw lists to Neo4j native Vectors (Float32 is standard)
    processed_chunks = []
    for ch in chunks:
        processed_chunks.append({
            "id": ch['id'],
            "text": ch['text'],
            "vector": Vector.from_native(ch['embedding'], "f32"),
            "entity": ch['entity']
        })
    # Execute in a single transaction
    driver.execute_query(
        query,
        doc_id=doc_id,
        filename=filename,
        chunk_data=processed_chunks,
        database_="neo4j"
    )

# 2. Main Execution
with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.verify_connectivity()
    # Example Data
    sample_chunks = [
        {
            "id": "chunk_001",
            "text": "Tesla is an electric vehicle company.",
            "embedding": [0.12, 0.05, -0.22],  # Simplified 3D vector
            "entity": "Tesla"
        }
    ]
    upload_document_to_graph(driver, "doc_123", "tesla_report.pdf", sample_chunks)
    print("Document and chunks uploaded successfully.")
```
## Key Advantages of this Script
Native Vector Type: By using Vector.from_native(..., "f32"), the driver validates the dimensions and data type locally before sending it to AuraDB, preventing expensive database errors.
Batching with UNWIND: Instead of running a loop of queries, we send the entire list of chunks in one network request. This is significantly faster for large documents.
Explicit Routing: The execute_query method automatically handles connection pooling and ensures the write goes to the leader of your Aura cluster.
## Quick Performance Tip
If you are uploading thousands of chunks, remember to create your Vector Index (via Cypher) before you start the upload. This allows the database to index the vectors in real-time as they arrive.
Would you like me to show you how to set up the "Aura Agent" (released last month) to chat with this data directly?
Based on the Cypher query return of chunk.text and entity.name, the entity.name(s) could be used in a follow-up Cypher query, and the chunk.text could be used in an LLM prompt
Exactly. You’ve hit on the core "GraphRAG" loop that makes this setup so much more powerful than a simple vector database.
By retrieving both the vector context (the chunk.text) and the graph context (the entity.name), you can perform what's called a Knowledge Retrieval Step before the LLM even sees the data.
## The "Two-Step" Retrieval Flow
Step 1: The Vector Discovery
You find the relevant text chunks using vector similarity.
```cypher
CALL db.index.vector.queryNodes('chunk_index', 3, $query_vector)
YIELD node AS chunk, score
MATCH (entity:Entity)-[:MENTIONED_IN]->(chunk)
RETURN chunk.text, entity.name
```
Step 2: The Graph Enrichment (The "Follow-up")
Once you have those entity.name values, you can instantly pull their structured data (properties, relationships, history) which the vector search might have missed.
```cypher
// Query to get the "Full Picture" of the discovered entities
MATCH (e:Entity)
WHERE e.name IN $discovered_names
OPTIONAL MATCH (e)-[r:PARTNER_WITH|COMPETES_WITH]->(other:Entity)
RETURN properties(e) AS entity_props, type(r) AS relationship, other.name AS related_entity
```
## Why this matters for your LLM Prompt
Instead of just giving the LLM a random text snippet, you can now structure your prompt like this:
```
Context from Documents:
> "[chunk.text contents here]"

Verified Knowledge Graph Facts:
- Entity: [entity.name]
- Industry: [entity.industry]
- Key Partners: [List of names from Step 2]

User Question: "How does this document affect our partnership strategy?"
```
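Assembling that prompt from the two query results is plain string formatting; a minimal sketch (the field names passed in are assumptions, matching the placeholders above):

```python
def build_prompt(chunk_texts: list[str], entity_facts: dict, question: str) -> str:
    """Combine document snippets (vector context) and graph facts into one LLM prompt."""
    docs = "\n".join(f'> "{t}"' for t in chunk_texts)
    facts = "\n".join(f"- {k}: {v}" for k, v in entity_facts.items())
    return (f"Context from Documents:\n{docs}\n\n"
            f"Verified Knowledge Graph Facts:\n{facts}\n\n"
            f"User Question: {question}")

p = build_prompt(["Tesla's expansion into the Texas market has accelerated."],
                 {"Entity": "Tesla Inc", "Industry": "Automotive"},
                 "How does this document affect our partnership strategy?")
print(p)
```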
## Benefits of this Pattern
Eliminating Hallucinations: If the document mentions "Project X," the graph can tell the LLM that "Project X" was actually cancelled in 2025—something the document itself might not say.
Complex Reasoning: You can answer questions like "Show me documents about companies that are partners with Tesla"—a query that is impossible for a standard vector DB but trivial for Neo4j.
## Next Step
Would you like me to write a Python function that combines these two steps—first doing the vector search, then automatically pulling the related entity properties into a single "Context Block" for your LLM?
Yes
In 2026, the industry-standard way to do this is using the neo4j-graphrag Python package. It abstracts the "Vector Search + Graph Hop" into a single call.
The logic below uses a VectorCypherRetriever. This performs a vector search on your Chunk nodes first, then executes a Cypher query to pull the properties of the connected Entity nodes.
## Unified Retrieval Script
```python
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import VectorCypherRetriever
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

# 1. Setup Connection
URI = "neo4j+s://your-aura-id.databases.neo4j.io"
AUTH = ("neo4j", "your-password")
driver = GraphDatabase.driver(URI, auth=AUTH)

# 2. Initialize the Embedder (to turn user questions into vectors)
embedder = OpenAIEmbeddings(model="text-embedding-3-small")

# 3. Define the "Graph Hop" Query
# `node` refers to the Chunk found by the vector search
retrieval_query = """
MATCH (node)-[:PART_OF]->(doc:Document)
OPTIONAL MATCH (entity:Entity)-[:MENTIONED_IN]->(node)
RETURN
    node.text AS text,
    doc.name AS source_doc,
    collect(entity.name) AS related_entities,
    collect(entity.description) AS entity_context
"""

# 4. Create the Unified Retriever
retriever = VectorCypherRetriever(
    driver,
    index_name="chunk_index",  # The index we created earlier
    embedder=embedder,
    retrieval_query=retrieval_query
)

def get_llm_context(user_question):
    # This single call embeds the question, finds chunks, and hops to entities
    search_results = retriever.search(query_text=user_question, top_k=3)
    context_blocks = []
    for result in search_results.items:
        block = f"SOURCE: {result.metadata['source_doc']}\n"
        block += f"TEXT: {result.content}\n"
        block += f"ENTITIES: {', '.join(result.metadata['related_entities'])}\n"
        block += f"ENTITY INFO: {result.metadata['entity_context']}\n"
        context_blocks.append(block)
    return "\n---\n".join(context_blocks)

# Example Usage
context_for_prompt = get_llm_context("What are the risks mentioned for Tesla?")
print(context_for_prompt)
```
## Why this is better than manual follow-up queries:
Reduced Latency: You are making one trip to the database instead of two.
Metadata Injection: The retrieval_query lets you reshape the data exactly how your LLM likes it (e.g., combining the chunk text with its parent document name and related entity metadata).
Automatic Embedding: You don't have to manually call OpenAI to embed the user's question; the VectorCypherRetriever does it for you.
## Pro-Tip: The "Aura Agent" Shortcut
If you are on Neo4j AuraDB, you can now use the Aura Agent (released Feb 2026). In the Aura Console, you can simply click "Create Agent", point it at your graph, and it will automatically generate these retrieval tools and provide a ready-to-use API endpoint, skipping the need to write the Python retrieval logic yourself.
Would you like me to explain how to structure the final LLM System Prompt to make the best use of this combined "Graph + Vector" context?
Yes
To make the most of the Document + Chunk + Entity structure, your system prompt needs to explicitly tell the LLM how to weigh "unstructured" text (the chunk) against "structured" facts (the entity nodes).
In 2026, the best practice is to provide the LLM with a structured context block and a clear set of reasoning instructions.
## Recommended System Prompt Structure
```
You are an expert AI Analyst with access to a Real-Time Knowledge Graph.
Your goal is to answer questions using a combination of document text and verified graph entities.

### INSTRUCTIONS:
1. PRIORITY: Use the [GRAPH ENTITIES] section for hard facts (names, dates, relationships).
2. CONTEXT: Use the [DOCUMENT CHUNKS] for nuance, sentiment, and detailed explanations.
3. REASONING: If the [DOCUMENT CHUNKS] mention an entity, cross-reference it with the [GRAPH ENTITIES] to ensure it matches verified data.
4. CITATION: Always cite the "Source Document" provided in the metadata.
5. UNCERTAINTY: If the graph and the document provide conflicting information, highlight the discrepancy to the user.

### PROVIDED CONTEXT:
{context}

### USER QUESTION:
{user_query}
```
## How to Format the {context} variable
When you pass data from your Python retrieval script into this prompt, format it clearly so the LLM doesn't get "lost in the tokens."
Example formatted context:
```
[DOCUMENT CHUNKS]
Source: annual_report.pdf | Text: "Tesla's expansion into the Texas market has accelerated due to new gigafactory output."

[GRAPH ENTITIES]
Entity: Tesla Inc | Type: Company | Industry: Automotive
Relationship: (Tesla Inc)-[:OPERATES_IN]->(Texas)
Fact: Tesla has 3 active projects in this region.
```
## Why this works
Separation of Concerns: By labeling sections as [GRAPH ENTITIES] and [DOCUMENT CHUNKS], the LLM understands that the graph data is an "anchor of truth" while the text is "supporting detail."
Conflict Resolution: Explicitly telling the LLM to look for discrepancies (Instruction #5) is a 2026 standard for reducing hallucinations in Enterprise AI.
Multi-hop Awareness: Because your retriever already "hopped" from the Chunk to the Entity, the LLM receives a pre-connected package of information, allowing it to answer complex "Why" and "How" questions instantly.
## Final Pro-Tip: The "Aura Agent" Shortcut
If you are using Neo4j AuraDB, you can find a "Prompts" tab inside the Aura Agent console. It will actually auto-generate a draft of this system prompt based on your specific graph schema (nodes and relationships) so you don't have to write it from scratch.
Would you like me to show you how to add a "Re-Ranker" to this pipeline to ensure only the most relevant graph entities make it into the final prompt?
Yes
In 2026, the standard for high-performance RAG has moved beyond simple vector search. Adding a Re-Ranker acts as a "second pass" that takes the top 50–100 results from Neo4j and narrows them down to the top 5–10 most relevant items using a more computationally expensive (but highly accurate) model.
The current industry leaders for 2026 are Qwen3-Reranker and ZeroEntropy (zerank-1).
## 1. Why you need a Re-Ranker
Vector search (Cosine Similarity) is great at finding broad semantic matches, but it’s "blind" to specific word order and nuanced context.
Vector Search (Stage 1): Finds 100 "maybe" matches quickly.
Re-Ranker (Stage 2): Looks at the actual user question vs. each chunk and re-orders them, moving the "true" answer to the #1 spot even if its vector similarity was lower.
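The two-stage idea can be demonstrated without any model at all by swapping in a stand-in scorer. Here simple term overlap plays the role of the cross-encoder; a real pipeline would replace `term_overlap` with a call to your re-ranker of choice (the candidate passages are invented examples):

```python
def term_overlap(query: str, passage: str) -> int:
    """Stand-in for a cross-encoder score: count of shared lowercase terms."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def rerank(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Stage 2: re-order the broad vector-search candidates, keep the best top_k."""
    return sorted(passages, key=lambda p: term_overlap(query, p), reverse=True)[:top_k]

candidates = [
    "Quarterly revenue discussion.",
    "Tesla factory output in Texas rose sharply.",
    "Unrelated weather report.",
]
print(rerank("Tesla output in Texas", candidates, top_k=1))
```

Even with this crude scorer, the passage that actually matches the question moves to the top, which is exactly what the cross-encoder stage does with far more nuance.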
## 2. Python Implementation with neo4j-graphrag
In the latest version of the Neo4j GraphRAG package, you can inject a re-ranking step directly into your retrieval pipeline.
```python
from neo4j_graphrag.retrievers import VectorCypherRetriever
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

# Using a 2026 SOTA Reranker (e.g., via Cohere, Voyage, or ZeroEntropy).
# `reranker_model` is a placeholder for whichever cross-encoder client you use.
def rerank_results(query, initial_results):
    # This is where you call your Re-Ranker API
    # Example using a generic cross-encoder pattern:
    pairs = [[query, res.content] for res in initial_results]
    scores = reranker_model.predict(pairs)
    # Sort results by the new scores
    ranked = sorted(zip(initial_results, scores), key=lambda x: x[1], reverse=True)
    return [r[0] for r in ranked[:5]]  # Return top 5

# Your existing retriever
retriever = VectorCypherRetriever(
    driver,
    index_name="chunk_index",
    embedder=OpenAIEmbeddings(),
    retrieval_query="MATCH (node)-[:PART_OF]->(d) RETURN node.text, d.name"
)

def get_final_context(user_question):
    # 1. Broad fetch from Neo4j (top 50)
    raw_results = retriever.search(query_text=user_question, top_k=50)
    # 2. Precision Re-Rank
    final_results = rerank_results(user_question, raw_results.items)
    return final_results
```
## 3. 2026 Re-Ranker Comparison
| Model | Type | Best For | 2026 Benchmark (NDCG@10) |
| --- | --- | --- | --- |
| Qwen3-Reranker-4B | Open Weights | Optimal balance of speed/accuracy | 0.88 |
| ZeroEntropy zerank-1 | API | Lowest latency (200ms) for production | 0.85 |
| Cohere Rerank v4 | API | Best multilingual support (100+ languages) | 0.84 |
| Qwen3-Reranker-8B | Open Weights | Maximum accuracy for legal/medical data | 0.91 |
## The "Aura Agent" (The No-Code Way)
If you don't want to manage these API calls yourself, the Neo4j Aura Agent (generally available as of Feb 2026) has a "Re-ranking" toggle in the configuration dashboard.
Open your Aura Console.
Go to Agents > Settings.
Enable "Advanced Retrieval Optimization".
This automatically applies a cross-encoder re-ranker to your vector searches before they are sent to the LLM, reducing hallucinations by an estimated 35%.
Would you like me to show you how to measure the "Faithfulness" of your RAG system using RAGAS, so you can see exactly how much the re-ranker improves your results?
Yes
In 2026, the RAGAS (Retrieval-Augmented Generation Assessment) framework remains the "gold standard" for measuring how well your LLM uses the data you've retrieved.
The Faithfulness metric specifically answers: "Is every claim in the LLM's answer actually supported by the retrieved chunks and graph entities?" If the LLM makes something up that isn't in your Neo4j context, the Faithfulness score drops.
## 1. How RAGAS Faithfulness is Calculated
RAGAS uses a "judge" LLM to perform a 3-step audit:
Decomposition: Breaks the LLM's final answer into individual factual statements.
Verification: Checks each statement against the retrieved context (your Chunk.text and Entity properties).
Scoring:

Faithfulness = (number of claims supported by context) / (total number of claims in the response)
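The ratio itself is trivial once the judge LLM has labeled each claim; for example:

```python
def faithfulness(supported_claims: int, total_claims: int) -> float:
    """RAGAS-style faithfulness: supported claims / total claims."""
    if total_claims == 0:
        raise ValueError("response contained no claims")
    return supported_claims / total_claims

# An answer with 4 claims, 3 of them grounded in the retrieved context:
print(faithfulness(3, 4))  # 0.75
```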
## 2. Python Implementation for Neo4j
To use this with your pipeline, you’ll need to install ragas and datasets.
```python
import asyncio

from ragas.metrics import Faithfulness
from ragas.llms import LangchainLLMWrapper
from ragas.dataset_schema import SingleTurnSample
from langchain_openai import ChatOpenAI

# 1. Prepare your evaluator (using a strong model like GPT-4o for judging)
evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
faithfulness_metric = Faithfulness(llm=evaluator_llm)

# 2. Gather a "Sample" of your RAG output
# You need the question, the retrieved context string, and the generated answer.
sample_data = {
    "user_input": "What is Tesla's strategy in Texas?",
    "retrieved_contexts": [
        "Tesla is expanding gigafactory output in Texas.",  # From Chunk
        "Entity: Tesla Inc, Status: Active in Texas"        # From Graph Entity
    ],
    "response": "Tesla is growing its Texas operations via its gigafactory."
}

# 3. Run the Evaluation
# RAGAS uses 'SingleTurnSample' for quick individual checks;
# 'single_turn_ascore' is a coroutine, so run it in an event loop.
sample = SingleTurnSample(**sample_data)
score = asyncio.run(faithfulness_metric.single_turn_ascore(sample))
print(f"Faithfulness Score: {score:.2f}")
# A 1.0 means perfectly grounded; < 0.7 means you have a hallucination problem.
```
## 3. Common Evaluation Metrics for GraphRAG
While Faithfulness is key, you should track the "RAG Triad" to get the full picture:
| Metric | Focus | What it tells you |
| --- | --- | --- |
| Faithfulness | Generation | Did the LLM hallucinate or stick to the graph data? |
| Answer Relevancy | Generation | Did the LLM actually answer the user's specific question? |
| Context Recall | Retrieval | Did the Neo4j query find all the relevant entities needed? |
| Context Precision | Retrieval | Were the most relevant chunks at the top of the list? |
## 2026 Pro-Tip: Evaluation in the Aura Console
If you are using Neo4j AuraDB, the Aura Agent dashboard now has a built-in "Evaluation" tab. It automatically runs RAGAS-style checks on a subset of your production traffic and alerts you if your Faithfulness score dips below a threshold (e.g., 0.85). This saves you from having to build a custom evaluation server.
Would you like me to help you design a "Synthetic Test Set" generator that creates 50+ fake questions based on your specific graph data to stress-test your system?
Yes
In 2026, the most effective way to stress-test your GraphRAG system is to use the RAGAS "In-Depth Evolution" method.
Unlike simple "Q&A" generators that just rephrase text, this approach uses your graph to create Multi-Hop and Reasoning questions that are specifically designed to break weak retrieval pipelines.
## 1. The Strategy: Evolutionary Synthesis
RAGAS generates a test set by looking at your Document → Chunk → Entity graph and creating three types of questions:
Simple: Direct questions from one chunk. ("What is Tesla's revenue?")
Reasoning: Requires logical inference. ("Why did the Texas factory expansion impact the 2025 revenue?")
Multi-Context: Requires combining two unrelated chunks. ("Compare the growth of the Austin plant with the output of the Berlin factory.")
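For a 50-question set with that mix, the per-type counts are straightforward arithmetic; a quick sketch:

```python
def question_counts(test_size: int, distributions: dict[str, float]) -> dict[str, int]:
    """Convert fractional distributions into whole question counts per type."""
    return {kind: round(test_size * frac) for kind, frac in distributions.items()}

print(question_counts(50, {"simple": 0.4, "reasoning": 0.3, "multi_context": 0.3}))
# {'simple': 20, 'reasoning': 15, 'multi_context': 15}
```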
## 2. Python Implementation
You can use the ragas library to scan your Neo4j nodes and generate these pairs automatically.
```python
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# 1. Initialize the Generator with a strong "judge" model
generator_llm = ChatOpenAI(model="gpt-4o")
generator_embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(
    generator_llm,
    generator_llm,  # Critic model
    generator_embeddings
)

# 2. Extract your Chunks from Neo4j to use as the "Source of Truth"
# (You'd typically pull your Chunk.text properties here)
documents = load_chunks_from_neo4j()

# 3. Generate 50 unique Q&A pairs with a specific "Evolution" mix
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=50,
    distributions={
        simple: 0.4,        # 40% easy
        reasoning: 0.3,     # 30% medium (needs graph logic)
        multi_context: 0.3  # 30% hard (needs multiple hops)
    }
)

# 4. Export to a dataframe for your evaluation loop
test_df = testset.to_pandas()
test_df.to_csv("neo4j_stress_test.csv")
```
## 3. 2026 New Feature: Aura Agent "Auto-Test"
If you are using the Neo4j Aura Agent (released Feb 2026), you don't even need to write the script above.
Step 1: In the Aura Console, navigate to your Agent Dashboard.
Step 2: Click the "Evaluation" tab.
Step 3: Select "Generate Synthetic Test Set".
Step 4: The platform will analyze your Graph Ontology (labels and relationships) and automatically generate 50–100 questions that target your most complex relationship paths (e.g., finding questions that require traversing 3+ nodes).
## Why this matters
By creating these "hard" questions, you can identify exactly where your graph schema is failing. For example, if the generator creates a multi-hop question that your system can't answer, it usually means you're missing a Relationship between two entities that would have connected those facts.
Would you like me to show you how to automate this as a "CI/CD" step, so your Faithfulness score is checked every time you add new documents to AuraDB?
Yes
In 2026, automating GraphRAG evaluation is no longer just a "nice-to-have"—it's a requirement for ensuring that new document ingestions don't degrade the accuracy of your system.
By combining GitHub Actions, the Neo4j Aura CLI, and RAGAS, you can create a pipeline that automatically tests your "Faithfulness" score every time you push code or add data.
## 1. The CI/CD Workflow Architecture
The pipeline follows a three-step "Sandwich" pattern:
Spin Up: Use the Aura CLI to create a temporary "Shadow" database (or a specific test instance).
Evaluate: Run your RAGAS script against the shadow database.
Spin Down: Delete the shadow instance to save costs.
## 2. GitHub Actions Configuration (.github/workflows/rag-eval.yml)
```yaml
name: GraphRAG Faithfulness Check

on:
  push:
    branches: [ main ]
  workflow_dispatch:  # Allows manual triggering

jobs:
  evaluate-rag:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Dependencies
        run: |
          pip install aura-cli ragas neo4j-graphrag langchain-openai

      - name: Create Shadow Aura Instance
        env:
          AURA_CLIENT_ID: ${{ secrets.AURA_CLIENT_ID }}
          AURA_CLIENT_SECRET: ${{ secrets.AURA_CLIENT_SECRET }}
        run: |
          # 2026 Aura CLI command to create a temporary test instance
          aura instances create --name "CI-Test-DB" --tier "free" --region "us-east-1" > db_info.json
          echo "NEO4J_URI=$(jq -r .connection_url db_info.json)" >> $GITHUB_ENV
          echo "NEO4J_PASSWORD=$(jq -r .password db_info.json)" >> $GITHUB_ENV

      - name: Run RAGAS Evaluation
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          NEO4J_URI: ${{ env.NEO4J_URI }}
          NEO4J_PASSWORD: ${{ env.NEO4J_PASSWORD }}
        run: |
          # This runs the synthetic test set we designed in the previous step
          python scripts/run_eval.py --threshold 0.85

      - name: Cleanup Aura Instance
        if: always()
        run: |
          aura instances delete --name "CI-Test-DB" --force
```
## 3. The "Guardrail" Script (run_eval.py)
This script acts as the gatekeeper. If the Faithfulness score is below your threshold, it exits with an error, failing the build.
```python
import sys
import argparse
from ragas import evaluate
# ... (your existing RAGAS setup from the previous steps)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--threshold", type=float, default=0.85)
    args = parser.parse_args()

    # 1. Run evaluation
    results = evaluate(dataset, metrics=[faithfulness])
    score = results['faithfulness']
    print(f"Current Faithfulness Score: {score}")

    # 2. Enforcement Logic
    if score < args.threshold:
        print(f"❌ FAIL: Faithfulness {score} is below threshold {args.threshold}")
        sys.exit(1)
    else:
        print("✅ PASS: Grounding is within safe limits.")

if __name__ == "__main__":
    main()
```
## Why this setup is essential in 2026
Zero Latency Evaluation: By using the Aura CLI, you are testing against a real cloud environment, ensuring that network latency or database configuration isn't skewing your RAGAS results.
Cost Management: Using the "Spin Up / Spin Down" method ensures you only pay for the 5–10 minutes it takes to run the tests.
Preventing "Model Drift": As you add new types of documents, your graph ontology might change. This pipeline catches cases where the LLM starts preferring "noisy" new text over "clean" graph facts.
## Pro-Tip: The "Aura Agent" Integration
As of early 2026, the Aura Agent API allows you to trigger these evaluations via a single Webhook. Instead of writing the full script, you can simply call:
`POST /v1/agents/{agent_id}/evaluate`
This will return a JSON report of all RAGAS metrics directly in your CI/CD logs.
Would you like me to help you set up an Alerting System (e.g., via Slack or PagerDuty) that triggers if the Faithfulness score drops in your live production environment?
From <https://gemini.google.com/app/e962e1de328bff6e> Google Gemini 3.0 (Fast)
