#3916 RAG Final Prompt
I have Python code that collects the top 3 (top-k=3) best matches from semantic search, along with the chunk indexes of the best matches. Provide a prompt that I can submit to the Gemini LLM as the final step of RAG.
Here is a robust, structured prompt template designed for Gemini.
Since you have the chunk indexes, this prompt includes instructions for citations. This is a major advantage in RAG: it lets the LLM show where it found the answer (e.g., "The release date is 2024 [Source: 2]").
The Prompt Template
You are a helpful and accurate Knowledge Assistant. Your goal is to answer the User Query using ONLY the provided Context Information.
### INSTRUCTIONS:
1. **Strict Grounding:** Answer solely based on the information provided in the "Context" section below. Do not use outside knowledge or make assumptions.
2. **Citations:** Every factual statement in your answer must be supported by a citation to the specific Context Chunk ID (e.g., "[Source: 1]").
3. **Ambiguity:** If the provided context does not contain enough information to answer the query, explicitly state: "The provided context does not contain sufficient information to answer this question."
4. **Tone:** Be concise, professional, and direct.
### CONTEXT INFORMATION:
{formatted_context_string}
### USER QUERY:
{user_query}
### ANSWER:
Python Implementation Code
Here is how to format your search results to fit the {formatted_context_string} variable in the prompt above.
I am assuming your search results look something like a list of objects or tuples: (text_content, chunk_index).
import google.generativeai as genai

# Set up the Gemini API (assuming you have this configured)
# genai.configure(api_key="YOUR_API_KEY")
# model = genai.GenerativeModel('gemini-1.5-flash')

def generate_rag_response(user_query, best_matches):
    """
    Args:
        user_query (str): The original question.
        best_matches (list): List of tuples/dicts containing (text, index).
            Example: [("Text from doc...", 102), ("More text...", 105)]
    """
    # 1. Format the context chunks with their IDs.
    #    Clear delimiters and headers help Gemini separate chunks.
    context_parts = []
    for match in best_matches:
        # Assuming match structure is (text, chunk_index)
        text = match[0]
        index = match[1]
        context_parts.append(f"---\n[Source: {index}]\n{text}\n---")

    formatted_context_string = "\n".join(context_parts)

    # 2. Define the prompt template (f-string)
    prompt = f"""
You are a helpful and accurate Knowledge Assistant. Your goal is to answer the User Query using ONLY the provided Context Information.

### INSTRUCTIONS:
1. **Strict Grounding:** Answer solely based on the information provided in the "Context" section below. Do not use outside knowledge.
2. **Citations:** Every factual statement in your answer must be supported by a citation to the specific Source ID (e.g., "[Source: 12]").
3. **No Hallucination:** If the context does not answer the question, say "I don't know based on the provided info."

### CONTEXT INFORMATION:
{formatted_context_string}

### USER QUERY:
{user_query}

### ANSWER:
"""

    # 3. Send to Gemini
    # response = model.generate_content(prompt)
    # return response.text
    return prompt  # Returning the prompt here to show the output


# --- Example Usage ---
# Mock data representing your semantic search results: (text content, chunk index)
mock_matches = [
    ("The Apollo program was designed to land humans on the Moon and bring them safely back to Earth.", 14),
    ("Six missions landed on the Moon: Apollos 11, 12, 14, 15, 16, and 17.", 15),
    ("Apollo 13 was intended to land, but a technical failure forced it to circle the Moon and return.", 16)
]

query = "Which Apollo missions actually landed on the moon?"
final_prompt = generate_rag_response(query, mock_matches)
print(final_prompt)
Why this structure works
- **Clear Delimiters** (`---` or `###`): LLMs can sometimes get confused about where the "instructions" end and the "data" begins. Using headers like `### CONTEXT INFORMATION` acts as a hard boundary.
- **Source ID Injection:** By prepending `[Source: {index}]` to the text inside the prompt string, you allow Gemini to "see" the ID associated with that specific block of text (see the example just below).
- **Negative Constraints:** Explicitly telling the model not to use outside knowledge reduces hallucinations. You want the model to act as a processor of your data, not a creative writer.
- **Handling "I don't know":** It is vital to tell the model what to do if the semantic search failed (i.e., the retrieved chunks are irrelevant). Without this instruction, the model might try to make up an answer just to be helpful.
Next Step
Would you like me to adjust this prompt to handle conflicting information (e.g., if Chunk A says "Yes" and Chunk B says "No")?
Yes
Here is the updated Python code and prompt template designed to handle conflicting data.
In RAG systems, it is common to retrieve an old document (e.g., "The launch is in 2024") and a new document (e.g., "The launch is delayed to 2025") simultaneously. This prompt instructs Gemini to expose the conflict rather than guessing which one is right.
The Conflict-Aware Prompt Template
The key addition here is Instruction #3 (Handling Conflicts), which forces the model to act as a neutral reporter when data clashes.
import google.generativeai as genai

def generate_conflict_aware_rag(user_query, best_matches):
    """
    Args:
        user_query (str): The original question.
        best_matches (list): List of tuples (text, chunk_index).
    """
    # 1. Format the context with clear Source IDs
    context_parts = []
    for match in best_matches:
        text = match[0]
        index = match[1]
        context_parts.append(f"---\n[Source ID: {index}]\n{text}\n---")

    formatted_context_string = "\n".join(context_parts)

    # 2. Define the prompt
    prompt = f"""
You are an intelligent Knowledge Assistant. Answer the User Query using ONLY the provided Context Information.

### INSTRUCTIONS:
1. **Strict Grounding:** Answer solely based on the "Context Information" below. Do not use outside knowledge.
2. **Citations:** Every factual statement must include a citation (e.g., "[Source ID: 12]").
3. **Handling Conflicts:** If two sources provide contradicting information:
   - Do not try to guess which one is correct.
   - Explicitly state the conflict (e.g., "Source 1 states X, while Source 2 states Y").
4. **Uncertainty:** If the context does not contain the answer, state "I cannot answer this based on the provided context."

### CONTEXT INFORMATION:
{formatted_context_string}

### USER QUERY:
{user_query}

### ANSWER:
"""
    return prompt


# --- Example of How This Handles Conflicts ---
# Mock data: two conflicting chunks about temperature, plus one unrelated chunk
mock_matches = [
    ("The maximum operating temperature is 150 degrees.", 101),
    ("Updated guidelines: The maximum operating temperature has been lowered to 120 degrees.", 205),
    ("Normal operating pressure is 50 psi.", 300)
]

query = "What is the max temperature?"
final_prompt = generate_conflict_aware_rag(query, mock_matches)

# If you sent this prompt to Gemini, the expected output would look like:
# "There is conflicting information in the provided context.
#  [Source ID: 101] states the maximum temperature is 150 degrees,
#  however, [Source ID: 205] states it has been lowered to 120 degrees."
Why this is safer
If you don't explicitly tell the LLM to highlight conflicts, it will often try to smooth things over: it might average the numbers, pick the figure that sounds more authoritative, or simply take the first one it reads. By forcing it to report "Source A says X, Source B says Y," you (the human) can decide which chunk is the source of truth.
Next Step
To automatically resolve these conflicts (so the LLM knows which version is actually the latest), we would need to inject dates into the context string.
Would you like me to update the Python code to accept a date field in your matches and instruct the LLM to prioritize the most recent source?
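For reference, here is a minimal sketch of what that date-aware variant could look like, assuming each match arrives as a hypothetical (text, chunk_index, date_string) tuple; adapt the field names and date format to whatever your search layer actually returns:

def generate_date_aware_rag(user_query, best_matches):
    """
    best_matches: list of (text, chunk_index, date_string) tuples, e.g.
        [("Old spec...", 101, "2023-04-01"), ("New spec...", 205, "2024-06-15")]
    The date field and its format are assumptions; use the metadata your index stores.
    """
    context_parts = []
    for text, index, date in best_matches:
        # Inject the date next to the Source ID so the model can compare recency.
        context_parts.append(f"---\n[Source ID: {index} | Date: {date}]\n{text}\n---")
    formatted_context_string = "\n".join(context_parts)

    prompt = f"""
You are an intelligent Knowledge Assistant. Answer the User Query using ONLY the provided Context Information.

### INSTRUCTIONS:
1. **Strict Grounding:** Answer solely based on the "Context Information" below. Do not use outside knowledge.
2. **Citations:** Every factual statement must include a citation (e.g., "[Source ID: 12]").
3. **Recency:** If two sources conflict, prefer the one with the most recent Date, and note that an older source disagrees.
4. **Uncertainty:** If the context does not contain the answer, state "I cannot answer this based on the provided context."

### CONTEXT INFORMATION:
{formatted_context_string}

### USER QUERY:
{user_query}

### ANSWER:
"""
    return prompt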
From <https://gemini.google.com/app/> (Google Gemini 3.0 Pro)