#3714 Post-Semantic Search RAG Prompt

#3714 After apply semantic search across a set of embedded documents, what in addition to the original user prompt and the results of the search should an LLM prompt include to summarize the search results as they relate to the original prompt?

To effectively summarize semantic search results, an LLM prompt should include explicit instructions and constraints in addition to the original user prompt and the search results. These instructions guide the model to generate a factual, relevant, and verifiable summary.

The key additions fall into three main categories:

## 1. Grounding Instructions and Constraints grounding

This is the most critical addition. You must explicitly tell the LLM how to use the provided information and what not to do. This helps prevent the model from using its own knowledge (hallucinating) and ensures the answer is based solely on the search results.

Role Definition: Define the model's role. For example: "You are a helpful assistant that answers user questions based only on the provided search results."
The Grounding Rule: State the core constraint clearly. For example: "Base your answer strictly on the information given in the search results below. Do not use any outside knowledge."
Handling Insufficient Information: Provide an escape hatch for when the answer isn't in the text. For example: "If the provided documents do not contain enough information to answer the question, you must state that the answer cannot be found in the provided context."

## 2. Synthesis and Citation Rules 📝

Instead of just listing facts, you want the LLM to create a coherent synthesis of the information. Requiring citations is crucial for building trust and allowing for fact-checking.

Synthesis Command: Instruct the model to connect the dots. For example: "Synthesize the key information from the following results to create a comprehensive answer."
Citation Mandate: Specify how to attribute information. For example: "Cite the source number for every claim you make using the format [Source #]." This forces the model to map its output directly to the provided text.

## 3. Desired Output Format 📄

Specify exactly how you want the final answer to be structured to ensure it's clear, concise, and useful.

Structure: Define the layout. For example: "Provide the answer as a concise paragraph," or "Summarize the findings in bullet points."
Tone and Style: Guide the tone of the response. For example: "The summary should be neutral, objective, and easy to understand."

## Example Prompt Template

Here is a template that combines all these elements:

**Role:** You are an expert Q&A assistant. Your goal is to provide a clear and accurate answer to the user's question based *exclusively* on the provided search results.

**Instructions:**
1.  Carefully read the user's question and the provided search results.
2.  Synthesize the information from all sources to formulate a comprehensive answer.
3.  **Crucially, you must not use any information outside of the provided search results.**
4.  For each piece of information in your answer, you must cite the corresponding source number in brackets, like this: [Source #].
5.  If the search results do not contain an answer to the question, you must respond with: "I could not find an answer to your question in the provided documents."

**User Question:**
{original_user_prompt}

**Search Results:**
[Source 1] {text_chunk_from_document_1}
[Source 2] {text_chunk_from_document_2}
[Source 3] {text_chunk_from_document_3}
...

**Answer:**

Reword for the role as an expert NASA engineer, capable of synthesizing inputs for any space program domain

Here is a flight-ready mission directive for synthesizing technical inputs across any space program domain. This is how we fuse telemetry and mission data into actionable intelligence.

Mission Directive: Synthesizing Disparate Program Data

Role: Senior Mission Specialist. Your function is to act as a data fusion expert, capable of integrating and synthesizing multi-domain technical inputs into a unified, coherent summary. Your analysis must be grounded entirely within the provided data streams.

Core Directive: Your primary objective is to construct a concise, accurate, and verifiable summary that directly addresses the core query. This is a data-driven task; external knowledge or assumptions are mission-critical failures.

Operational Parameters & Protocols:

Your prompt structure for the LLM must enforce the following protocols to ensure mission success:

1. Establish Ground Truth (Data Grounding Protocol)

This is the critical flight rule. The system must be locked to the provided data feeds to prevent fabrication or "mission hallucination."

System Constraint: Define the system's operational boundaries. Command: "You are a NASA data analysis system. Your analysis and conclusions must be based solely on the provided technical documentation and telemetry data."
Contingency Protocol: Define a clear procedure for incomplete data. Command: "If the provided data is insufficient to generate a conclusive answer, you will state: 'Insufficient data to provide a definitive conclusion based on the provided inputs.'"

2. Data Fusion & Attribution (Synthesis & Traceability Protocol)

The goal is not to echo raw data, but to fuse it into a higher-level understanding. Every piece of that understanding must be traceable to its source.

Fusion Command: Instruct the system to integrate, not just list. Command: "Synthesize the key findings from the following data sources into a unified technical summary."
Data Traceability Mandate: Require explicit links back to the source data for verification. Command: "Every assertion in your summary must be appended with a citation pointing to the source document and section, e.g.,
$D OC - 3, SEC - B .2$
." This ensures a clear audit trail for all synthesized information.

3. Output Formatting (Standardized Reporting Protocol)

Information must be delivered in a standardized format for rapid, unambiguous comprehension by mission stakeholders.

Structure Definition: Specify the required output structure. Command: "Format the summary as a series of bullet points, each addressing a key finding, followed by a concluding paragraph that summarizes the overall assessment."
Tone & Lexicon: Set the communication standard. Command: "The tone will be objective, technical, and quantitative. Use precise engineering and mission-specific terminology."

Standardized Prompt Template

This is the flight-ready template for all data synthesis tasks:

**MISSION ROLE:** NASA Senior Mission Specialist (Data Fusion).

**DIRECTIVE:**
1.  Analyze the Operator Query and the provided Mission Data Packets.
2.  Fuse the data to construct a unified technical summary that directly addresses the query.
3.  **Flight Rule 1:** Base your summary *exclusively* on the provided data packets. Do not introduce external information.
4.  **Flight Rule 2:** Every key finding in your summary must be traced to its source using the format [Packet #].
5.  **Contingency:** If the data packets do not contain the required information, your response will be: "Insufficient data to provide a definitive conclusion."

**Operator Query:**
{original_user_prompt}

**Mission Data Packets:**
[Packet 1] {technical_document_or_data_chunk_1}
[Packet 2] {technical_document_or_data_chunk_2}
[Packet 3] {technical_document_or_data_chunk_3}
...

**Synthesized Report:**

From <https://gemini.google.com/app/> Google Gemini 2.5 Pro

Search This Blog

Learning from ChatGPT