#4463 Prompts for KG Domain, System, Subsystem, Component, Part Descriptions

#4463 I have a knowledge graph with nodes and relationships Domain->System, Subsystem, Component, Part. For each node type, provide a prompt that I can submit to an LLM to generate a useful description that can be embedded in the graph with relationship to the node

Building a knowledge graph with a clear structural hierarchy (Domain → System → Subsystem → Component → Part) is a fantastic way to model complex data. To get the most out of your LLM-generated descriptions, the prompts need to enforce a specific scope for each level. If you don't, the LLM will likely hallucinate details from other levels or repeat itself.

Here is a set of prompts, one per level, designed to keep descriptions context-aware, consistently structured, and well suited to graph embeddings.

1. Domain Level Prompt

Focus: High-level industry context, boundaries, and overall purpose.

You are an expert knowledge graph architect. Write a concise, high-level description for the Domain node listed below. 

Node Name: [Insert Domain Name]

Guidelines for the description:
1. Define the overall industry, field, or macro-environment this domain represents.
2. Outline the primary purpose and the high-level boundaries of this domain (what it includes and what it explicitly excludes).
3. Identify the key global objectives or outcomes this domain aims to achieve.

Formatting: Keep it to 2-3 sentences max. Focus on macro-level context. Do not mention specific technical components or parts.

2. System Level Prompt

Focus: Functional capabilities, integration, and relationship to the Domain.

You are an expert systems engineer. Write a functional description for the System node listed below.

Node Name: [Insert System Name]
Belongs to Domain: [Insert Parent Domain Name]

Guidelines for the description:
1. Explain the primary function and operational objective of this system within the [Parent Domain Name] domain.
2. Describe how it interacts with or relies on other systems at a high level.
3. Define the overall "inputs" and "outputs" of this system.

Formatting: Keep it to 3-4 sentences. Focus on operational capabilities and system boundaries. Do not list specific sub-components yet.

3. Subsystem Level Prompt

Focus: Specialized processes, localized control, and relationship to the System.

You are a technical systems specialist. Write a targeted description for the Subsystem node listed below.

Node Name: [Insert Subsystem Name]
Part of System: [Insert Parent System Name]

Guidelines for the description:
1. Define the specific, specialized role this subsystem plays in supporting the larger [Parent System Name].
2. Explain the mechanism, logic, or process this subsystem manages.
3. Describe how it isolates or controls specific variables/tasks for the parent system.

Formatting: Keep it to 2-3 sentences. Focus on the localized mechanics and operational logic.

4. Component Level Prompt

Focus: Physical/logical units, specific mechanisms, and relationship to the Subsystem.

You are a mechanical/software component engineer. Write a technical description for the Component node listed below.

Node Name: [Insert Component Name]
Housed within Subsystem: [Insert Parent Subsystem Name]

Guidelines for the description:
1. Identify what this component is (e.g., an assembly, a specific software module, a hardware unit) and its direct engineering function.
2. Explain how it physically or logically connects to and serves the [Parent Subsystem Name].
3. State its primary performance metrics or operational constraints (e.g., voltage, material property, data throughput), where applicable.

Formatting: Keep it to 2-3 sentences. Focus on engineering specs and direct functional utility.

5. Part Level Prompt

Focus: Atomic details, materials, replaceability, and relationship to the Component.

You are a manufacturing and inventory specialist. Write a precise, atomic description for the Part node listed below.

Node Name: [Insert Part Name]
Component it belongs to: [Insert Parent Component Name]

Guidelines for the description:
1. Define this part at its most atomic level (e.g., a specific fastener, a raw sensor, a distinct line-item asset).
2. Detail its material composition, physical characteristics, or standard manufacturing type.
3. Mention its serviceability (e.g., is it a consumable, a standard off-the-shelf part, or a custom-fabricated piece?).

Formatting: Keep it to 1-2 sentences. Focus purely on the physical/atomic reality of the item. Do not explain the high-level system logic.
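The sentence limits in the five prompts above can be checked after generation, since an LLM will not always respect them. The following is a minimal sketch, not part of the original prompts: the names MAX_SENTENCES, count_sentences, and within_limit are hypothetical, and the sentence splitter is a rough heuristic that will miscount text containing abbreviations followed by a space (such as "approx. 5 mm").

```python
import re

# Upper sentence limits copied from the "Formatting" line of each prompt.
MAX_SENTENCES = {
    "Domain": 3,
    "System": 4,
    "Subsystem": 3,
    "Component": 3,
    "Part": 2,
}


def count_sentences(text: str) -> int:
    """Rough count: split wherever ., ! or ? is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def within_limit(label: str, description: str) -> bool:
    """True if the description respects the sentence cap for its level."""
    return count_sentences(description) <= MAX_SENTENCES[label]
```

A description that fails the check could be regenerated or flagged for review before it is written to the graph.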

💡 Pro-Tips for Graph Embeddings

  • The "No-Overlap" Rule: The Domain, System, and Part prompts explicitly forbid the LLM from drifting into other levels, and the Subsystem and Component prompts narrow the focus to the same effect. When you vectorize (embed) these descriptions, the vectors are then more likely to reflect hierarchical depth rather than just repeating the same keywords across all levels.

  • Variable Injection: When automating this with a script (for example, in Python), make sure the script dynamically injects the parent node names. Including the parent node name gives the LLM the "graph context" it needs to write a relevant description.
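As a small illustration of the variable-injection tip, the sketch below fills a shortened System prompt with a node name and its parent name using str.format. The template is abbreviated, the function name build_prompt and the "Unknown Parent" fallback are assumptions for illustration, and the example names ("Braking System", "Automotive") are made up.

```python
from typing import Optional

# Abbreviated version of the System prompt; {node_name} and {parent_name}
# are the placeholders that the script fills in per node.
SYSTEM_TEMPLATE = (
    "You are an expert systems engineer. Write a functional description "
    "for the System node listed below.\n\n"
    "Node Name: {node_name}\n"
    "Belongs to Domain: {parent_name}\n"
)


def build_prompt(template: str, node_name: str,
                 parent_name: Optional[str] = None) -> str:
    """Inject the node and parent names; fall back when no parent is known."""
    return template.format(
        node_name=node_name,
        parent_name=parent_name or "Unknown Parent",
    )


# Hypothetical example values.
print(build_prompt(SYSTEM_TEMPLATE, "Braking System", "Automotive"))
```

Because str.format only parses the template and not the injected values, node names that happen to contain braces are inserted verbatim.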

Are you planning to automate this generation using a Python script connected to a specific graph database (like Neo4j), or are you running these batches manually?


Python script connected to a Neo4j graph database

Here is a Python script that uses the official neo4j driver and an LLM client to automate this process. The client follows the standard openai library structure, so it also works with OpenAI-compatible endpoints such as local Ollama instances or vLLM setups.

The script queries your Neo4j database for nodes lacking descriptions, selects the prompt template from each node's label, injects the parent context dynamically, fetches the LLM description, and writes it back to the node.

Python Automation Script

import os
from neo4j import GraphDatabase
from openai import OpenAI

# =====================================================================
# CONFIGURATION & INITIALIZATION
# =====================================================================
NEO4J_URI = os.getenv("NEO4J_URI", "bolt://localhost:7687")
NEO4J_USER = os.getenv("NEO4J_USER", "neo4j")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD", "password")

# LLM Client Configuration (Adjust base_url/api_key for OpenAI or local Ollama)
llm_client = OpenAI(
    base_url=os.getenv("LLM_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.getenv("LLM_API_KEY", "your-api-key-here")
)
MODEL_NAME = "gpt-4o"  # Or your preferred local/cloud model

db_driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))

# =====================================================================
# PROMPT TEMPLATES (Hierarchical & Context-Aware)
# =====================================================================
PROMPT_TEMPLATES = {
    "Domain": """You are an expert knowledge graph architect. Write a concise, high-level description for the Domain node listed below. 

Node Name: {node_name}

Guidelines for the description:
1. Define the overall industry, field, or macro-environment this domain represents.
2. Outline the primary purpose and the high-level boundaries of this domain (what it includes and what it explicitly excludes).
3. Identify the key global objectives or outcomes this domain aims to achieve.

Formatting: Keep it to 2-3 sentences max. Focus on macro-level context. Do not mention specific technical components or parts.""",

    "System": """You are an expert systems engineer. Write a functional description for the System node listed below.

Node Name: {node_name}
Belongs to Domain: {parent_name}

Guidelines for the description:
1. Explain the primary function and operational objective of this system within the {parent_name} domain.
2. Describe how it interacts with or relies on other systems at a high level.
3. Define the overall "inputs" and "outputs" of this system.

Formatting: Keep it to 3-4 sentences. Focus on operational capabilities and system boundaries. Do not list specific sub-components yet.""",

    "Subsystem": """You are a technical systems specialist. Write a targeted description for the Subsystem node listed below.

Node Name: {node_name}
Part of System: {parent_name}

Guidelines for the description:
1. Define the specific, specialized role this subsystem plays in supporting the larger {parent_name}.
2. Explain the mechanism, logic, or process this subsystem manages.
3. Describe how it isolates or controls specific variables/tasks for the parent system.

Formatting: Keep it to 2-3 sentences. Focus on the localized mechanics and operational logic.""",

    "Component": """You are a mechanical/software component engineer. Write a technical description for the Component node listed below.

Node Name: {node_name}
Housed within Subsystem: {parent_name}

Guidelines for the description:
1. Identify what this component is (e.g., an assembly, a specific software module, a hardware unit) and its direct engineering function.
2. Explain how it physically or logically connects to and serves the {parent_name}.
3. State its primary performance metrics or operational constraints, where applicable.

Formatting: Keep it to 2-3 sentences. Focus on engineering specs and direct functional utility.""",

    "Part": """You are a manufacturing and inventory specialist. Write a precise, atomic description for the Part node listed below.

Node Name: {node_name}
Component it belongs to: {parent_name}

Guidelines for the description:
1. Define this part at its most atomic level (e.g., a specific fastener, a raw sensor, a distinct line-item asset).
2. Detail its material composition, physical characteristics, or standard manufacturing type.
3. Mention its serviceability (e.g., is it a consumable, a standard off-the-shelf part, or a custom-fabricated piece?).

Formatting: Keep it to 1-2 sentences. Focus purely on the physical/atomic reality of the item. Do not explain the high-level system logic."""
}

# =====================================================================
# CORE FUNCTIONS
# =====================================================================
def get_nodes_missing_descriptions(tx):
    """
    Finds all nodes matching your hierarchy labels that don't have a description property.
    Also dynamically fetches their parent node name based on incoming relationships.
    """
    query = """
    MATCH (n)
    WHERE (n:Domain OR n:System OR n:Subsystem OR n:Component OR n:Part)
      AND n.description IS NULL
    OPTIONAL MATCH (parent)-[:HAS_SYSTEM|HAS_SUBSYSTEM|HAS_COMPONENT|HAS_PART]->(n)
    RETURN 
        id(n) AS node_id, 
        labels(n)[0] AS label, 
        n.name AS node_name, 
        coalesce(parent.name, 'Unknown Parent') AS parent_name
    """
    result = tx.run(query)
    return [record.data() for record in result]

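def update_node_description(tx, node_id, label, description):
    """Updates a specific node with the generated description text.

    `label` is not used by the query; it is kept so existing callers still work.
    """
    # Plain string: the query uses Cypher parameters only, so no f-string is needed.
    # Note: id() is deprecated in Neo4j 5 in favor of elementId().
    query = """
    MATCH (n) WHERE id(n) = $node_id
    SET n.description = $description
    """
    tx.run(query, node_id=node_id, description=description)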

def generate_description(label, node_name, parent_name):
    """Calls the LLM to build the targeted text snippet."""
    template = PROMPT_TEMPLATES.get(label)
    if not template:
        return None
        
    formatted_prompt = template.format(node_name=node_name, parent_name=parent_name)
    
    try:
        response = llm_client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": "You are a precise, technical documentation engine. Provide output strictly matching the formatting constraints provided."},
                {"role": "user", "content": formatted_prompt}
            ],
            temperature=0.2 # Low temperature keeps descriptions objective and consistent
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        print(f"❌ Error communicating with LLM: {e}")
        return None

# =====================================================================
# MAIN RUNNER
# =====================================================================
def main():
    print("Connecting to Neo4j database...")
    with db_driver.session() as session:
        # 1. Fetch missing items
        print("Scanning graph for nodes lacking descriptions...")
        # execute_read replaces read_transaction, which is deprecated in neo4j driver 5.x
        nodes_to_process = session.execute_read(get_nodes_missing_descriptions)
        
        if not nodes_to_process:
            print("✅ All hierarchical nodes already have descriptions.")
            return
            
        print(f"Found {len(nodes_to_process)} nodes to process.")
        
        # 2. Iterate and enrich
        for node in nodes_to_process:
            n_id = node['node_id']
            label = node['label']
            name = node['node_name']
            parent = node['parent_name']
            
            print(f"Processing [{label}] Node: '{name}' (Parent: '{parent}')...")
            
            description = generate_description(label, name, parent)
            
            if description:
                # execute_write replaces write_transaction, which is deprecated in neo4j driver 5.x
                session.execute_write(update_node_description, n_id, label, description)
                print(" -> Successfully appended description to graph.")
            else:
                print(f" -> ⚠️ Skipped node '{name}' due to generation failure.")

    db_driver.close()
    print("Process complete.")

if __name__ == "__main__":
    main()

Highlights of this setup:

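  • Dynamic Graph Traversal: The Cypher query uses an OPTIONAL MATCH to pull the direct parent node's name on the fly. This fills the parent placeholder without needing separate scripts for every layer. The relationship types in the query (HAS_SYSTEM, HAS_SUBSYSTEM, HAS_COMPONENT, HAS_PART) are placeholders; replace them with the types your graph actually uses.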

  • Low Temperature Setting: Setting temperature=0.2 helps keep the language stable, professional, and descriptive rather than overly creative.

  • Idempotency: Because it explicitly filters on n.description IS NULL, you can run this script repeatedly as a scheduled job. As new systems, subsystems, or components are added to your graph over time, it will process only the nodes that still lack a description, including any that were skipped after an earlier LLM failure.
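Two edge cases are worth guarding against when the query results come back. The OPTIONAL MATCH returns one row per matching parent, so a node with two parents would be processed twice. Taking the first entry of labels(n) can also pick a non-hierarchy label on nodes that carry several labels. The sketch below is one possible way to handle both in plain Python. It assumes the query were changed to return the full label list under a key named "labels"; that key, along with the function names pick_hierarchy_label and dedupe_by_node, is hypothetical.

```python
HIERARCHY = ["Domain", "System", "Subsystem", "Component", "Part"]


def pick_hierarchy_label(labels):
    """Return the first hierarchy label present on the node, or None."""
    for level in HIERARCHY:
        if level in labels:
            return level
    return None


def dedupe_by_node(rows):
    """Keep only the first row seen for each node_id (first parent wins)."""
    seen = {}
    for row in rows:
        seen.setdefault(row["node_id"], row)
    return list(seen.values())


# Hypothetical rows: node 1 has an extra label and two parents.
rows = [
    {"node_id": 1, "labels": ["Asset", "Part"], "parent_name": "Pump A"},
    {"node_id": 1, "labels": ["Asset", "Part"], "parent_name": "Pump B"},
    {"node_id": 2, "labels": ["Component"], "parent_name": "Cooling Loop"},
]
for row in dedupe_by_node(rows):
    print(row["node_id"], pick_hierarchy_label(row["labels"]), row["parent_name"])
```

Keeping the first parent is an arbitrary choice; joining all parent names into the prompt would be an equally reasonable design if parts are shared across components.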


From <https://gemini.google.com/app/>  Google Gemini (3.5 Flash)
