#3895 Natural Language to DSL
Consider a DSL using natural language noun1-verb-noun2 that equates to function verb(noun1, noun2)
This architecture—transforming Noun-Verb-Noun (Infix notation) into Verb(Noun, Noun) (Prefix/Functional notation)—is the foundational structure of subject-predicate-object triples. This is highly effective for building readable scripts that map directly to executable code or Knowledge Graph insertions.
Here is a breakdown of how to implement this DSL pattern effectively.
1. The Conceptual Mapping
You are essentially performing a syntactic transformation from natural language to a function signature.
| Component | Natural Language (DSL) | Functional Equivalent | Role |
|---|---|---|---|
| Noun 1 | `System` | `arg1` | Subject / Caller |
| Verb | `initiates` | `function_name` | Method / Predicate |
| Noun 2 | `Sequence` | `arg2` | Object / Parameter |
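Concretely, the table's example row can be mechanized in a few lines (a minimal sketch; the triple is assumed to be three single tokens):

```python
def to_functional(dsl_command):
    """Rewrite 'Noun1 Verb Noun2' into 'verb(noun1, noun2)' notation."""
    noun1, verb, noun2 = dsl_command.split()
    return f"{verb}({noun1}, {noun2})"

print(to_functional("System initiates Sequence"))  # initiates(System, Sequence)
```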
2. Python Implementation: Dictionary Dispatch
Without reaching for a heavy parser, the most robust way to implement this in Python is simple string splitting combined with a Dispatcher. This avoids the security risk of eval() and lets you map natural-language verbs to specific internal function names.
```python
class SystemController:
    def __init__(self):
        # The 'Knowledge Base' or state
        self.state = {"Valve A": "Closed", "Pump B": "Off"}

    # --- The Actual Functions ---
    def _open(self, subject, target):
        print(f"ACTION: {subject} is opening {target}")
        self.state[target] = "Open"

    def _verify(self, subject, target):
        status = self.state.get(target, "Unknown")
        print(f"QUERY: {subject} sees {target} is {status}")

    # --- The Parser ---
    def execute_dsl(self, command_string):
        # 1. Tokenize (simple space splitting)
        try:
            n1, verb, n2 = command_string.split()
        except ValueError:
            print("Error: Command must be 'Noun Verb Noun'")
            return

        # 2. Map verbs to functions (the Dispatcher)
        verb_map = {
            "opens": self._open,
            "checks": self._verify,
            "scans": self._verify,
        }

        if verb in verb_map:
            # 3. Execute: verb(noun1, noun2)
            func = verb_map[verb]
            func(n1, n2)
        else:
            print(f"Error: Unknown verb '{verb}'")

# Usage
bot = SystemController()
bot.execute_dsl("Pilot opens Valve_A")      # Output: ACTION: Pilot is opening Valve_A
bot.execute_dsl("Computer checks Valve_A")  # Output: QUERY: Computer sees Valve_A is Open
```
3. Mathematica / Wolfram Language Implementation
If you are doing symbolic modeling, Mathematica is uniquely suited for this because it allows you to define custom notations that rewrite input on the fly.
You can use Notation or simple pattern matching to define the "Noun Verb Noun" syntax as a valid input form.
```mathematica
(* Define the function logic *)
activate[source_, target_] := Print[source, " is powering up ", target];

(* Define the DSL transformation rule *)
(* Whenever Mathematica sees "A activates B", it treats it as activate[A, B] *)
MakeExpression[RowBox[{n1_, "activates", n2_}], StandardForm] :=
  MakeExpression[RowBox[{"activate", "[", n1, ",", n2, "]"}], StandardForm]

(* Usage in a notebook *)
(* User types: *)
Generator activates Circuit
(* Mathematica executes: activate[Generator, Circuit] *)
```
4. Handling Complexity (Multi-word Nouns)
The generic split() approach fails if your Nouns contain spaces (e.g., "Red Button"). To handle natural language properly, you need a strategy that identifies the Verb first, since it acts as the pivot point.
Algorithm:
1. Define a list of known valid verbs (e.g., `["starts", "stops", "reboots"]`).
2. Scan the sentence for the verb.
3. Everything to the left is `Noun1` (Subject).
4. Everything to the right is `Noun2` (Object).
Python Example using Regex:
```python
import re

def parse_complex_dsl(sentence):
    # Known verbs act as delimiters
    verbs = ["initiates test on", "connects to", "verifies"]
    # Build the regex pattern: (.*?)\s(verb1|verb2|verb3)\s(.*)
    pattern = f"(.*?)\\s({'|'.join(verbs)})\\s(.*)"
    match = re.search(pattern, sentence, re.IGNORECASE)
    if match:
        noun1 = match.group(1).strip()
        verb = match.group(2).strip()
        noun2 = match.group(3).strip()
        # Transform into a function-call string for display
        return f"{verb.replace(' ', '_')}( '{noun1}', '{noun2}' )"
    return "No valid verb found."

# Usage
print(parse_complex_dsl("Main Engine 3 initiates test on Hydraulic Pump"))
# Result: initiates_test_on( 'Main Engine 3', 'Hydraulic Pump' )
```
5. Application: Knowledge Graphs (Triples)
This pattern is the exact syntax used for RDF Triples or Graph Databases (like Neo4j).
DSL: `User follows Topic`
Graph: `(User)-[:FOLLOWS]->(Topic)`
If you are building a system to organize data, this DSL allows you to read natural language requirements and immediately hydrate a graph database.
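As a sketch of that hydration step, a parsed triple can be turned into a Cypher `MERGE` statement (the `Entity` label and naive string interpolation are illustrative assumptions; production code should use parameterized queries):

```python
def triple_to_cypher(subject, verb, obj):
    """Turn a parsed (noun1, verb, noun2) triple into a Cypher MERGE statement.
    The node label 'Entity' and the string interpolation are illustrative only."""
    rel = verb.upper()
    return (
        f"MERGE (a:Entity {{name: '{subject}'}}) "
        f"MERGE (b:Entity {{name: '{obj}'}}) "
        f"MERGE (a)-[:{rel}]->(b)"
    )

print(triple_to_cypher("User", "follows", "Topic"))
# MERGE (a:Entity {name: 'User'}) MERGE (b:Entity {name: 'Topic'}) MERGE (a)-[:FOLLOWS]->(b)
```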
Would you like to see how to extend this to handle "chained" commands (e.g., "Noun1 Verb Noun2 AND Verb Noun3")?
How would other word types like adjectives, etc., be utilized?
When extending a DSL beyond simple triples, modifiers like adjectives and adverbs serve to constrain data or configure execution. In programming terms, these convert simple variable references into Object Instantiation or Function Configuration.
Here is how you map linguistic modifiers to software architecture.
1. Adjectives: Properties and Filters
Role: Adjectives modify Nouns.
Software Equivalent: Constructor Arguments or Database Filters.
When a user says "Open emergency valve," they are not referring to the generic class Valve, but a specific instance or a subset of valves with the property type="emergency".
Pattern: `Adjective Noun` → `Object(property)`
Implementation Strategy:
Instead of passing the bare string "valve", your parser creates an object or a dictionary.
```python
# DSL: "System opens emergency valve"

# 1. Standard Noun-Verb-Noun
#    verb  = open
#    noun1 = System
#    noun2 = "emergency valve"  <-- needs decomposition

# 2. Decomposition logic
def resolve_entity(phrase):
    tokens = phrase.split()
    # Assume the last word is the class; preceding words are properties
    entity_class = tokens[-1]
    attributes = tokens[:-1]
    return {"class": entity_class, "attributes": attributes}

# Resulting function call:
# open(
#     subject={"class": "System", "attributes": []},
#     object={"class": "valve", "attributes": ["emergency"]}
# )
```
In a Knowledge Graph context:
This maps to a query where the Adjective matches a property on the node.
```cypher
MATCH (n:Valve {type: 'emergency'})
```
2. Adverbs: Configuration and Modes
Role: Adverbs modify Verbs.
Software Equivalent: Keyword Arguments (kwargs) or Flags.
When a user says "System slowly opens valve," they are modifying the execution method of the function.
Pattern: `Adverb Verb` → `function(mode=Adverb)`
Implementation Strategy:
You maintain a list of known adverbs that map to configuration parameters.
```python
# DSL: "System slowly opens valve"
def execute(command):
    # Known adverbs
    speed_modifiers = ["slowly", "quickly", "instantly"]
    tokens = command.split()

    # Extract adverbs
    mode = "default"
    clean_tokens = []
    for t in tokens:
        if t in speed_modifiers:
            mode = t  # Capture the adverb
        else:
            clean_tokens.append(t)

    # Reassemble: what remains is a plain Noun-Verb-Noun triple
    subject, verb, obj = clean_tokens
    # Final call: open(subject, obj, speed=mode)
    print(f"{verb}({subject!r}, {obj!r}, speed={mode!r})")

execute("System slowly opens valve")
# opens('System', 'valve', speed='slowly')
```
3. Prepositional Phrases: Context & Values
Role: Phrases starting with "at," "to," "by," "with."
Software Equivalent: Explicit Value Assignment.
These are critical for setting numerical values or targets.
"Set volume to 50%"
"Run test for 10 seconds"
Implementation Strategy:
These usually appear at the end of the sentence. You parse the standard N-V-N, and treat the remainder as a payload.
```python
# DSL: "Engine increases thrust to 80%"
# Regex is your friend here.
# Pattern: (Noun) (Verb) (Noun) (Preposition) (Value)
import re

text = "Engine increases thrust to 80%"
match = re.search(r"(.*?) (.*?) (.*?) (to|at|by) (.*)", text)

if match:
    n1, verb, n2, prep, val = match.groups()
    print(n1, verb, n2, prep, val)
    # Function signature:
    # increase(target="thrust", value="80%", value_type="absolute")
    # Note: 'to' usually implies an absolute value, 'by' a relative increment.
```
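The preposition-to-semantics note can be captured in a small lookup table (a sketch; the `value_type` labels are illustrative):

```python
# Illustrative mapping from preposition to value semantics
PREP_SEMANTICS = {
    "to": "absolute",   # "set thrust to 80%" -> target value
    "at": "absolute",   # "hold speed at 50"  -> target value
    "by": "relative",   # "raise thrust by 5%" -> increment
}

def classify_value(prep, raw_value):
    """Attach value semantics based on the preposition that introduced it."""
    return {"value": raw_value, "value_type": PREP_SEMANTICS.get(prep, "unknown")}

print(classify_value("to", "80%"))  # {'value': '80%', 'value_type': 'absolute'}
print(classify_value("by", "5%"))   # {'value': '5%', 'value_type': 'relative'}
```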
4. Advanced: Functional Composition (Mathematica Style)
If you view this through a functional programming lens (like Haskell or Mathematica), adjectives can be functions that wrap the noun.
Instead of Valve(type="red"), you can think of it as Red(Valve).
Mathematica Example:
```mathematica
(* The Adjective is a function that returns a specific subset *)
Emergency[x_] := Select[x, #Type == "Emergency" &];

(* The Verb operates on the result of that function *)
Open[target_] := SetStatus[target, "Open"];

(* DSL input *)
Open @ Emergency @ Valves
```
This is elegant because it allows infinite chaining: Open @ Red @ Emergency @ Valves.
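The same composition style can be sketched in plain Python, where each adjective is a filter function and the chain is applied right to left (all names and the sample data here are illustrative):

```python
# Each adjective is a function: list[dict] -> list[dict]
def emergency(items):
    return [i for i in items if i.get("type") == "emergency"]

def red(items):
    return [i for i in items if i.get("color") == "red"]

def open_all(items):
    # The verb mutates the filtered subset
    for i in items:
        i["status"] = "open"
    return items

valves = [
    {"id": "V1", "type": "emergency", "color": "red", "status": "closed"},
    {"id": "V2", "type": "standard",  "color": "red", "status": "closed"},
]

# Equivalent of: Open @ Red @ Emergency @ Valves
result = open_all(red(emergency(valves)))
print([v["id"] for v in result])  # ['V1']
```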
Summary Table
| Word Type | Sentence Role | Software Role | Example |
|---|---|---|---|
| Noun | Subject/Object | Class Instance / ID | `Valve_A` |
| Adjective | Modifier | Property / Attribute | `Emergency` (type) |
| Verb | Predicate | Function / Method | `Opens` |
| Adverb | Modifier | Parameter / Flag | `Slowly` (speed) |
| Preposition | Relator | Argument Key | `To` (assign value) |
Would you like to explore how to implement a parser for this that uses a "Dictionary of Grammar" to automatically classify these word types?
Yes
This approach shifts the complexity from the code to the data. By defining a Lexicon (a Dictionary of Grammar), your parser becomes a generic engine that doesn't need to change even if you add new commands or hardware.
Here is the architectural pattern for a Lexicon-Driven Slot Filler.
1. The Architecture: Slot Filling
Instead of a complex recursive tree, we use a "Slot Filling" strategy. We assume a canonical command structure fits a template, and we categorize words to fill those slots.
Target Template: [Adverb/Mode] [Verb] [Adjective/Filter] [Noun]
2. The Lexicon (The Data)
This dictionary maps the "Surface Form" (what the user types) to the "Semantic Form" (what the code understands).
```python
class DSLParser:
    def __init__(self):
        # The "Brain": maps words to Types and Internal Values
        self.lexicon = {
            # --- VERBS (Methods) ---
            "opens":      {"type": "VERB", "action": "SET_STATE", "value": "OPEN"},
            "closes":     {"type": "VERB", "action": "SET_STATE", "value": "CLOSED"},
            "checks":     {"type": "VERB", "action": "QUERY",     "value": "STATUS"},
            "calibrates": {"type": "VERB", "action": "MAINTAIN",  "value": "CALIBRATE"},
            # --- NOUNS (Classes/Entities) ---
            "valve":  {"type": "NOUN", "target": "device_valve"},
            "pump":   {"type": "NOUN", "target": "device_pump"},
            "sensor": {"type": "NOUN", "target": "device_sensor"},
            # --- ADJECTIVES (Filters/Properties) ---
            "emergency": {"type": "ADJ", "key": "category", "value": "critical"},
            "auxiliary": {"type": "ADJ", "key": "category", "value": "backup"},
            "oxygen":    {"type": "ADJ", "key": "medium",   "value": "o2"},
            # --- ADVERBS (Parameters) ---
            "slowly":      {"type": "ADV", "key": "speed",    "value": 0.5},
            "immediately": {"type": "ADV", "key": "priority", "value": "high"},
        }
```
3. The Parser Logic (The Engine)
The parser iterates through the sentence, looks up the word in the lexicon, and assigns it to a "Command Object." This handles word order flexibility (e.g., "Slowly open valve" vs "Open valve slowly") reasonably well.
```python
    # (continuing the DSLParser class)
    def parse(self, sentence):
        tokens = sentence.lower().split()

        # The "Command Object" (the slots)
        cmd = {
            "action": None,        # from Verb
            "target_class": None,  # from Noun
            "filters": {},         # from Adjectives
            "params": {}           # from Adverbs
        }

        for token in tokens:
            definition = self.lexicon.get(token)
            if not definition:
                print(f"Warning: Word '{token}' not recognized. Ignoring.")
                continue

            # Dispatch based on Part of Speech (POS)
            token_type = definition["type"]

            if token_type == "VERB":
                cmd["action"] = definition["action"]
                # Some verbs imply a parameter (e.g., open -> value: OPEN)
                if "value" in definition:
                    cmd["params"]["target_state"] = definition["value"]

            elif token_type == "NOUN":
                cmd["target_class"] = definition["target"]

            elif token_type == "ADJ":
                # Adjectives become database filters
                cmd["filters"][definition["key"]] = definition["value"]

            elif token_type == "ADV":
                # Adverbs become execution parameters
                cmd["params"][definition["key"]] = definition["value"]

        return cmd
```
4. Execution Logic
Now you have a clean dictionary object that is easy to pass to a backend function or an SQL generator.
```python
    # (continuing the DSLParser class)
    def execute(self, cmd_object):
        # Example of converting the parsed object into a SQL-like query
        if not cmd_object["action"] or not cmd_object["target_class"]:
            return "Error: Incomplete command."

        # Construct a pseudo-query
        query = f"SELECT * FROM {cmd_object['target_class']} WHERE 1=1"
        # Apply adjective filters
        for k, v in cmd_object["filters"].items():
            query += f" AND {k}='{v}'"

        print("--- EXECUTING ---")
        print(f"Action: {cmd_object['action']}")
        print(f"Params: {cmd_object['params']}")
        print(f"Context: {query}")

# --- DEMO ---
bot = DSLParser()
user_input = "Slowly opens auxiliary oxygen valve"
parsed_data = bot.parse(user_input)
bot.execute(parsed_data)
```

Output:

```
--- EXECUTING ---
Action: SET_STATE
Params: {'speed': 0.5, 'target_state': 'OPEN'}
Context: SELECT * FROM device_valve WHERE 1=1 AND category='backup' AND medium='o2'
```
5. Why this works for your stack
- **SQLite Integration:** The `cmd["filters"]` dictionary maps directly to `WHERE` clauses (as shown in the execution example).
- **Knowledge Graphs:** The Noun (`device_valve`) matches a Node Label, and Adjectives match Node Properties. The parsed object is essentially a set of criteria to match a specific node in your graph.
- **Synonym Management:** You can easily map multiple keys to the same definition in the dictionary without changing code:

```python
"shut":  {"type": "VERB", "action": "SET_STATE", "value": "CLOSED"},
"close": {"type": "VERB", "action": "SET_STATE", "value": "CLOSED"},
```
6. Handling "Unknowns" (The 40-year veteran perspective)
In a real NASA context, you cannot just ignore unknown words. You need Error Recovery.
If `definition` is `None`:
1. **Fuzzy Match:** Use `difflib.get_close_matches` to see if it was a typo (e.g., "vlave" -> "valve").
2. **Pass-through:** If it looks like a distinct ID (e.g., "V-102"), assume it is a specific Noun Instance rather than a Class.
3. **Heuristic:** If it contains numbers or capitals, treat it as an ID.
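A minimal sketch of that recovery ladder using only the standard library (the known-word list and similarity cutoff are illustrative):

```python
import difflib

# Illustrative subset of the lexicon's surface forms
KNOWN_WORDS = ["valve", "pump", "sensor", "opens", "closes"]

def recover_token(token):
    """Attempt error recovery for an unknown token (heuristics are illustrative)."""
    # 1. Fuzzy match against known surface forms (catches typos)
    close = difflib.get_close_matches(token.lower(), KNOWN_WORDS, n=1, cutoff=0.7)
    if close:
        return {"recovered": close[0], "method": "fuzzy"}
    # 2. Heuristic: digits or capitals suggest a specific instance ID
    if any(c.isdigit() for c in token) or any(c.isupper() for c in token):
        return {"recovered": token, "method": "id_passthrough"}
    return {"recovered": None, "method": "unknown"}

print(recover_token("vlave"))  # {'recovered': 'valve', 'method': 'fuzzy'}
print(recover_token("V-102"))  # {'recovered': 'V-102', 'method': 'id_passthrough'}
```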
Would you like me to demonstrate how to convert this parsed "Command Object" into a Cypher query for your Knowledge Graph?
Describe how to handle "chained" commands (e.g., "Noun1 Verb Noun2 AND Verb Noun3")
Handling chained commands introduces the concept of Context Persistence (or Anaphora Resolution in linguistics). The parser needs a "Short-Term Memory" to recall the subject or verb from the previous clause when it is omitted in the current one.
Here is the architectural pattern to handle Noun1 Verb Noun2 AND Verb Noun3.
1. The Strategy: Split and Inherit
We treat the conjunction ("AND", "THEN") as a hard delimiter that breaks the stream into Segments. We then process these segments sequentially. If a segment is incomplete (missing a Subject), it inherits it from the previous valid command.
The Logic Flow:
1. **Normalize:** Replace all delimiters (`,`, `and`, `then`) with a unique separator `|`.
2. **Split:** Break the string into a list of segments.
3. **Iterate:** Parse each segment.
4. **Inherit:**
   - If Segment B has no Subject, use the Subject from Segment A.
   - If Segment B has no Subject AND no Verb, use both from Segment A (e.g., "Open Valve A and Valve B").
2. Python Implementation
We will wrap the DSLParser from the previous step in a ChainProcessor.
```python
class ChainProcessor:
    def __init__(self, parser):
        self.parser = parser
        # Context memory
        self.last_subject = None
        self.last_verb = None

    def process_chain(self, raw_text):
        # 1. Normalize delimiters to a single pipe character
        #    Handles " and ", " then ", or plain commas
        clean_text = raw_text.replace(" and ", "|").replace(" then ", "|").replace(", ", "|")
        segments = clean_text.split("|")

        compiled_commands = []
        for seg in segments:
            # Parse the individual segment using the DSLParser from the previous step.
            # NOTE: in this mock, the parser's single noun slot ('target_class')
            # stands in for the Subject.
            cmd_obj = self.parser.parse(seg.strip())

            # --- CONTEXT RECOVERY LOGIC ---
            # 1. Handle a missing Subject (Noun1)
            if not cmd_obj["target_class"]:
                if self.last_subject:
                    cmd_obj["target_class"] = self.last_subject
                    print(f" -> Context: Inheriting Subject '{self.last_subject}'")
                else:
                    print("Error: Ambiguous command. No subject defined.")
                    continue
            else:
                # Update context when a new subject is explicitly stated
                self.last_subject = cmd_obj["target_class"]

            # 2. Handle a missing Verb (for "Open Valve A and Valve B")
            if not cmd_obj["action"]:
                if self.last_verb:
                    cmd_obj["action"] = self.last_verb
                    print(f" -> Context: Inheriting Verb '{self.last_verb}'")
                else:
                    print("Error: Ambiguous command. No verb defined.")
                    continue
            else:
                self.last_verb = cmd_obj["action"]

            compiled_commands.append(cmd_obj)
        return compiled_commands

# --- MOCK USE CASE ---
# Assumes 'bot' is the lexicon-driven DSLParser instance from the previous step.
# Note: every word in the input must exist in the parser's lexicon for the demo
# to succeed; extend the lexicon with "pilot", "valve_a", etc. as needed.
processor = ChainProcessor(bot)

# Scenario: the user specifies the Subject once, then issues multiple commands
input_str = "Pilot opens Valve_A and closes Valve_B"
results = processor.process_chain(input_str)

# Output execution
for i, cmd in enumerate(results):
    print(f"Command {i+1}: {cmd['action']} on {cmd['target_class']}")
```
3. Handling Different "Ellipsis" Types
There are three common variations of chaining you will encounter.
A. Subject Ellipsis (The Prompt's Request)
- Input: "System opens valve and checks pump."
- Logic: Segment 2 is `checks pump`.
- Resolution: The parser detects `Verb + Noun` with the Subject missing, so it injects `System`.

B. Subject & Verb Ellipsis
- Input: "System opens Valve A and Valve B."
- Logic: Segment 2 is `Valve B`.
- Resolution: The parser detects only a `Noun`, so it injects `System` (Subject) and `opens` (Verb).

C. Subject Switching (Reset Context)
- Input: "Pilot opens valve and Alarm sounds."
- Logic: Segment 2 is `Alarm sounds`.
- Resolution: The parser detects a new `Noun` (`Alarm`) and a `Verb`, so it updates `last_subject` to `Alarm`.
4. Graph Database Implication (Transactional Scope)
When you move to Knowledge Graphs (Cypher/Neo4j), chained commands usually imply a Transaction. You want all actions in the chain to succeed, or none of them.
If you generate Cypher from this:
```cypher
// Command 1
MATCH (s:Person {id: 'Pilot'}), (v1:Valve {id: 'Valve_A'})
MERGE (s)-[:OPENS]->(v1)

// Command 2 (chained)
WITH s  // Carry over the subject context
MATCH (v2:Valve {id: 'Valve_B'})
MERGE (s)-[:CLOSES]->(v2)
```
5. Advanced: The "Result" Chain
Sometimes the input of the second command is the output of the first.
- Input: "Find leaking valves and close them."
- Pattern: `Find(Criteria)` → `List<Valve>` → `Close(List)`
This requires a slightly different approach where Noun2 in the second command is a reserved keyword (like "them", "it", "results").
```python
if cmd_obj["target_class"] in ["them", "it", "those"]:
    # The target is not a text string but the return value of the previous command
    cmd_obj["target_ptr"] = "PREVIOUS_RESULT"
```
Summary of Rules
| Pattern | Missing Element | Action |
|---|---|---|
| `Noun V N AND V N` | Subject | Inherit Subject |
| `Noun V N AND N` | Subject & Verb | Inherit Subject & Verb |
| `Noun V N AND N V N` | None | Update Subject & Verb (New Context) |
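The rules in the table can be exercised with a compact, self-contained sketch (the tiny verb lexicon and slot heuristics are illustrative):

```python
# Minimal illustrative lexicon of verbs
VERBS = {"opens", "closes", "checks", "sounds"}

def parse_segment(segment):
    """Fill subject/verb/object slots for a single clause."""
    slots = {"subject": None, "verb": None, "object": None}
    for t in segment.split():
        if t.lower() in VERBS:
            slots["verb"] = t.lower()
        elif slots["subject"] is None and slots["verb"] is None:
            slots["subject"] = t  # noun before the verb
        else:
            slots["object"] = t   # noun after the verb
    return slots

def resolve_chain(text):
    """Apply the inheritance rules from the summary table."""
    ctx_subject, ctx_verb = None, None
    commands = []
    for seg in text.replace(" and ", "|").split("|"):
        cmd = parse_segment(seg.strip())
        # A noun-only segment ("... and Valve_B"): the lone noun is the object
        if cmd["verb"] is None and cmd["object"] is None:
            cmd["subject"], cmd["object"] = None, cmd["subject"]
        # Missing subject -> inherit; explicit subject -> update context
        if cmd["subject"] is None:
            cmd["subject"] = ctx_subject
        else:
            ctx_subject = cmd["subject"]
        # Missing verb -> inherit; explicit verb -> update context
        if cmd["verb"] is None:
            cmd["verb"] = ctx_verb
        else:
            ctx_verb = cmd["verb"]
        commands.append(cmd)
    return commands

for cmd in resolve_chain("System opens Valve_A and Valve_B"):
    print(cmd)
# {'subject': 'System', 'verb': 'opens', 'object': 'Valve_A'}
# {'subject': 'System', 'verb': 'opens', 'object': 'Valve_B'}
```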
Would you like to see the Python code for the "Result Chain" (handling "them/it") logic, or move to how this integrates with your NASA standards database project?
From <https://gemini.google.com/app/> Google Gemini 3.0 Pro (Thinking)