Prompts Module

The prompts module provides prompt templates and utilities for querying LLMs about causal relationships between variables.

Overview

This module contains:

  • EdgeQueryPrompt: A dataclass for building prompts to query edge existence and orientation
  • parse_edge_response: A function to parse LLM JSON responses into EdgeKnowledge objects
  • Template constants: Pre-defined prompt templates for system and user messages

EdgeQueryPrompt

EdgeQueryPrompt dataclass

EdgeQueryPrompt(
    node_a: str,
    node_b: str,
    domain: Optional[str] = None,
    descriptions: Optional[dict[str, str]] = None,
    system_prompt: Optional[str] = None,
)

Builder for edge existence/orientation query prompts.

This class constructs system and user prompts for querying an LLM about causal relationships between variables.

Attributes:

  • node_a (str) –

    Name of the first variable.

  • node_b (str) –

    Name of the second variable.

  • domain (Optional[str]) –

    Optional domain context (e.g., "medicine", "economics").

  • descriptions (Optional[dict[str, str]]) –

    Optional dict mapping variable names to descriptions.

  • system_prompt (Optional[str]) –

    Custom system prompt (uses default if None).

Example

prompt = EdgeQueryPrompt("smoking", "cancer", domain="medicine")
system, user = prompt.build()

# Use with LLMClient
response = client.complete(system=system, user=user)

Methods:

  • build

    Build the system and user prompts.

  • from_context

    Create an EdgeQueryPrompt from a context dictionary.

build

build() -> tuple[str, str]

Build the system and user prompts.

Returns:

  • tuple[str, str]

    Tuple of (system_prompt, user_prompt).

Source code in src/causaliq_knowledge/llm/prompts.py
def build(self) -> tuple[str, str]:
    """Build the system and user prompts.

    Returns:
        Tuple of (system_prompt, user_prompt).
    """
    system = self.system_prompt or DEFAULT_SYSTEM_PROMPT

    # Build user prompt
    if self.domain:
        user = USER_PROMPT_WITH_DOMAIN_TEMPLATE.format(
            domain=self.domain,
            node_a=self.node_a,
            node_b=self.node_b,
        )
    else:
        user = USER_PROMPT_TEMPLATE.format(
            node_a=self.node_a,
            node_b=self.node_b,
        )

    # Add variable descriptions if provided
    if self.descriptions:
        desc_a = self.descriptions.get(self.node_a, "No description")
        desc_b = self.descriptions.get(self.node_b, "No description")
        user += VARIABLE_DESCRIPTIONS_TEMPLATE.format(
            node_a=self.node_a,
            desc_a=desc_a,
            node_b=self.node_b,
            desc_b=desc_b,
        )

    return system, user
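
Because the exact wording lives in the template constants, the simplest way to see what build() produces is to call it. A minimal sketch (the variable names here are illustrative):

from causaliq_knowledge.llm import EdgeQueryPrompt

prompt = EdgeQueryPrompt(
    node_a="rainfall",
    node_b="crop_yield",
    domain="agriculture",
    descriptions={
        "rainfall": "Annual rainfall in millimetres",
        "crop_yield": "Harvested tonnes per hectare",
    },
)
system, user = prompt.build()
print(system)  # DEFAULT_SYSTEM_PROMPT unless a custom prompt was supplied
print(user)    # domain template plus the appended variable descriptions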

from_context classmethod

from_context(
    node_a: str, node_b: str, context: Optional[dict] = None
) -> EdgeQueryPrompt

Create an EdgeQueryPrompt from a context dictionary.

This is a convenience method for creating prompts from the context dict used by KnowledgeProvider.query_edge().

Parameters:

  • node_a (str) –

    Name of the first variable.

  • node_b (str) –

    Name of the second variable.

  • context (Optional[dict], default: None) –

    Optional context dict with keys:

      - domain: str
      - descriptions: dict[str, str]
      - system_prompt: str

Returns:

  • EdgeQueryPrompt

    EdgeQueryPrompt instance.

Source code in src/causaliq_knowledge/llm/prompts.py
@classmethod
def from_context(
    cls,
    node_a: str,
    node_b: str,
    context: Optional[dict] = None,
) -> "EdgeQueryPrompt":
    """Create an EdgeQueryPrompt from a context dictionary.

    This is a convenience method for creating prompts from the
    context dict used by KnowledgeProvider.query_edge().

    Args:
        node_a: Name of the first variable.
        node_b: Name of the second variable.
        context: Optional context dict with keys:
            - domain: str
            - descriptions: dict[str, str]
            - system_prompt: str

    Returns:
        EdgeQueryPrompt instance.
    """
    if context is None:
        return cls(node_a=node_a, node_b=node_b)

    return cls(
        node_a=node_a,
        node_b=node_b,
        domain=context.get("domain"),
        descriptions=context.get("descriptions"),
        system_prompt=context.get("system_prompt"),
    )

Usage Example

from causaliq_knowledge.llm import EdgeQueryPrompt
from causaliq_knowledge.llm import GroqClient, GroqConfig

# Create a prompt for querying the relationship between two variables
prompt = EdgeQueryPrompt(
    node_a="smoking",
    node_b="lung_cancer",
    domain="medicine",
    descriptions={
        "smoking": "Tobacco consumption frequency",
        "lung_cancer": "Diagnosis of lung cancer",
    },
)

# Build the system and user prompts
system_prompt, user_prompt = prompt.build()

# Use with GroqClient
config = GroqConfig(model="llama-3.1-8b-instant")
client = GroqClient(config=config)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]
json_data, response = client.complete_json(messages)
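
The json_data dict returned here can be converted into an EdgeKnowledge object with parse_edge_response, described later on this page.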

Using from_context

The from_context class method provides a convenient way to create prompts from a context dictionary, which is the format used by KnowledgeProvider.query_edge():

context = {
    "domain": "economics",
    "descriptions": {
        "interest_rate": "Central bank interest rate",
        "inflation": "Consumer price index change",
    },
}

prompt = EdgeQueryPrompt.from_context(
    node_a="interest_rate",
    node_b="inflation",
    context=context,
)
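
As the source above shows, passing context=None (the default) simply falls back to the plain constructor:

# Equivalent to EdgeQueryPrompt("interest_rate", "inflation")
prompt = EdgeQueryPrompt.from_context("interest_rate", "inflation")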

parse_edge_response

parse_edge_response

parse_edge_response(
    json_data: Optional[dict], model: Optional[str] = None
) -> EdgeKnowledge

Parse a JSON response dict into an EdgeKnowledge object.

Parameters:

  • json_data (Optional[dict]) –

    Parsed JSON dict from LLM response, or None if parsing failed.

  • model (Optional[str], default: None) –

    Optional model identifier to include in the result.

Returns:

  • EdgeKnowledge

    EdgeKnowledge object. Returns an uncertain result if json_data is None or missing required fields.

Source code in src/causaliq_knowledge/llm/prompts.py
def parse_edge_response(
    json_data: Optional[dict],
    model: Optional[str] = None,
) -> EdgeKnowledge:
    """Parse a JSON response dict into an EdgeKnowledge object.

    Args:
        json_data: Parsed JSON dict from LLM response, or None if parsing
            failed.
        model: Optional model identifier to include in the result.

    Returns:
        EdgeKnowledge object. Returns uncertain result if json_data is None
        or missing required fields.
    """
    if json_data is None:
        return EdgeKnowledge.uncertain(
            reasoning="Failed to parse LLM response as JSON",
            model=model,
        )

    # Extract fields with defaults
    exists = json_data.get("exists")
    direction_str = json_data.get("direction")
    confidence = json_data.get("confidence", 0.0)
    reasoning = json_data.get("reasoning", "")

    # Validate confidence is a number
    try:
        confidence = float(confidence)
        confidence = max(0.0, min(1.0, confidence))
    except (TypeError, ValueError):
        confidence = 0.0

    # Convert direction string to enum
    direction = None
    if direction_str:
        try:
            direction = EdgeDirection(direction_str.lower())
        except ValueError:
            # Invalid direction, leave as None
            pass

    return EdgeKnowledge(
        exists=exists,
        direction=direction,
        confidence=confidence,
        reasoning=str(reasoning),
        model=model,
    )

Usage Example

from causaliq_knowledge.llm import (
    EdgeQueryPrompt,
    GroqClient,
    GroqConfig,
    parse_edge_response,
)

# Create client and prompt
config = GroqConfig(model="llama-3.1-8b-instant")
client = GroqClient(config=config)
prompt = EdgeQueryPrompt("X", "Y", domain="statistics")
system, user = prompt.build()

# Query the LLM
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": user},
]
json_data, response = client.complete_json(messages)

# Parse the response into EdgeKnowledge
knowledge = parse_edge_response(json_data, model="groq/llama-3.1-8b-instant")

print(f"Edge exists: {knowledge.exists}")
print(f"Direction: {knowledge.direction}")
print(f"Confidence: {knowledge.confidence}")
print(f"Reasoning: {knowledge.reasoning}")

Prompt Templates

The module exports several template constants that can be customized:

DEFAULT_SYSTEM_PROMPT

The default system prompt instructs the LLM to act as a causal reasoning expert and respond with structured JSON.
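
The response fields mirror what parse_edge_response reads. A hypothetical payload is shown below; the direction string must match an EdgeDirection value, and "a_to_b" is assumed here, not confirmed by this page:

json_data = {
    "exists": True,
    "direction": "a_to_b",  # assumed EdgeDirection value -- check the enum
    "confidence": 0.85,
    "reasoning": "Smoking is an established cause of lung cancer.",
}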

USER_PROMPT_TEMPLATE

Basic user prompt template for querying edge relationships without domain context.

USER_PROMPT_WITH_DOMAIN_TEMPLATE

User prompt template that includes domain context for more accurate responses.

VARIABLE_DESCRIPTIONS_TEMPLATE

Template addition for including variable descriptions in the prompt.
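
The build() source above shows the placeholders each template expects. Assuming the constants are importable from causaliq_knowledge.llm.prompts (the source module shown above), they can be formatted directly:

# Import path assumed from the source file location
from causaliq_knowledge.llm.prompts import (
    USER_PROMPT_TEMPLATE,
    USER_PROMPT_WITH_DOMAIN_TEMPLATE,
)

user = USER_PROMPT_TEMPLATE.format(node_a="smoking", node_b="cancer")
user_with_domain = USER_PROMPT_WITH_DOMAIN_TEMPLATE.format(
    domain="medicine", node_a="smoking", node_b="cancer"
)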

Custom System Prompts

You can provide a custom system prompt to EdgeQueryPrompt:

custom_system = """You are a biomedical expert.
Assess causal relationships based on established medical literature.
Respond with JSON: {"exists": bool, "direction": str, "confidence": float, "reasoning": str}
"""

prompt = EdgeQueryPrompt(
    node_a="gene_X",
    node_b="protein_Y",
    domain="molecular_biology",
    system_prompt=custom_system,
)
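
A custom system prompt should still instruct the model to return the exists, direction, confidence, and reasoning fields, as the example above does, because parse_edge_response expects that schema.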