LLM Knowledge Provider¶
The LLMKnowledge class is the main entry point for querying LLMs about
causal relationships. It implements the KnowledgeProvider interface and
supports multi-model consensus using vendor-specific API clients.
Architecture¶
LLMKnowledge uses direct vendor-specific API clients rather than wrapper
libraries like LiteLLM or LangChain. Currently supported providers:
- Groq: Fast inference for open-source models (free tier)
- Gemini: Google's Gemini models (generous free tier)
- OpenAI: GPT-4o and other OpenAI models
- Anthropic: Claude models
- DeepSeek: DeepSeek-V3 and DeepSeek-R1 models
- Mistral: Mistral AI models
- Ollama: Local LLMs (Llama, Mistral, Phi, etc.)
Usage¶
from causaliq_knowledge.llm import LLMKnowledge
# Single model (default: Groq)
provider = LLMKnowledge()
# Query about a potential edge
result = provider.query_edge("smoking", "lung_cancer")
print(f"Exists: {result.exists}")
print(f"Direction: {result.direction}")
print(f"Confidence: {result.confidence}")
# Multi-model consensus
provider = LLMKnowledge(
    models=["groq/llama-3.1-8b-instant", "gemini/gemini-2.5-flash"],
    consensus_strategy="weighted_vote",
)
result = provider.query_edge(
    "exercise",
    "heart_health",
    context={"domain": "medicine"},
)
Model Identifiers¶
Models are specified with a provider prefix:
| Provider | Format | Example |
|---|---|---|
| Groq | groq/&lt;model&gt; | groq/llama-3.1-8b-instant |
| Gemini | gemini/&lt;model&gt; | gemini/gemini-2.5-flash |
| OpenAI | openai/&lt;model&gt; | openai/gpt-4o-mini |
| Anthropic | anthropic/&lt;model&gt; | anthropic/claude-sonnet-4-20250514 |
| DeepSeek | deepseek/&lt;model&gt; | deepseek/deepseek-chat |
| Mistral | mistral/&lt;model&gt; | mistral/mistral-small-latest |
| Ollama | ollama/&lt;model&gt; | ollama/llama3 |
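These identifiers are passed directly in the models list, and hosted and local models can be mixed freely. A minimal sketch (the specific models chosen here are illustrative only):
from causaliq_knowledge.llm import LLMKnowledge
# Combine a hosted model with a local Ollama model (illustrative choices)
provider = LLMKnowledge(
    models=[
        "openai/gpt-4o-mini",
        "ollama/llama3",
    ],
)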
LLMKnowledge¶
LLMKnowledge(
    models: Optional[list[str]] = None,
    consensus_strategy: str = "weighted_vote",
    temperature: float = 0.1,
    max_tokens: int = 500,
    timeout: float = 30.0,
    max_retries: int = 3,
)
LLM-based knowledge provider using direct API clients.
This provider queries one or more LLMs about causal relationships and combines their responses using a configurable consensus strategy. Uses direct API clients for reliability and control.
Attributes:
- models (list[str]) – List of model identifiers (e.g., "groq/llama-3.1-8b-instant").
- consensus_strategy (str) – Strategy for combining multi-model responses.
- clients (dict) – Dict mapping model names to direct client instances.
Example
provider = LLMKnowledge(models=["groq/llama-3.1-8b-instant"])
result = provider.query_edge("smoking", "lung_cancer")
print(f"Exists: {result.exists}, Confidence: {result.confidence}")
Multi-model consensus¶
provider = LLMKnowledge(
    models=[
        "groq/llama-3.1-8b-instant",
        "gemini/gemini-2.5-flash",
    ],
    consensus_strategy="weighted_vote",
)
Parameters:
- models (Optional[list[str]], default: None) – List of model identifiers. Supported formats: "groq/llama-3.1-8b-instant" (Groq API), "gemini/gemini-2.5-flash" (Google Gemini API), "ollama/llama3.2:1b" (local Ollama server). Defaults to ["groq/llama-3.1-8b-instant"].
- consensus_strategy (str, default: "weighted_vote") – How to combine multi-model responses. Options: "weighted_vote", "highest_confidence".
- temperature (float, default: 0.1) – LLM temperature (lower = more deterministic).
- max_tokens (int, default: 500) – Maximum tokens in the LLM response.
- timeout (float, default: 30.0) – Request timeout in seconds.
- max_retries (int, default: 3) – Number of retries on failure (unused for direct APIs).
Raises:
- ValueError – If consensus_strategy is not recognized or a model identifier is unsupported.
Methods:
- query_edge – Query LLMs about a potential causal edge.
- get_stats – Get combined statistics from all clients.
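The generation-related constructor parameters above can be tuned per provider instance. A minimal configuration sketch (the values shown are illustrative, not recommendations):
# Configure generation behaviour explicitly (illustrative values)
provider = LLMKnowledge(
    models=["groq/llama-3.1-8b-instant"],
    temperature=0.0,  # fully deterministic responses
    max_tokens=300,   # cap the length of each LLM response
    timeout=60.0,     # allow slower models more time per request
)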
query_edge¶
query_edge(node_a: str, node_b: str, context: Optional[dict] = None) -> EdgeKnowledge
Query LLMs about a potential causal edge.
Parameters:
- node_a (str) – Name of the first variable.
- node_b (str) – Name of the second variable.
- context (Optional[dict], default: None) – Optional context dict with keys: domain (str) – domain context (e.g., "medicine"); descriptions (dict[str, str]) – variable descriptions; system_prompt (str) – custom system prompt.
Returns:
- EdgeKnowledge – EdgeKnowledge with the combined result from all models.
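All three documented context keys can be supplied together. A minimal sketch, in which the variable names, descriptions, and prompt text are purely illustrative:
# Sketch: query_edge with full context (illustrative variables and prompt)
result = provider.query_edge(
    "blood_pressure",
    "stroke",
    context={
        "domain": "medicine",
        "descriptions": {
            "blood_pressure": "Systolic blood pressure (mmHg)",
            "stroke": "Occurrence of ischaemic stroke",
        },
        "system_prompt": "You are an expert epidemiologist.",
    },
)
print(result.exists, result.direction, result.confidence)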
get_stats¶
Get combined statistics from all clients.
Returns:
- Dict[str, Any] – Dict with total_calls, total_cost, and per-model stats.
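For example, the two documented top-level keys can be read directly after a few queries (a sketch; the layout of the per-model entries is not shown here):
# Sketch: inspect aggregate usage (only the documented top-level keys are used)
stats = provider.get_stats()
print(f"Calls made: {stats['total_calls']}")
print(f"Estimated cost: ${stats['total_cost']:.4f}")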
Consensus Strategies¶
When using multiple models, responses are combined using a consensus strategy.
weighted_vote¶
The default strategy. Combines responses by:
- Existence: Weighted vote by confidence (True, False, or None)
- Direction: Weighted majority among agreeing models
- Confidence: Average confidence of agreeing models
- Reasoning: Combined from all models
weighted_vote(responses: list[EdgeKnowledge]) -> EdgeKnowledge
Combine multiple responses using weighted voting.
Strategy:
- Existence: weighted vote by confidence
- Direction: weighted majority among models agreeing on existence
- Final confidence: average confidence of agreeing models
- Reasoning: combined from all models
Parameters:
- responses (list[EdgeKnowledge]) – List of EdgeKnowledge from different models.
Returns:
- EdgeKnowledge – Combined EdgeKnowledge result.
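To make the weighting concrete, here is a standalone illustrative sketch of the voting arithmetic described above. It is not the library implementation, and the response values are invented:
# Illustrative sketch of weighted voting (not the library implementation)
responses = [
    {"exists": True, "confidence": 0.8},   # model A
    {"exists": True, "confidence": 0.6},   # model B
    {"exists": False, "confidence": 0.9},  # model C
]
# Existence: sum the confidence weights behind each vote
weights = {}
for r in responses:
    weights[r["exists"]] = weights.get(r["exists"], 0.0) + r["confidence"]
winner = max(weights, key=weights.get)  # True (1.4) beats False (0.9)
# Final confidence: average confidence of the models agreeing with the winner
agreeing = [r for r in responses if r["exists"] == winner]
confidence = sum(r["confidence"] for r in agreeing) / len(agreeing)  # 0.7
print(winner, round(confidence, 2))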
highest_confidence¶
Simply returns the response with the highest confidence score.
highest_confidence(responses: list[EdgeKnowledge]) -> EdgeKnowledge
Return the response with highest confidence.
Parameters:
- responses (list[EdgeKnowledge]) – List of EdgeKnowledge from different models.
Returns:
- EdgeKnowledge – EdgeKnowledge with the highest confidence score.
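Conceptually this is a single selection over the responses. A short sketch, reusing the invented responses list from the weighted_vote sketch above:
# Sketch of the selection logic (not the library code)
best = max(responses, key=lambda r: r["confidence"])
print(best)  # the single most confident response wins outright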
Example: Multi-Model Comparison¶
from causaliq_knowledge.llm import LLMKnowledge
# Query multiple models (Groq + Gemini)
provider = LLMKnowledge(
    models=["groq/llama-3.1-8b-instant", "gemini/gemini-2.5-flash"],
    consensus_strategy="weighted_vote",
    temperature=0.1,  # Low temperature for consistency
)
# Query with domain context
result = provider.query_edge(
    node_a="interest_rate",
    node_b="inflation",
    context={
        "domain": "macroeconomics",
        "descriptions": {
            "interest_rate": "Central bank policy rate",
            "inflation": "Year-over-year CPI change",
        },
    },
)
print(f"Combined result: {result.exists} ({result.direction})")
print(f"Confidence: {result.confidence:.2f}")
print(f"Reasoning: {result.reasoning}")
# Check usage stats
stats = provider.get_stats()
print(f"Total cost: ${stats['total_cost']:.4f}")