
Mistral Client

Direct API client for Mistral AI models.

Overview

Mistral AI is a French AI company known for high-quality open-weight and proprietary models. Their API is OpenAI-compatible, making integration straightforward.

Key features:

  • Mistral Small: Fast, cost-effective for simple tasks
  • Mistral Large: Most capable, best for complex reasoning
  • Codestral: Optimized for code generation
  • Strong EU-based option for data sovereignty
  • OpenAI-compatible API

Configuration

The client requires a MISTRAL_API_KEY environment variable:

# Linux/macOS
export MISTRAL_API_KEY="your-api-key"

# Windows PowerShell
$env:MISTRAL_API_KEY="your-api-key"

# Windows cmd
set MISTRAL_API_KEY=your-api-key

Get your API key from: https://console.mistral.ai
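
The key can also be supplied programmatically; per the MistralConfig reference below, an explicit api_key takes priority and the environment variable is only a fallback. A minimal sketch:

from causaliq_knowledge.llm import MistralClient, MistralConfig

# Explicit key; MISTRAL_API_KEY is only used as a fallback when api_key is None
config = MistralConfig(api_key="your-api-key")
client = MistralClient(config)

# Confirm a key was picked up before making requests
print(client.is_available())  # True if an API key is configured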

Usage

Basic Usage

from causaliq_knowledge.llm import MistralClient, MistralConfig

# Default config (uses MISTRAL_API_KEY env var)
client = MistralClient()

# Or with custom config
config = MistralConfig(
    model="mistral-small-latest",
    temperature=0.1,
    max_tokens=500,
    timeout=30.0,
)
client = MistralClient(config)

# Make a completion request
messages = [{"role": "user", "content": "What is 2 + 2?"}]
response = client.completion(messages)
print(response.content)
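
Per-call overrides are also supported: as noted in the completion() reference below, keyword arguments such as temperature and max_tokens override the config for a single request. Continuing with the client created above:

# Override config options for this request only
response = client.completion(
    [{"role": "user", "content": "Briefly explain causal discovery."}],
    temperature=0.0,
    max_tokens=200,
)
print(response.content)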

Using with CLI

# Query with Mistral
cqknow query smoking lung_cancer --model mistral/mistral-small-latest

# Use large model for complex queries
cqknow query income education --model mistral/mistral-large-latest --domain economics

# List available Mistral models
cqknow models mistral

Using with LLMKnowledge Provider

from causaliq_knowledge.llm import LLMKnowledge

# Single model
provider = LLMKnowledge(models=["mistral/mistral-small-latest"])
result = provider.query_edge("smoking", "lung_cancer")

# Multi-model consensus
provider = LLMKnowledge(
    models=[
        "mistral/mistral-large-latest",
        "groq/llama-3.1-8b-instant",
    ],
    consensus_strategy="weighted_vote",
)

Available Models

| Model | Description | Best For |
|-------|-------------|----------|
| mistral-small-latest | Fast, cost-effective | Simple tasks |
| mistral-medium-latest | Balanced performance | General use |
| mistral-large-latest | Most capable | Complex reasoning |
| codestral-latest | Code-optimized | Programming tasks |
| open-mistral-nemo | 12B open model | Budget-friendly |
| open-mixtral-8x7b | MoE open model | Balanced open model |
| ministral-3b-latest | Ultra-small | Edge deployment |
| ministral-8b-latest | Small | Resource-constrained |
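
The models accessible to a given API key can also be listed programmatically with list_models() (documented below); the exact identifiers returned depend on the account. A minimal sketch:

from causaliq_knowledge.llm import MistralClient

client = MistralClient()
if client.is_available():            # True when an API key is configured
    for model_id in client.list_models():
        print(model_id)              # e.g. mistral-small-latest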

Pricing

Mistral AI offers competitive pricing (as of Jan 2025):

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|-------|-----------------------|------------------------|
| mistral-small | $0.20 | $0.60 |
| mistral-medium | $2.70 | $8.10 |
| mistral-large | $2.00 | $6.00 |
| codestral | $0.20 | $0.60 |
| open-mistral-nemo | $0.15 | $0.15 |
| ministral-3b | $0.04 | $0.04 |
| ministral-8b | $0.10 | $0.10 |

See Mistral pricing for details.
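
As a worked example, the cost of a single request is (input_tokens / 1M) * input_rate + (output_tokens / 1M) * output_rate; a small sketch using the mistral-small rates from the table above:

# Approximate cost of one mistral-small request (rates from the table above)
input_tokens, output_tokens = 1_000, 200
cost = (input_tokens / 1_000_000) * 0.20 + (output_tokens / 1_000_000) * 0.60
print(f"${cost:.6f}")  # $0.000320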

API Reference

MistralConfig dataclass

MistralConfig(
    model: str = "mistral-small-latest",
    temperature: float = 0.1,
    max_tokens: int = 500,
    timeout: float = 30.0,
    api_key: Optional[str] = None,
)

Configuration for Mistral AI API client.

Extends OpenAICompatConfig with Mistral-specific defaults.

Attributes:

  • model (str) –

    Mistral model identifier (default: mistral-small-latest).

  • temperature (float) –

    Sampling temperature (default: 0.1).

  • max_tokens (int) –

    Maximum response tokens (default: 500).

  • timeout (float) –

    Request timeout in seconds (default: 30.0).

  • api_key (Optional[str]) –

    Mistral API key (falls back to MISTRAL_API_KEY env var).

Methods:

  • __post_init__

    Set API key from environment if not provided.

__post_init__

__post_init__() -> None

Set API key from environment if not provided.

MistralClient

MistralClient(config: Optional[MistralConfig] = None)

Direct Mistral AI API client.

Mistral AI is a French company providing high-quality LLMs with an OpenAI-compatible API.

Available models
  • mistral-small-latest: Fast, cost-effective
  • mistral-medium-latest: Balanced performance
  • mistral-large-latest: Most capable
  • codestral-latest: Optimized for code
Example

config = MistralConfig(model="mistral-small-latest")
client = MistralClient(config)
msgs = [{"role": "user", "content": "Hello"}]
response = client.completion(msgs)
print(response.content)

Parameters:

  • config

    (Optional[MistralConfig], default: None ) –

    Mistral configuration. If None, uses defaults with API key from MISTRAL_API_KEY environment variable.

Methods:

  • cached_completion

    Make a completion request with caching.

  • complete_json

    Make a completion request and parse response as JSON.

  • completion

    Make a chat completion request.

  • is_available

    Check if the API is available.

  • list_models

    List available models from the API.

  • set_cache

    Configure caching for this client.

Attributes:

  • cache (Optional['TokenCache']) –

    Return the configured cache, if any.

  • call_count (int) –

    Return the number of API calls made.

  • model_name (str) –

    Return the model name being used.

  • provider_name (str) –

    Return the provider name.

  • use_cache (bool) –

    Return whether caching is enabled.

cache property

cache: Optional['TokenCache']

Return the configured cache, if any.

call_count property

call_count: int

Return the number of API calls made.

model_name property

model_name: str

Return the model name being used.

Returns:

  • str

    Model identifier string.

provider_name property

provider_name: str

Return the provider name.

use_cache property

use_cache: bool

Return whether caching is enabled.
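
These are read-only accessors; a brief sketch of inspecting a client after a request (the exact provider_name string shown in the comment is an assumption, not taken from this page):

from causaliq_knowledge.llm import MistralClient

client = MistralClient()
client.completion([{"role": "user", "content": "Hello"}])

print(client.provider_name)  # provider identifier (e.g. "mistral"; assumed value)
print(client.model_name)     # e.g. "mistral-small-latest"
print(client.call_count)     # 1 after the single call above
print(client.use_cache)      # whether caching is enabled (no cache configured here)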

cached_completion

cached_completion(messages: List[Dict[str, str]], **kwargs: Any) -> LLMResponse

Make a completion request with caching.

If caching is enabled and a cached response exists, returns the cached response without making an API call. Otherwise, makes the API call and caches the result.

Parameters:

  • messages

    (List[Dict[str, str]]) –

    List of message dicts with "role" and "content" keys.

  • **kwargs

    (Any, default: {} ) –

    Provider-specific options (temperature, max_tokens, etc.)

Returns:

  • LLMResponse

    LLMResponse with the generated content and metadata.

complete_json

complete_json(
    messages: List[Dict[str, str]], **kwargs: Any
) -> tuple[Optional[Dict[str, Any]], LLMResponse]

Make a completion request and parse response as JSON.

Parameters:

  • messages

    (List[Dict[str, str]]) –

    List of message dicts with "role" and "content" keys.

  • **kwargs

    (Any, default: {} ) –

    Override config options passed to completion().

Returns:

  • tuple[Optional[Dict[str, Any]], LLMResponse]

    Tuple of (parsed JSON dict or None, raw LLMResponse).
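
A minimal sketch of requesting structured output with complete_json (the prompt and the "answer" key are illustrative, not part of the library):

from causaliq_knowledge.llm import MistralClient

client = MistralClient()
messages = [{
    "role": "user",
    "content": 'Reply with JSON only: {"answer": <number>} for 2 + 2.',
}]
parsed, raw = client.complete_json(messages)
if parsed is not None:           # None if the response is not valid JSON
    print(parsed.get("answer"))
else:
    print(raw.content)           # fall back to the raw text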

completion

completion(messages: List[Dict[str, str]], **kwargs: Any) -> LLMResponse

Make a chat completion request.

Parameters:

  • messages

    (List[Dict[str, str]]) –

    List of message dicts with "role" and "content" keys.

  • **kwargs

    (Any, default: {} ) –

    Override config options (temperature, max_tokens).

Returns:

  • LLMResponse

    LLMResponse with the generated content and metadata.

Raises:

  • ValueError

    If the API request fails.

is_available

is_available() -> bool

Check if the API is available.

Returns:

  • bool

    True if API key is configured.

list_models

list_models() -> List[str]

List available models from the API.

Queries the API to get models accessible with the current API key, then filters using _filter_models().

Returns:

  • List[str]

    List of model identifiers.

Raises:

  • ValueError

    If the API request fails.

set_cache

set_cache(cache: Optional['TokenCache'], use_cache: bool = True) -> None

Configure caching for this client.

Parameters:

  • cache

    (Optional['TokenCache']) –

    TokenCache instance for caching, or None to disable.

  • use_cache

    (bool, default: True ) –

    Whether to use the cache (default True).
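
A brief sketch of the caching hooks; constructing a TokenCache is not covered on this page, so the example takes an already-built instance as an argument rather than guessing its constructor:

from causaliq_knowledge.llm import MistralClient

def run_with_cache(cache):  # `cache` is an already-constructed TokenCache
    client = MistralClient()
    client.set_cache(cache, use_cache=True)

    messages = [{"role": "user", "content": "What is 2 + 2?"}]
    first = client.cached_completion(messages)   # API call; result is cached
    second = client.cached_completion(messages)  # served from the cache
    print(client.call_count)                     # counts actual API calls only
    return first, second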