mnesis.tokens.estimator

estimator

Multi-model token estimation with caching and graceful fallback.

TokenEstimator

TokenEstimator(*, heuristic_only: bool = False)

Multi-model token counting with caching and graceful fallback.

Priority order:

1. tiktoken for OpenAI model families (gpt-4, gpt-3.5, o1, o3)
2. Character-based heuristic (`len // 3`) for Claude models
3. Character-based heuristic (`len // 4`) for all other models

Caching:

- Encoder objects are cached by encoding name (one load per process).
- Token counts are cached by the SHA-256 of the content for immutable content (file references, summary nodes). Use `estimate_cached()` for this.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `heuristic_only` | `bool` | When `True`, always use the character-based heuristic and skip tiktoken entirely. Useful for testing, benchmarking, or environments where tiktoken is not installed. | `False` |

estimate

estimate(text: str, model: ModelInfo | None = None) -> int

Estimate the token count for a string.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `text` | `str` | The text to estimate. | *required* |
| `model` | `ModelInfo \| None` | Optional model info for accurate tokenisation. Falls back to the heuristic when `None` or when the model's encoding is unknown. | `None` |

Returns:

| Type | Description |
|------|-------------|
| `int` | Estimated token count; always >= 1 for non-empty text. |

estimate_cached

estimate_cached(
    text: str,
    cache_key: str,
    model: ModelInfo | None = None,
) -> int

Estimate with caching, keyed by cache_key.

Use for immutable content (file references, summary nodes) where the same text will be estimated multiple times across context builds.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `text` | `str` | The text to estimate. | *required* |
| `cache_key` | `str` | A stable identifier for this content (e.g. a SHA-256 hash). | *required* |
| `model` | `ModelInfo \| None` | Optional model info for accurate tokenisation. | `None` |

Returns:

| Type | Description |
|------|-------------|
| `int` | Estimated token count, from the cache or a fresh computation. |
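The memoisation pattern described above can be sketched as follows. The `CachedEstimator` class and its `len // 4` stand-in estimator are hypothetical; only the cache-by-key behaviour mirrors the documented semantics.

```python
class CachedEstimator:
    """Sketch of estimate_cached(): memoise counts by a caller-supplied key."""

    def __init__(self) -> None:
        self._count_cache: dict[str, int] = {}

    def estimate(self, text: str) -> int:
        # Stand-in for the real estimator: bare len // 4 heuristic.
        return max(1, len(text) // 4) if text else 0

    def estimate_cached(self, text: str, cache_key: str) -> int:
        # Compute once per key; later calls with the same key skip estimation.
        if cache_key not in self._count_cache:
            self._count_cache[cache_key] = self.estimate(text)
        return self._count_cache[cache_key]
```

Because the cache is keyed by `cache_key` rather than the text itself, it is only safe for immutable content: if the text behind a key changes, the stale count is returned.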

estimate_message

estimate_message(
    msg: MessageWithParts, model: ModelInfo | None = None
) -> int

Estimate total tokens for a message including all non-pruned parts.

Pruned tool outputs (compacted_at set) contribute only the tombstone string length rather than the full output length.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `msg` | `MessageWithParts` | The message with its associated parts. | *required* |
| `model` | `ModelInfo \| None` | Optional model info for accurate tokenisation. | `None` |

Returns:

| Type | Description |
|------|-------------|
| `int` | Total estimated token count for the message. |

content_hash staticmethod

content_hash(text: str) -> str

Return a stable SHA-256 hex digest for use as a cache key.
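A minimal equivalent of this helper, using only the standard library (the encoding choice of UTF-8 is an assumption; the source only specifies SHA-256 hex digest):

```python
import hashlib

def content_hash(text: str) -> str:
    # SHA-256 hex digest of the UTF-8 bytes: stable across processes,
    # suitable as the cache_key argument to estimate_cached().
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```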