mnesis.compaction

Mnesis compaction components.

CompactionEngine

CompactionEngine(
    store: ImmutableStore,
    dag_store: SummaryDAGStore,
    token_estimator: TokenEstimator,
    event_bus: EventBus,
    config: MnesisConfig,
    id_generator: Any = None,
    session_model: str | None = None,
)

Orchestrates the full compaction protocol (summarise → condense → loop).

Guarantees:

  • run_compaction() never raises: errors are caught and Level 3 runs.
  • The resulting summary always fits within the token budget.
  • Level 3 (deterministic) is the final fallback and always succeeds.
  • Atomic SQLite commit per round: partial failures leave no inconsistent state.
  • The EventBus receives COMPACTION_COMPLETED (or COMPACTION_FAILED).

Threshold semantics:

  • Soft threshold (soft_threshold_fraction, default 60 %): triggers early background compaction via check_and_trigger(). This keeps the context lean well before the hard limit.
  • Hard threshold (100 % of usable): checked inside session.send() before the LLM call; if exceeded, the caller should await wait_for_pending() to block until compaction completes.
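The threshold arithmetic can be sketched concretely. Only the 60 % soft default comes from the description above; the derivation of the usable budget from a reserved-output deduction, and all numbers, are assumptions for illustration:

```python
# Illustrative numbers only: how the soft and hard thresholds relate.
context_limit = 200_000        # model's context window
reserved_output = 20_000       # ASSUMED reservation for the model's reply
usable = context_limit - reserved_output          # 180,000

soft_threshold_fraction = 0.60                    # documented default
soft_limit = soft_threshold_fraction * usable     # 108,000
hard_limit = usable                               # 180,000

def is_soft_overflow(tokens: int) -> bool:
    """Crossed the early, non-blocking compaction trigger."""
    return tokens > soft_limit

def is_hard_overflow(tokens: int) -> bool:
    """Context must be compacted before the next LLM call."""
    return tokens > hard_limit
```

At 120,000 cumulative tokens the session is past the soft limit (background compaction starts) but still under the hard limit (the next call can proceed).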

Example::

    engine = CompactionEngine(store, dag_store, estimator, event_bus, config)
    if engine.is_soft_overflow(tokens, model_info):
        engine.check_and_trigger(session_id, tokens, model_info)  # non-blocking
    if engine.is_hard_overflow(tokens, model_info):
        await engine.wait_for_pending()  # block until compaction completes before the LLM call

is_soft_overflow

is_soft_overflow(
    tokens: TokenUsage, model: ModelInfo
) -> bool

Return True if the soft threshold has been crossed.

The soft threshold triggers early background compaction (non-blocking).

Parameters:

  • tokens (TokenUsage, required): Current cumulative token usage.
  • model (ModelInfo, required): Model metadata providing context limit.

Returns:

  • bool: True if tokens exceed soft_threshold_fraction * usable.

is_hard_overflow

is_hard_overflow(
    tokens: TokenUsage, model: ModelInfo
) -> bool

Return True if the hard threshold has been crossed.

The hard threshold means the current context must be compacted before the next LLM call to avoid an over-limit request.

Parameters:

  • tokens (TokenUsage, required): Current cumulative token usage.
  • model (ModelInfo, required): Model metadata providing context limit.

Returns:

  • bool: True if tokens exceed the full usable budget.

is_overflow

is_overflow(tokens: TokenUsage, model: ModelInfo) -> bool

Return True if compaction should be triggered (soft threshold check).

Preserved for backwards compatibility; delegates to is_soft_overflow().

Parameters:

  • tokens (TokenUsage, required): Current cumulative token usage for the session.
  • model (ModelInfo, required): Model metadata providing context limit.

Returns:

  • bool: True if tokens exceed the soft threshold.

check_and_trigger

check_and_trigger(
    session_id: str,
    tokens: TokenUsage,
    model: ModelInfo,
    abort: Event | None = None,
) -> bool

Check for overflow and trigger async background compaction if needed.

Non-blocking — schedules a background task and returns immediately. The task handle is stored in self._pending_task so callers can await or cancel it during shutdown.

Parameters:

  • session_id (str, required): The session to compact.
  • tokens (TokenUsage, required): Current token usage.
  • model (ModelInfo, required): Model info for overflow detection.
  • abort (Event | None, default None): Optional event to signal early termination.

Returns:

  • bool: True if compaction was triggered, False otherwise.
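The non-blocking trigger plus wait_for_pending() pattern can be sketched as follows. Only the _pending_task attribute name comes from the description above; the class, the boolean overflow argument, and all other details are illustrative stand-ins, not the library's API:

```python
import asyncio

class EngineSketch:
    """Illustrative only: schedule compaction in the background, await later."""

    def __init__(self) -> None:
        self._pending_task: asyncio.Task | None = None

    async def _run_compaction(self, session_id: str) -> None:
        await asyncio.sleep(0)  # stand-in for the real compaction rounds

    def check_and_trigger(self, session_id: str, overflowed: bool) -> bool:
        # Non-blocking: store the task handle and return immediately.
        if not overflowed or self._pending_task is not None:
            return False
        self._pending_task = asyncio.ensure_future(self._run_compaction(session_id))
        return True

    async def wait_for_pending(self) -> None:
        # Block until the in-flight compaction finishes, then clear the handle.
        if self._pending_task is not None:
            await self._pending_task
            self._pending_task = None
```

In the real engine the overflow decision comes from is_soft_overflow(tokens, model); the boolean here stands in for that check. Keeping a single task handle also means a second trigger while one is pending is a no-op.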

wait_for_pending async

wait_for_pending() -> None

Await any in-flight background compaction task until it completes, then clear the stored handle.

run_compaction async

run_compaction(
    session_id: str,
    abort: Event | None = None,
    model_override: str | None = None,
) -> CompactionResult

Run the full compaction protocol. Never raises.

Steps (per round, up to max_compaction_rounds):

  1. Run the tool output pruner (reduce input size first).
  2. Summarise raw messages: level 1 → level 2 → level 3.
  3. Condense accumulated summary nodes if still over budget: level 1 → 2 → 3.
  4. If no progress was made, break early to avoid spinning.
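The round structure can be sketched abstractly. The callables stand in for the escalating summarise/condense levels; every name below is illustrative, and the real engine threads stores, budgets, and an event bus through each step:

```python
from typing import Callable, Optional

def run_rounds(
    levels: list[Callable[[int], Optional[int]]],
    tokens: int,
    budget: int,
    max_rounds: int = 3,
) -> int:
    """Each level takes the current token count and returns the reduced
    count, or None if that level failed and the next should be tried."""
    for _ in range(max_rounds):
        if tokens <= budget:
            break                    # already fits: done
        progress = False
        for level in levels:         # level 1 -> level 2 -> level 3
            result = level(tokens)
            if result is not None and result < tokens:
                tokens = result
                progress = True
                break                # next round restarts from level 1
        if not progress:
            break                    # no progress: avoid spinning
    return tokens
```

With levels [fail, halve] and a 300-token budget, 1000 tokens shrinks to 500 in round one and 250 in round two, then the loop exits because the budget is met.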

Parameters:

  • session_id (str, required): The session to compact.
  • abort (Event | None, default None): Optional asyncio.Event, checked before each level attempt.
  • model_override (str | None, default None): Override compaction model (for testing).

Returns:

  • CompactionResult: Describes what happened.

SummaryCandidate dataclass

SummaryCandidate(
    text: str,
    token_count: int,
    span_start_message_id: str,
    span_end_message_id: str,
    compaction_level: int,
    messages_covered: int,
)

A candidate compaction summary before it is committed to the store.
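For illustration, here is the dataclass mirrored with a constructed instance; the field names match the signature above, while every value is made up:

```python
from dataclasses import dataclass

@dataclass
class SummaryCandidate:
    text: str
    token_count: int
    span_start_message_id: str
    span_end_message_id: str
    compaction_level: int
    messages_covered: int

# A Level 1 candidate spanning twelve messages (all values illustrative).
candidate = SummaryCandidate(
    text="User debugged a failing build; fix was a missing dependency.",
    token_count=14,
    span_start_message_id="msg-001",
    span_end_message_id="msg-012",
    compaction_level=1,
    messages_covered=12,
)
```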

ToolOutputPruner

ToolOutputPruner(
    store: ImmutableStore,
    estimator: TokenEstimator,
    config: MnesisConfig,
)

Bases: ToolOutputPruner

Async version that properly resolves part IDs.

prune async

prune(session_id: str) -> PruneResult

Run prune pass with async part ID resolution.

level1_summarise async

level1_summarise(
    messages: list[MessageWithParts],
    model: str,
    budget: ContextBudget,
    estimator: TokenEstimator,
    llm_call: Any,
    compaction_prompt: str | None = None,
    model_context_limit: int = 200000,
) -> SummaryCandidate | None

Attempt Level 1 (selective) summarisation via LLM.

File IDs found in the input messages are automatically appended to the summary via a [LCM File IDs: ...] footer.

Input messages are capped at 75 % of the compaction model's context window to prevent the summarisation call itself from overflowing.
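The 75 % input cap can be sketched like this. The trim-from-oldest strategy and the estimate callable are assumptions for illustration; the docs above only state that input is capped at 75 % of the compaction model's window:

```python
from typing import Callable

def cap_input(
    messages: list[str],
    estimate: Callable[[str], int],
    model_context_limit: int,
) -> list[str]:
    """Keep the newest messages whose estimated tokens fit under 75 %
    of the compaction model's context window. Dropping oldest-first is
    an assumed strategy, not necessarily what the library does."""
    cap = int(model_context_limit * 0.75)
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):   # newest first
        cost = estimate(msg)
        if total + cost > cap:
            break
        kept.append(msg)
        total += cost
    kept.reverse()                   # restore chronological order
    return kept
```

With a 100-token window the cap is 75 tokens, so three 50-token messages are trimmed down to only the newest one.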

Parameters:

  • messages (list[MessageWithParts], required): All non-summary messages in the session.
  • model (str, required): Model string to use for compaction.
  • budget (ContextBudget, required): Token budget for validation.
  • estimator (TokenEstimator, required): Token estimator for result validation.
  • llm_call (Any, required): Async callable (model, messages, max_tokens) -> str.
  • compaction_prompt (str | None, default None): Custom system prompt override.
  • model_context_limit (int, default 200000): Context window of the compaction model.

Returns:

  • SummaryCandidate | None: SummaryCandidate if successful and fits budget, or None to escalate.

level2_summarise async

level2_summarise(
    messages: list[MessageWithParts],
    model: str,
    budget: ContextBudget,
    estimator: TokenEstimator,
    llm_call: Any,
    compaction_prompt: str | None = None,
    model_context_limit: int = 200000,
) -> SummaryCandidate | None

Attempt Level 2 (aggressive) summarisation via LLM.

Uses a more compressed prompt format and drops reasoning details. File IDs found in the input messages are propagated to the summary. Input is capped at 75 % of the compaction model's context window.

Parameters:

  • messages (list[MessageWithParts], required): All non-summary messages in the session.
  • model (str, required): Model string to use for compaction.
  • budget (ContextBudget, required): Token budget for validation.
  • estimator (TokenEstimator, required): Token estimator for result validation.
  • llm_call (Any, required): Async callable (model, messages, max_tokens) -> str.
  • compaction_prompt (str | None, default None): Custom system prompt override.
  • model_context_limit (int, default 200000): Context window of the compaction model.

Returns:

  • SummaryCandidate | None: SummaryCandidate if successful and fits budget, or None to escalate.

level3_deterministic

level3_deterministic(
    messages: list[MessageWithParts],
    budget: ContextBudget,
    estimator: TokenEstimator,
) -> SummaryCandidate

Level 3 deterministic fallback (no LLM required).

Keeps the most recent messages that fit within 85% of the usable budget, prefixed with a truncation notice. This always produces a valid result.

File IDs found in all input messages (not just kept ones) are preserved in a [LCM File IDs: ...] footer even when their surrounding context is truncated — this is the lossless guarantee.
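A minimal sketch of this fallback, using character counts as the token estimate. The 85 % cut-off, keep-newest behaviour, and lossless file-ID footer come from the description above; the "file-" ID pattern and the notice wording are assumptions:

```python
import re
from typing import Callable

def level3_sketch(
    messages: list[str],
    usable: int,
    estimate: Callable[[str], int],
) -> str:
    """Keep the newest messages under 85 % of `usable`; preserve file IDs
    from ALL messages, even truncated ones (the lossless guarantee).
    The 'file-' ID regex and notice text are illustrative assumptions."""
    cap = int(usable * 0.85)
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):           # newest first
        cost = estimate(msg)
        if total + cost > cap:
            break
        kept.append(msg)
        total += cost
    kept.reverse()
    # Collect file IDs from every input message, not just the kept ones.
    file_ids = sorted({fid for msg in messages for fid in re.findall(r"file-\w+", msg)})
    parts = ["[Earlier conversation truncated]"] + kept
    if file_ids:
        parts.append(f"[LCM File IDs: {', '.join(file_ids)}]")
    return "\n".join(parts)
```

Note how a file ID mentioned only in a truncated message still appears in the footer: no LLM is involved, so this path always produces a valid result.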

Parameters:

  • messages (list[MessageWithParts], required): All non-summary messages in the session.
  • budget (ContextBudget, required): Token budget; result is guaranteed to fit within usable.
  • estimator (TokenEstimator, required): Token estimator.

Returns:

  • SummaryCandidate: Always fits within budget.usable.