mnesis.compaction
compaction
Mnesis compaction components.
CompactionEngine
CompactionEngine(
store: ImmutableStore,
dag_store: SummaryDAGStore,
token_estimator: TokenEstimator,
event_bus: EventBus,
config: MnesisConfig,
id_generator: Any = None,
session_model: str | None = None,
)
Orchestrates the full compaction protocol (summarise → condense → loop).
Guarantees:
- run_compaction() never raises — errors are caught and Level 3 runs.
- The resulting summary always fits within the token budget.
- Level 3 (deterministic) is the final fallback and always succeeds.
- Atomic SQLite commit per round: partial failures leave no inconsistent state.
- The EventBus receives COMPACTION_COMPLETED (or COMPACTION_FAILED).
Threshold semantics:
- Soft threshold (soft_threshold_fraction, default 60 %): triggers early background compaction via check_and_trigger(). This keeps the context lean well before the hard limit.
- Hard threshold (100 % of usable): checked inside session.send() before the LLM call; if exceeded, the caller should await wait_for_pending() to block until compaction completes.
Example::
    engine = CompactionEngine(store, dag_store, estimator, event_bus, config)
    if engine.is_soft_overflow(tokens, model_info):
        engine.check_and_trigger(session_id, tokens, model_info)  # non-blocking
    if engine.is_hard_overflow(tokens, model_info):
        await engine.wait_for_pending()  # block until compaction finishes before the LLM call
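The two thresholds can be pictured as simple arithmetic on the usable token budget. The sketch below is an illustration only: `soft_limit` and `classify` are hypothetical helpers, not part of Mnesis, and it assumes the soft threshold equals `soft_threshold_fraction` multiplied by the usable budget, which the description above suggests but does not spell out.

```python
def soft_limit(usable_tokens: int, soft_threshold_fraction: float = 0.6) -> int:
    """Token count above which background compaction would start (assumed formula)."""
    return int(usable_tokens * soft_threshold_fraction)


def classify(total_tokens: int, usable_tokens: int, soft_threshold_fraction: float = 0.6) -> str:
    """Return 'ok', 'soft', or 'hard' for a given usage (hypothetical helper)."""
    if total_tokens > usable_tokens:
        return "hard"  # must compact before the next LLM call
    if total_tokens > soft_limit(usable_tokens, soft_threshold_fraction):
        return "soft"  # schedule background compaction
    return "ok"


low = classify(50_000, 100_000)    # well below 60 % of usable
mid = classify(70_000, 100_000)    # between the soft and hard limits
high = classify(120_000, 100_000)  # over the usable budget
```

With a usable budget of 100,000 tokens, the three calls fall into the "ok", "soft", and "hard" ranges respectively.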
is_soft_overflow
Return True if the soft threshold has been crossed.
The soft threshold triggers early background compaction (non-blocking).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the soft threshold. |
is_hard_overflow
Return True if the hard threshold has been crossed.
The hard threshold means the current context must be compacted before the next LLM call to avoid an over-limit request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the full usable budget. |
is_overflow
Return True if compaction should be triggered (soft threshold check).
Preserved for backwards compatibility; delegates to is_soft_overflow().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage for the session. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the soft threshold. |
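The backwards-compatible alias amounts to plain delegation. A minimal sketch, using a hypothetical `ThresholdChecks` stand-in with a simplified integer-based signature rather than the real `TokenUsage` and `ModelInfo` arguments:

```python
class ThresholdChecks:
    """Hypothetical stand-in for the engine's overflow checks."""

    def __init__(self, soft_limit: int) -> None:
        self.soft_limit = soft_limit

    def is_soft_overflow(self, total_tokens: int) -> bool:
        return total_tokens > self.soft_limit

    def is_overflow(self, total_tokens: int) -> bool:
        # Legacy name kept for older callers; same answer as is_soft_overflow.
        return self.is_soft_overflow(total_tokens)


checks = ThresholdChecks(soft_limit=60_000)
legacy_answer = checks.is_overflow(70_000)
current_answer = checks.is_soft_overflow(70_000)
```

Both names return the same result, so existing call sites keep working while new code can use the more explicit name.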
check_and_trigger
check_and_trigger(
session_id: str,
tokens: TokenUsage,
model: ModelInfo,
abort: Event | None = None,
) -> bool
Check for overflow and trigger async background compaction if needed.
Non-blocking — schedules a background task and returns immediately.
The task handle is stored in self._pending_task so callers can
await or cancel it during shutdown.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session_id | str | The session to compact. | required |
| tokens | TokenUsage | Current token usage. | required |
| model | ModelInfo | Model info for overflow detection. | required |
| abort | Event \| None | Optional event to signal early termination. | None |
Returns:
| Type | Description |
|---|---|
| bool | True if compaction was triggered, False otherwise. |
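The schedule-then-await pattern described for check_and_trigger() and wait_for_pending() can be sketched with plain asyncio. `BackgroundCompactor` is a hypothetical stand-in, not the real engine: it takes an integer token count instead of `TokenUsage`/`ModelInfo`, and the rule that a second trigger is skipped while a task is still running is an assumption.

```python
import asyncio


class BackgroundCompactor:
    """Hypothetical stand-in showing the schedule-then-await pattern."""

    def __init__(self, soft_limit: int) -> None:
        self.soft_limit = soft_limit
        self._pending_task = None  # handle of the in-flight asyncio.Task, if any
        self.completed = []

    async def _compact(self, session_id: str) -> None:
        await asyncio.sleep(0)  # placeholder for the real compaction work
        self.completed.append(session_id)

    def check_and_trigger(self, session_id: str, total_tokens: int) -> bool:
        if total_tokens <= self.soft_limit:
            return False
        if self._pending_task is not None and not self._pending_task.done():
            return False  # assumption: do not stack a second task on a running one
        # Requires a running event loop; returns immediately after scheduling.
        loop = asyncio.get_running_loop()
        self._pending_task = loop.create_task(self._compact(session_id))
        return True

    async def wait_for_pending(self) -> None:
        task, self._pending_task = self._pending_task, None
        if task is not None:
            await task


async def demo():
    compactor = BackgroundCompactor(soft_limit=60_000)
    skipped = compactor.check_and_trigger("s1", 10_000)
    triggered = compactor.check_and_trigger("s1", 70_000)
    await compactor.wait_for_pending()
    return skipped, triggered, compactor.completed


result = asyncio.run(demo())
```

Keeping the task handle on the instance is what lets a caller block on it before an LLM call or cancel it during shutdown.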
wait_for_pending
async
Await any in-flight background compaction task to natural completion, then clear it.
run_compaction
async
run_compaction(
session_id: str,
abort: Event | None = None,
model_override: str | None = None,
) -> CompactionResult
Run the full compaction protocol. Never raises.
Steps (per round, up to max_compaction_rounds):
1. Run tool output pruner (reduce input size first).
2. Summarise raw messages: level 1 → level 2 → level 3.
3. Condense accumulated summary nodes if still over budget: level 1 → level 2 → level 3.
4. If no progress was made, break early to avoid spinning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session_id | str | The session to compact. | required |
| abort | Event \| None | Optional asyncio.Event; checked before each level attempt. | None |
| model_override | str \| None | Override compaction model (for testing). | None |
Returns:
| Type | Description |
|---|---|
| CompactionResult | CompactionResult describing what happened. |
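The escalation and early-exit logic in the steps above can be sketched as follows. This is a simplified illustration, not the engine's code: it omits the tool output pruner and the condensation step, measures size in characters rather than tokens, and `escalate` and `run_rounds` are hypothetical names.

```python
from typing import Callable, Optional

Summariser = Callable[[list], Optional[str]]


def escalate(messages: list, levels: list, fallback: Callable[[list], str]):
    """Try each level in order; use the deterministic fallback on None or error."""
    for number, level in enumerate(levels, start=1):
        try:
            summary = level(messages)
        except Exception:
            summary = None  # a failing level is treated like "did not fit"
        if summary is not None:
            return summary, number
    return fallback(messages), len(levels) + 1


def run_rounds(messages, levels, fallback, budget_chars, max_rounds=3):
    """Repeat summarisation until the text fits or no progress is made."""
    current = list(messages)
    level_used = 0
    for _ in range(max_rounds):
        size_before = sum(len(m) for m in current)
        if size_before <= budget_chars:
            break
        summary, level_used = escalate(current, levels, fallback)
        if len(summary) >= size_before:
            break  # no progress: stop instead of spinning
        current = [summary]
    return current, level_used


def failing_llm(msgs):
    raise RuntimeError("LLM unavailable")


def oversized_llm(msgs):
    return None  # pretend the result did not fit the budget


def keep_tail(msgs):
    return msgs[-1][:20]  # deterministic truncation, always succeeds


final, level = run_rounds(["a" * 100, "b" * 100], [failing_llm, oversized_llm], keep_tail, budget_chars=50)
```

Here both LLM-backed levels fail, so the deterministic fallback produces the result and the loop stops on the next round because the text already fits.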
SummaryCandidate
dataclass
SummaryCandidate(
text: str,
token_count: int,
span_start_message_id: str,
span_end_message_id: str,
compaction_level: int,
messages_covered: int,
)
A candidate compaction summary before it is committed to the store.
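As a rough illustration of how such a candidate might be built and checked before commit, the sketch below defines a local mirror of the dataclass with the fields listed above. The message ID format, the token count, and the `usable_budget` value are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class SummaryCandidate:
    """Local mirror of the documented fields, for illustration only."""

    text: str
    token_count: int
    span_start_message_id: str
    span_end_message_id: str
    compaction_level: int
    messages_covered: int


candidate = SummaryCandidate(
    text="User asked for a CSV export; assistant implemented it.",
    token_count=14,  # illustrative value, not a real estimate
    span_start_message_id="msg_001",  # hypothetical ID format
    span_end_message_id="msg_042",
    compaction_level=1,
    messages_covered=42,
)

usable_budget = 1_000  # hypothetical budget in tokens
fits = candidate.token_count <= usable_budget
```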
ToolOutputPruner
level1_summarise
async
level1_summarise(
messages: list[MessageWithParts],
model: str,
budget: ContextBudget,
estimator: TokenEstimator,
llm_call: Any,
compaction_prompt: str | None = None,
model_context_limit: int = 200000,
) -> SummaryCandidate | None
Attempt Level 1 (selective) summarisation via LLM.
File IDs found in the input messages are automatically appended to the
summary via a [LCM File IDs: ...] footer.
Input messages are capped at 75 % of the compaction model's context window to prevent the summarisation call itself from overflowing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[MessageWithParts] | All non-summary messages in the session. | required |
| model | str | Model string to use for compaction. | required |
| budget | ContextBudget | Token budget for validation. | required |
| estimator | TokenEstimator | Token estimator for result validation. | required |
| llm_call | Any | Async callable. | required |
| compaction_prompt | str \| None | Custom system prompt override. | None |
| model_context_limit | int | Context window of the compaction model. | 200000 |
Returns:
| Type | Description |
|---|---|
| SummaryCandidate \| None | SummaryCandidate if successful and fits budget, or None to escalate. |
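The 75 % input cap can be sketched as a simple trimming pass. The documentation does not say which messages are dropped when the cap is hit; the sketch assumes the oldest ones go first. `cap_input` and its character-based token heuristic are hypothetical and stand in for the real `TokenEstimator`.

```python
def cap_input(messages: list, model_context_limit: int = 200_000, fraction: float = 0.75) -> list:
    """Keep the newest messages whose combined estimate stays under the cap."""

    def estimate_tokens(text: str) -> int:
        return max(1, len(text) // 4)  # crude heuristic, not the library's estimator

    cap = int(model_context_limit * fraction)
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > cap:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["x" * 400, "y" * 400, "z" * 400]  # roughly 100 tokens each
capped = cap_input(history, model_context_limit=300)  # cap is 225 tokens
```

With a 300-token window the cap is 225 tokens, so only the two newest messages are passed to the summarisation call.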
level2_summarise
async
level2_summarise(
messages: list[MessageWithParts],
model: str,
budget: ContextBudget,
estimator: TokenEstimator,
llm_call: Any,
compaction_prompt: str | None = None,
model_context_limit: int = 200000,
) -> SummaryCandidate | None
Attempt Level 2 (aggressive) summarisation via LLM.
Uses a more compressed prompt format and drops reasoning details. File IDs found in the input messages are propagated to the summary. Input is capped at 75 % of the compaction model's context window.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[MessageWithParts] | All non-summary messages in the session. | required |
| model | str | Model string to use for compaction. | required |
| budget | ContextBudget | Token budget for validation. | required |
| estimator | TokenEstimator | Token estimator for result validation. | required |
| llm_call | Any | Async callable. | required |
| compaction_prompt | str \| None | Custom system prompt override. | None |
| model_context_limit | int | Context window of the compaction model. | 200000 |
Returns:
| Type | Description |
|---|---|
| SummaryCandidate \| None | SummaryCandidate if successful and fits budget, or None to escalate. |
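The "return a candidate or None to escalate" contract shared by both LLM-backed levels might look like the sketch below. The shape of `llm_call` is not documented here, so the single-argument async callable is an assumption, as are the helper name `summarise_or_none` and the word-count stand-in for token estimation.

```python
import asyncio
from typing import Awaitable, Callable, Optional

LlmCall = Callable[[str], Awaitable[str]]  # hypothetical shape for llm_call


async def summarise_or_none(prompt: str, llm_call: LlmCall, max_tokens: int) -> Optional[str]:
    """Return the summary if it fits max_tokens, otherwise None so the caller can escalate."""
    try:
        summary = await llm_call(prompt)
    except Exception:
        return None  # an LLM failure is handled like an oversized result
    estimated = len(summary.split())  # word count as a stand-in for a token estimate
    return summary if estimated <= max_tokens else None


async def verbose_llm(prompt: str) -> str:
    return "word " * 50  # pretend the model ignored the length target


async def terse_llm(prompt: str) -> str:
    return "export feature done; tests pass"


async def demo():
    level1 = await summarise_or_none("selective prompt", verbose_llm, max_tokens=10)
    level2 = await summarise_or_none("aggressive prompt", terse_llm, max_tokens=10)
    return level1, level2


level1_result, level2_result = asyncio.run(demo())
```

The first attempt overshoots the budget and yields None, which is the signal to try the more aggressive level; the second attempt fits and is returned.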
level3_deterministic
level3_deterministic(
messages: list[MessageWithParts],
budget: ContextBudget,
estimator: TokenEstimator,
) -> SummaryCandidate
Level 3 deterministic fallback (no LLM required).
Keeps the most recent messages that fit within 85% of the usable budget, prefixed with a truncation notice. This always produces a valid result.
File IDs found in all input messages (not just kept ones) are preserved
in a [LCM File IDs: ...] footer even when their surrounding context is
truncated — this is the lossless guarantee.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[MessageWithParts] | All non-summary messages in the session. | required |
| budget | ContextBudget | Token budget; the result is guaranteed to fit within usable. | required |
| estimator | TokenEstimator | Token estimator. | required |
Returns:
| Type | Description |
|---|---|
| SummaryCandidate | SummaryCandidate that always fits within budget.usable. |
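A rough sketch of the deterministic fallback follows, working on plain strings instead of `MessageWithParts`. The file ID pattern, the wording of the truncation notice, the comma separator in the footer, and the character-based token heuristic are all assumptions; only the 85 % figure and the `[LCM File IDs: ...]` footer come from the description above.

```python
import re

FILE_ID_PATTERN = re.compile(r"file_[A-Za-z0-9]+")  # hypothetical ID format


def deterministic_summary(messages: list, usable_tokens: int, fraction: float = 0.85) -> str:
    """Keep the newest messages that fit in fraction * usable_tokens, plus a file-ID footer."""

    def estimate_tokens(text: str) -> int:
        return max(1, len(text) // 4)  # crude heuristic, not the library's estimator

    # Collect IDs from every message, including ones that will be dropped.
    file_ids = sorted({fid for m in messages for fid in FILE_ID_PATTERN.findall(m)})

    limit = int(usable_tokens * fraction)
    kept = []
    used = 0
    for message in reversed(messages):  # newest first
        cost = estimate_tokens(message)
        if used + cost > limit:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order

    dropped = len(messages) - len(kept)
    parts = [f"[{dropped} earlier message(s) truncated]"] + kept
    if file_ids:
        parts.append("[LCM File IDs: " + ", ".join(file_ids) + "]")
    return "\n".join(parts)


history = ["uploaded file_abc123 " + "x" * 400, "short question", "short answer"]
summary = deterministic_summary(history, usable_tokens=20)
```

The oldest message is dropped, yet its file ID still appears in the footer. The 15 % left unused presumably gives headroom for the notice and footer; this sketch does not itself enforce that the final text fits.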