mnesis.compaction.engine
Compaction orchestration engine with condensation and multi-round loop.
This engine implements the full Volt-compatible compaction flow:
1. Tool output pruning: backward-scan and tombstone oversized tool outputs.
2. Summarisation: level 1 → level 2 → level 3 escalation for raw messages.
3. Condensation: if accumulated summary nodes still exceed the hard threshold after summarisation, condense them (level 1 → 2 → 3).
4. Multi-round loop: repeat steps 2-3 up to max_compaction_rounds times until either the context fits or no progress is made.
Soft/hard threshold distinction:
- Soft (soft_threshold_fraction * usable, default 60 %): triggers early background compaction so the next turn is likely already compact.
- Hard (100 % of usable): blocks the next send until compaction finishes, preventing an over-limit context from reaching the LLM.
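The two thresholds can be illustrated with a minimal sketch. The `Budget` class and the plain integer token counts are hypothetical stand-ins for this example only; the real methods take `TokenUsage` and `ModelInfo` objects, and how the usable budget is derived from the model's context limit is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Hypothetical stand-in for the usable token budget of a model."""
    usable: int                           # tokens available for context (assumed precomputed)
    soft_threshold_fraction: float = 0.6  # default value taken from the text above


def is_soft_overflow(total_tokens: int, budget: Budget) -> bool:
    # Soft: exceeding a fraction of the usable budget triggers background compaction.
    return total_tokens > budget.soft_threshold_fraction * budget.usable


def is_hard_overflow(total_tokens: int, budget: Budget) -> bool:
    # Hard: exceeding 100 % of the usable budget must block the next send.
    return total_tokens > budget.usable


budget = Budget(usable=100_000)
print(is_soft_overflow(70_000, budget), is_hard_overflow(70_000, budget))  # prints: True False
```

At 70 % of the usable budget only the soft check fires, so compaction could start in the background while the next send is still allowed.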
File IDs are propagated through every compaction round; see mnesis.compaction.file_ids and mnesis.compaction.levels.
CompactionEngine
CompactionEngine(
    store: ImmutableStore,
    dag_store: SummaryDAGStore,
    token_estimator: TokenEstimator,
    event_bus: EventBus,
    config: MnesisConfig,
    id_generator: Any = None,
    session_model: str | None = None,
)
Orchestrates the full compaction protocol (summarise → condense → loop).
Guarantees:
- run_compaction() never raises — errors are caught and Level 3 runs.
- The resulting summary always fits within the token budget.
- Level 3 (deterministic) is the final fallback and always succeeds.
- Atomic SQLite commit per round: partial failures leave no inconsistent state.
- The EventBus receives COMPACTION_COMPLETED (or COMPACTION_FAILED).
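The "never raises" and "Level 3 always succeeds" guarantees suggest an escalation pattern along the following lines. This is only a sketch under stated assumptions: `level1`, `level2`, `level3` and `summarise_never_raises` are hypothetical names, character counts stand in for token counts, and the engine's actual error handling and event emission are not reproduced.

```python
import asyncio


async def level1(text: str) -> str:
    # Hypothetical LLM-backed summariser; simulates a failure in this demo.
    raise RuntimeError("model unavailable")


async def level2(text: str) -> str:
    # Hypothetical, more aggressive LLM summariser; also fails in this demo.
    raise TimeoutError("model timed out")


def level3(text: str, max_chars: int) -> str:
    # Deterministic fallback: plain truncation with no model call.
    return text[:max_chars]


async def summarise_never_raises(text: str, max_chars: int) -> tuple[int, str]:
    """Try level 1, then level 2; fall back to level 3 on error or overflow."""
    for level, attempt in ((1, level1), (2, level2)):
        try:
            summary = await attempt(text)
        except Exception:
            continue  # swallow the error and escalate to the next level
        if len(summary) <= max_chars:
            return level, summary
    return 3, level3(text, max_chars)


level, summary = asyncio.run(summarise_never_raises("x" * 500, max_chars=100))
print(level, len(summary))  # prints: 3 100
```

Because the last level involves no external call and truncates to the budget, the result fits by construction, which is one way the "always fits" guarantee could be met.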
Threshold semantics:
- Soft threshold (soft_threshold_fraction, default 60 %): triggers early background compaction via check_and_trigger(). This keeps the context lean well before the hard limit.
- Hard threshold (100 % of usable): checked inside session.send() before the LLM call; if exceeded, the caller should await wait_for_pending() to block until compaction completes.
Example::

    engine = CompactionEngine(store, dag_store, estimator, event_bus, config)
    if engine.is_soft_overflow(tokens, model_info):
        engine.check_and_trigger(session_id, tokens, model_info)  # non-blocking
    if engine.is_hard_overflow(tokens, model_info):
        await engine.wait_for_pending()  # block until compaction finishes before the LLM call
is_soft_overflow
Return True if the soft threshold has been crossed.
The soft threshold triggers early background compaction (non-blocking).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the soft threshold (soft_threshold_fraction * usable). |
is_hard_overflow
Return True if the hard threshold has been crossed.
The hard threshold means the current context must be compacted before the next LLM call to avoid an over-limit request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the full usable budget. |
is_overflow
Return True if compaction should be triggered (soft threshold check).
Preserved for backwards compatibility; delegates to is_soft_overflow().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | TokenUsage | Current cumulative token usage for the session. | required |
| model | ModelInfo | Model metadata providing context limit. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if tokens exceed the soft threshold. |
check_and_trigger
check_and_trigger(
    session_id: str,
    tokens: TokenUsage,
    model: ModelInfo,
    abort: Event | None = None,
) -> bool
Check for overflow and trigger async background compaction if needed.
Non-blocking — schedules a background task and returns immediately.
The task handle is stored in self._pending_task so callers can
await or cancel it during shutdown.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session_id | str | The session to compact. | required |
| tokens | TokenUsage | Current token usage. | required |
| model | ModelInfo | Model info for overflow detection. | required |
| abort | Event \| None | Optional event to signal early termination. | None |
Returns:
| Type | Description |
|---|---|
| bool | True if compaction was triggered, False otherwise. |
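The non-blocking behaviour described for check_and_trigger, together with wait_for_pending, can be sketched with a plain asyncio task. `BackgroundCompactor` and its `_compact` coroutine are hypothetical stand-ins, the overflow check is reduced to a boolean argument, and the rule of keeping at most one task in flight is an assumption of this sketch rather than documented behaviour; only the `_pending_task` attribute name comes from the text above.

```python
import asyncio


class BackgroundCompactor:
    """Hypothetical stand-in showing only the scheduling pattern."""

    def __init__(self) -> None:
        self._pending_task: asyncio.Task | None = None
        self.completed: list[str] = []

    async def _compact(self, session_id: str) -> None:
        await asyncio.sleep(0.01)  # placeholder for the real compaction work
        self.completed.append(session_id)

    def check_and_trigger(self, session_id: str, over_soft_threshold: bool) -> bool:
        if not over_soft_threshold:
            return False
        if self._pending_task is not None and not self._pending_task.done():
            return False  # assumption: at most one compaction in flight
        # create_task requires a running event loop; it returns immediately.
        self._pending_task = asyncio.create_task(self._compact(session_id))
        return True

    async def wait_for_pending(self) -> None:
        # Take the handle and clear the slot, then await the task if there is one.
        task, self._pending_task = self._pending_task, None
        if task is not None:
            await task


async def main() -> tuple[bool, bool, list[str]]:
    compactor = BackgroundCompactor()
    first = compactor.check_and_trigger("s1", over_soft_threshold=True)
    second = compactor.check_and_trigger("s1", over_soft_threshold=True)
    await compactor.wait_for_pending()
    return first, second, compactor.completed


result = asyncio.run(main())
print(result)  # prints: (True, False, ['s1'])
```

Storing the task handle is what lets a caller either await it before an LLM call or cancel it during shutdown.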
wait_for_pending (async)
Await any in-flight background compaction task to natural completion, then clear it.
run_compaction (async)
run_compaction(
    session_id: str,
    abort: Event | None = None,
    model_override: str | None = None,
) -> CompactionResult
Run the full compaction protocol. Never raises.
Steps (per round, up to max_compaction_rounds):
1. Run tool output pruner (reduce input size first).
2. Summarise raw messages: level 1 → level 2 → level 3.
3. Condense accumulated summary nodes if still over budget: level 1 → 2 → 3.
4. If no progress was made, break early to avoid spinning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session_id | str | The session to compact. | required |
| abort | Event \| None | Optional asyncio.Event, checked before each level attempt. | None |
| model_override | str \| None | Override compaction model (for testing). | None |
Returns:
| Type | Description |
|---|---|
| CompactionResult | CompactionResult describing what happened. |
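The per-round steps listed for run_compaction can be sketched as a bounded loop with a progress check. Everything here is illustrative: `Context`, `summarise`, `condense` and `compact` are hypothetical names, the shrink ratios are arbitrary, tool output pruning (step 1) is omitted for brevity, and the real method returns a CompactionResult rather than a round count.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical context: token counts for raw messages and summary nodes."""
    raw: list[int] = field(default_factory=list)
    summaries: list[int] = field(default_factory=list)

    def total(self) -> int:
        return sum(self.raw) + sum(self.summaries)


def summarise(ctx: Context) -> None:
    # Placeholder for step 2: fold all raw messages into one summary node
    # at roughly a quarter of their combined size.
    if ctx.raw:
        ctx.summaries.append(max(1, sum(ctx.raw) // 4))
        ctx.raw.clear()


def condense(ctx: Context) -> None:
    # Placeholder for step 3: merge summary nodes into one at half their size.
    if len(ctx.summaries) > 1:
        ctx.summaries = [max(1, sum(ctx.summaries) // 2)]


def compact(ctx: Context, budget: int, max_compaction_rounds: int = 3) -> int:
    """Run up to max_compaction_rounds rounds; return how many actually ran."""
    rounds = 0
    for _ in range(max_compaction_rounds):
        if ctx.total() <= budget:
            break  # context already fits
        before = ctx.total()
        rounds += 1
        summarise(ctx)
        if ctx.total() > budget:
            condense(ctx)
        if ctx.total() >= before:
            break  # step 4: no progress, stop instead of spinning
    return rounds


ctx = Context(raw=[4000, 4000], summaries=[3000, 1000])
rounds = compact(ctx, budget=3500)
print(rounds, ctx.total())  # prints: 1 3000
```

The progress check matters when nothing is left to shrink: a context holding a single oversized summary node would otherwise consume every remaining round without changing size.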