mnesis.compaction.file_ids
file_ids
File ID extraction and propagation utilities.
Volt's "lossless" guarantee rests on preserving file_xxx identifiers
across every compaction round. Even when prose context is discarded, the
pointer to the external file content is never lost.
This module provides:
- :func:extract_file_ids, which pulls all file_<hex> references from text.
- :func:append_file_ids_footer, which attaches a [LCM File IDs: ...] footer to a summary string when file IDs are present.
- :func:collect_file_ids_from_nodes, which aggregates file IDs from a list of SummaryNode objects (for condensation input propagation).
extract_file_ids
Extract all file_<hex> identifiers from text.
Deduplicates and preserves first-occurrence order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Raw text that may contain LCM file ID references. | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Ordered, deduplicated list of file ID strings. |
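The extraction behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the regular expression assumes that a file ID is `file_` followed by one or more hexadecimal digits, and the name `extract_file_ids_sketch` is hypothetical.

```python
import re

# Assumption: a file ID is "file_" followed by one or more hex digits.
FILE_ID_PATTERN = re.compile(r"\bfile_[0-9a-fA-F]+\b")


def extract_file_ids_sketch(text: str) -> list[str]:
    """Return file IDs in first-occurrence order, without duplicates."""
    # dict.fromkeys preserves insertion order, so deduplication is stable.
    return list(dict.fromkeys(FILE_ID_PATTERN.findall(text)))


sample = "See file_a1b2c3 and file_00ff, then file_a1b2c3 again."
print(extract_file_ids_sketch(sample))  # ['file_a1b2c3', 'file_00ff']
```

Using `dict.fromkeys` rather than `set` is what keeps the first-occurrence order intact.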
extract_file_ids_from_messages
Extract all file IDs referenced across a list of messages.
Concatenates the text content of all messages, then extracts and deduplicates the file IDs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[MessageWithParts] | Messages to scan for file ID references. | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Ordered, deduplicated list of file ID strings. |
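One way to picture the concatenate-then-deduplicate step is the sketch below. The real `MessageWithParts` structure is not shown in this page, so `Message` and `TextPart` are hypothetical stand-ins, and the ID pattern is an assumed `file_<hex>` format.

```python
import re
from dataclasses import dataclass, field

# Assumption: a file ID is "file_" followed by one or more hex digits.
FILE_ID_PATTERN = re.compile(r"\bfile_[0-9a-fA-F]+\b")


@dataclass
class TextPart:
    """Hypothetical text part; the real part types may differ."""
    text: str


@dataclass
class Message:
    """Hypothetical stand-in for MessageWithParts."""
    parts: list[TextPart] = field(default_factory=list)


def ids_from_messages_sketch(messages: list[Message]) -> list[str]:
    # Join every text part so one pass yields global first-occurrence order.
    combined = "\n".join(part.text for msg in messages for part in msg.parts)
    return list(dict.fromkeys(FILE_ID_PATTERN.findall(combined)))


messages = [
    Message([TextPart("Uploaded file_1a2b"), TextPart("compare with file_3c4d")]),
    Message([TextPart("file_1a2b looks stale")]),
]
print(ids_from_messages_sketch(messages))  # ['file_1a2b', 'file_3c4d']
```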
collect_file_ids_from_nodes
Aggregate all file IDs already embedded in a list of summary nodes.
Each node's content may contain a [LCM File IDs: ...] footer; this
function extracts IDs from every node and returns a deduplicated union.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| nodes | list[SummaryNode] | Summary nodes whose content is scanned for file IDs. | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Ordered, deduplicated list of file ID strings. |
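The aggregation across nodes might look like the following sketch. `Node` is a hypothetical stand-in for `SummaryNode` (only a `content` string is assumed), and the comma-separated footer layout in the sample data is likewise an assumption.

```python
import re
from dataclasses import dataclass

# Assumption: a file ID is "file_" followed by one or more hex digits.
FILE_ID_PATTERN = re.compile(r"\bfile_[0-9a-fA-F]+\b")


@dataclass
class Node:
    """Hypothetical stand-in for SummaryNode; only `content` is assumed."""
    content: str


def collect_ids_sketch(nodes: list[Node]) -> list[str]:
    seen: dict[str, None] = {}
    for node in nodes:
        # Scan the whole content, which covers any footer it contains.
        for file_id in FILE_ID_PATTERN.findall(node.content):
            seen.setdefault(file_id, None)
    return list(seen)


nodes = [
    Node("Refactored parser.\n[LCM File IDs: file_aa01, file_bb02]"),
    Node("Added tests.\n[LCM File IDs: file_bb02, file_cc03]"),
]
print(collect_ids_sketch(nodes))  # ['file_aa01', 'file_bb02', 'file_cc03']
```

Because the union keeps IDs from every input node, a condensed summary built from these nodes can carry all of them forward.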
append_file_ids_footer
Append a [LCM File IDs: ...] footer to text when file_ids is non-empty.
The function is idempotent: if text already contains the footer, the block is not duplicated. The footer is always placed at the end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The summary text to annotate. | required |
| file_ids | list[str] | File IDs to include in the footer. | required |
Returns:
| Type | Description |
|---|---|
| str | Annotated text, or the original text unchanged if file_ids is empty. |
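The idempotent footer behaviour can be illustrated with the sketch below. It is one possible approach, not the module's confirmed implementation: the comma separator, the blank line before the footer, and the choice to replace an existing trailing footer are all assumptions, and `append_footer_sketch` is a hypothetical name.

```python
import re

# Assumption: the footer is a single bracketed block at the very end of the text.
FOOTER_RE = re.compile(r"\s*\[LCM File IDs:[^\]]*\]\s*$")


def append_footer_sketch(text: str, file_ids: list[str]) -> str:
    if not file_ids:
        return text  # nothing to annotate, return the input unchanged
    # Remove any trailing footer first so repeated calls never stack blocks.
    body = FOOTER_RE.sub("", text, count=1).rstrip()
    footer = "[LCM File IDs: " + ", ".join(file_ids) + "]"
    return f"{body}\n\n{footer}"


ids = ["file_aa01", "file_bb02"]
once = append_footer_sketch("Summary of the refactor.", ids)
twice = append_footer_sketch(once, ids)
print(once)
print(once == twice)  # True
```

Stripping and re-adding the footer is a simple way to get idempotence; an implementation could equally detect the existing block and return early.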