mnesis.compaction.file_ids

file_ids

File ID extraction and propagation utilities.

Volt's "lossless" guarantee rests on preserving file_xxx identifiers across every compaction round. Even when prose context is discarded, the pointer to the external file content is never lost.

This module provides:

- extract_file_ids: pull all file_<hex> references from text.
- append_file_ids_footer: attach a [LCM File IDs: ...] footer to a summary string when file IDs are present.
- collect_file_ids_from_nodes: aggregate file IDs from a list of SummaryNode objects (for condensation input propagation).

extract_file_ids

extract_file_ids(text: str) -> list[str]

Extract all file_<hex> identifiers from text.

Deduplicates and preserves first-occurrence order.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | Raw text that may contain LCM file ID references. | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | Ordered, deduplicated list of file ID strings (e.g. ["file_a1b2c3d4", "file_deadbeef12345678"]). |
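The extraction behaviour described above (match, deduplicate, keep first-occurrence order) can be illustrated with a minimal sketch. This is not the module's actual implementation: the function name `extract_file_ids_sketch` and the regular expression are assumptions, and the pattern assumes IDs are `file_` followed by lowercase hex digits, as in the examples.

```python
import re

# Assumed pattern: "file_" followed by one or more lowercase hex digits.
# The real module may accept a different alphabet or length.
_FILE_ID_RE = re.compile(r"\bfile_[0-9a-f]+\b")


def extract_file_ids_sketch(text: str) -> list[str]:
    """Hypothetical stand-in for extract_file_ids."""
    # dict.fromkeys keeps insertion order, so duplicates are dropped
    # while the first occurrence of each ID keeps its position.
    return list(dict.fromkeys(_FILE_ID_RE.findall(text)))


ids = extract_file_ids_sketch(
    "see file_a1b2c3d4 and file_deadbeef12345678, then file_a1b2c3d4 again"
)
print(ids)
```

Using `dict.fromkeys` rather than `set` is what preserves first-occurrence order in this sketch.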

extract_file_ids_from_messages

extract_file_ids_from_messages(
    messages: list[MessageWithParts],
) -> list[str]

Extract all file IDs referenced across a list of messages.

Concatenates all text content from messages and deduplicates.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| messages | list[MessageWithParts] | Messages to scan for file ID references. | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | Ordered, deduplicated list of file ID strings. |
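A rough sketch of the concatenate-then-deduplicate approach follows. The fields of `MessageWithParts` are not documented here, so the example uses a hypothetical `FakeMessage` dataclass with a `text_parts` list. That class, the function name, and the ID pattern are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumed pattern: "file_" followed by lowercase hex digits.
_FILE_ID_RE = re.compile(r"\bfile_[0-9a-f]+\b")


@dataclass
class FakeMessage:
    """Hypothetical stand-in for MessageWithParts (real fields unknown)."""

    text_parts: list[str] = field(default_factory=list)


def extract_from_messages_sketch(messages: list[FakeMessage]) -> list[str]:
    """Hypothetical stand-in for extract_file_ids_from_messages."""
    # Join with newlines so an ID at the end of one part cannot fuse
    # with text at the start of the next part.
    combined = "\n".join(part for m in messages for part in m.text_parts)
    return list(dict.fromkeys(_FILE_ID_RE.findall(combined)))


messages = [
    FakeMessage(["uploaded file_a1b2c3d4"]),
    FakeMessage(["compare file_deadbeef with file_a1b2c3d4"]),
]
result = extract_from_messages_sketch(messages)
print(result)
```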

collect_file_ids_from_nodes

collect_file_ids_from_nodes(
    nodes: list[SummaryNode],
) -> list[str]

Aggregate all file IDs already embedded in a list of summary nodes.

Each node's content may contain a [LCM File IDs: ...] footer; this function extracts IDs from every node and returns a deduplicated union.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| nodes | list[SummaryNode] | Summary nodes whose content is scanned for file IDs. | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | Ordered, deduplicated list of file ID strings. |
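The union over nodes might look like the sketch below. `FakeSummaryNode` is a hypothetical stand-in for `SummaryNode` that assumes the node text lives in a `content` attribute. The function name, the ID pattern, and the comma-separated footer layout in the sample data are likewise assumptions.

```python
import re
from dataclasses import dataclass

# Assumed pattern: "file_" followed by lowercase hex digits.
_FILE_ID_RE = re.compile(r"\bfile_[0-9a-f]+\b")


@dataclass
class FakeSummaryNode:
    """Hypothetical stand-in for SummaryNode (only `content` is modelled)."""

    content: str


def collect_from_nodes_sketch(nodes: list[FakeSummaryNode]) -> list[str]:
    """Hypothetical stand-in for collect_file_ids_from_nodes."""
    seen: dict[str, None] = {}
    for node in nodes:
        # Scan the whole content, footer included, and record each ID once.
        for file_id in _FILE_ID_RE.findall(node.content):
            seen.setdefault(file_id, None)
    return list(seen)


nodes = [
    FakeSummaryNode("Summary A.\n\n[LCM File IDs: file_a1b2c3d4, file_0f0f]"),
    FakeSummaryNode("Summary B.\n\n[LCM File IDs: file_0f0f, file_deadbeef]"),
]
union = collect_from_nodes_sketch(nodes)
print(union)
```

Because IDs are collected in node order, the result can be passed to a footer helper so the condensed summary carries every pointer its inputs held.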

append_file_ids_footer

append_file_ids_footer(
    text: str, file_ids: list[str]
) -> str

Append a [LCM File IDs: ...] footer to text when file_ids is non-empty.

The function is idempotent: if text already contains the footer, it will not duplicate the block. The footer is always placed at the end.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | The summary text to annotate. | required |
| file_ids | list[str] | File IDs to include in the footer. | required |

Returns:

| Type | Description |
| --- | --- |
| str | Annotated text, or the original text unchanged if file_ids is empty. |
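One way to get the idempotent behaviour is to strip any trailing footer before writing a fresh one, as in the sketch below. The function name, the comma-separated layout inside the brackets, the blank line before the footer, and the choice to merge IDs from an existing footer with the new ones are all assumptions. The real function may format or merge differently.

```python
import re

# Assumed footer layout: "[LCM File IDs: id1, id2]" at the very end of the text.
_FOOTER_RE = re.compile(r"\n*\[LCM File IDs: ([^\]]*)\]\s*$")


def append_footer_sketch(text: str, file_ids: list[str]) -> str:
    """Hypothetical stand-in for append_file_ids_footer."""
    if not file_ids:
        # Nothing to record: return the input unchanged.
        return text

    existing: list[str] = []
    match = _FOOTER_RE.search(text)
    if match:
        # Keep IDs from the old footer (an assumption), then drop the old block.
        existing = [p.strip() for p in match.group(1).split(",") if p.strip()]
        text = text[: match.start()]

    merged = list(dict.fromkeys(existing + file_ids))
    return f"{text.rstrip()}\n\n[LCM File IDs: {', '.join(merged)}]"


once = append_footer_sketch("Summary.", ["file_a1b2c3d4"])
twice = append_footer_sketch(once, ["file_a1b2c3d4"])
print(twice)
```

Applying the sketch twice with the same IDs yields the same string, which is the idempotence property described above.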