contract.mux
MapFromPathsParams (dataclass)
MapFromPathsParams(
paths: List[PathLike],
include_glob_pattern: List[str],
inner_data_stream: Type[_TDataStream],
inner_param_factory: Callable[[str], TReaderParams],
as_collection: bool = True,
exclude_glob_pattern: List[str] = list(),
inner_descriptions: dict[str, Optional[str]] = dict(),
)
Bases: Generic[_TDataStream]
Parameters for creating multiple data streams from file paths.
Defines parameters for locating files and creating data streams for each one.
Attributes:
| Name | Type | Description |
|---|---|---|
| paths | List[PathLike] | List of directory paths to search for files. |
| include_glob_pattern | List[str] | List of glob patterns to match files to include. |
| inner_data_stream | Type[_TDataStream] | Type of DataStream to create for each matched file. |
| inner_param_factory | Callable[[str], TReaderParams] | Function that creates reader params from file paths. |
| as_collection | bool | Whether to return results as a collection. Defaults to True. |
| exclude_glob_pattern | List[str] | List of glob patterns for files to exclude. |
| inner_descriptions | dict[str, Optional[str]] | Dictionary mapping file stems to descriptions for streams. |
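The way these attributes could interact can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `find_matching_files` and `build_params` are hypothetical helpers, and keying the result by file stem is an assumption based on the `inner_descriptions` attribute.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List, TypeVar

TParams = TypeVar("TParams")


def find_matching_files(
    paths: Iterable[Path],
    include_glob_pattern: List[str],
    exclude_glob_pattern: List[str],
) -> List[Path]:
    """Hypothetical helper: files matching any include pattern and no exclude pattern."""
    included = set()
    excluded = set()
    for directory in paths:
        directory = Path(directory)
        for pattern in include_glob_pattern:
            included.update(p for p in directory.glob(pattern) if p.is_file())
        for pattern in exclude_glob_pattern:
            excluded.update(directory.glob(pattern))
    # Sort so the result is deterministic regardless of filesystem order.
    return sorted(included - excluded)


def build_params(
    paths: Iterable[Path],
    include_glob_pattern: List[str],
    exclude_glob_pattern: List[str],
    param_factory: Callable[[str], TParams],
) -> Dict[str, TParams]:
    """Hypothetical helper: call the factory once per matched file, keyed by file stem."""
    files = find_matching_files(paths, include_glob_pattern, exclude_glob_pattern)
    return {p.stem: param_factory(str(p)) for p in files}
```

In this sketch the factory receives the file path as a string, matching the `Callable[[str], TReaderParams]` type above.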
MapFromPaths
MapFromPaths(
name: str,
*,
description: Optional[str] = None,
reader_params: Optional[TReaderParams] = None,
**kwargs,
)
Bases: DataStreamCollectionBase[_TDataStream, MapFromPathsParams]
File path mapper data stream provider.
A data stream implementation for creating multiple child data streams by searching for files matching glob patterns and creating a stream for each.
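Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier of the data stream. | required |
| description | Optional[str] | Optional description of the data stream. | None |
| reader_params | Optional[TReaderParams] | Parameters for the data reader. | None |
| **kwargs | | Additional keyword arguments. | {} |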
Examples:
from contraqctor.contract import mux, text

# Define a factory function for TextParams
def create_text_params(file_path):
    return text.TextParams(path=file_path)

# Create and load a text file collection
params = mux.MapFromPathsParams(
    paths=["documents/"],
    include_glob_pattern=["*.txt"],
    inner_data_stream=text.Text,
    inner_param_factory=create_text_params,
)
docs = mux.MapFromPaths("documents", reader_params=params).load()
readme = docs["readme"].data
Source code in src/contraqctor/contract/base.py
name (property)
name: str
Get the name of the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Name identifier of the data stream. |
resolved_name (property)
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The fully resolved name including all parent names. |
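The naming scheme can be illustrated with a small stand-in class. This is a sketch under assumptions, not the library's code: `Node` is hypothetical, and the root-to-leaf ordering is inferred from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a data stream with a name and an optional parent."""

    name: str
    parent: Optional["Node"] = None

    @property
    def resolved_name(self) -> str:
        # Walk up to the root, then join the names from root to leaf with '::'.
        names: List[str] = []
        node: Optional["Node"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "::".join(reversed(names))


session = Node("session")
documents = Node("documents", parent=session)
readme = Node("readme", parent=documents)
print(readme.resolved_name)  # session::documents::readme
```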
description (property)
parent (property)
parent: Optional[DataStream]
Get the parent data stream.
Returns:
| Type | Description |
|---|---|
| Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection (property)
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if this is a collection stream, False otherwise. |
reader_params (property)
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
| Name | Type | Description |
|---|---|---|
| TReaderParams | TReaderParams | Parameters for the data reader. |
at (property)
at: _At[TDataStream]
Get the accessor for child data streams.
Returns:
| Name | Type | Description |
|---|---|---|
| _At | _At[TDataStream] | Accessor object for retrieving child streams by name. |
has_data (property)
has_data: bool
Check if the data stream has loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data has been loaded, False otherwise. |
has_error (property)
has_error: bool
Check if the data stream encountered an error during loading.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if an error occurred, False otherwise. |
data (property)
data: TData
Get the loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | The loaded data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data has not been loaded yet. |
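One common way to get this behaviour is a sentinel that marks the unset state, so that a legitimately loaded `None` still counts as data. The sketch below is an assumption about how such a pattern can look; `LazyValue`, `_Unset`, and the `set` method are hypothetical and not part of the library.

```python
from typing import Any


class _Unset:
    """Sentinel type marking data that has not been loaded."""


_UNSET = _Unset()


class LazyValue:
    """Hypothetical holder that mimics has_data / data / clear semantics."""

    def __init__(self) -> None:
        self._data: Any = _UNSET

    @property
    def has_data(self) -> bool:
        return not isinstance(self._data, _Unset)

    @property
    def data(self) -> Any:
        if not self.has_data:
            raise ValueError("Data has not been loaded yet.")
        return self._data

    def set(self, value: Any) -> "LazyValue":
        # Store a value; returning self allows method chaining.
        self._data = value
        return self

    def clear(self) -> "LazyValue":
        # Reset to the unset state so the value can be loaded again.
        self._data = _UNSET
        return self
```

Checking `has_data` before reading `data` avoids the `ValueError` in code paths where loading may not have happened yet.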
set_parent
set_parent(parent: DataStream) -> None
Set the parent data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | DataStream | The parent data stream to set. | required |
read
read(
reader_params: Optional[TReaderParams] = None,
) -> TData
Read data using the configured reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reader_params | Optional[TReaderParams] | Optional parameters to override the default reader parameters. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | Data read from the source. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters are not set. |
bind_reader_params
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | TReaderParams | Parameters to bind to the data stream's reader. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters have already been set. |
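The contract described for `read` and `bind_reader_params` (bind once, optionally override per call, fail when nothing is set) can be sketched as follows. `BindOnceStream` and its `reader` argument are hypothetical; the real classes may track the unset state differently.

```python
from typing import Callable, Generic, Optional, TypeVar

P = TypeVar("P")
T = TypeVar("T")


class BindOnceStream(Generic[P, T]):
    """Hypothetical stream whose reader parameters can be bound only once."""

    def __init__(self, reader: Callable[[P], T], reader_params: Optional[P] = None) -> None:
        self._reader = reader
        self._reader_params = reader_params

    def bind_reader_params(self, params: P) -> "BindOnceStream[P, T]":
        if self._reader_params is not None:
            raise ValueError("Reader parameters have already been set.")
        self._reader_params = params
        return self  # returning self enables method chaining

    def read(self, reader_params: Optional[P] = None) -> T:
        # An explicit argument overrides the bound parameters for this call only.
        params = reader_params if reader_params is not None else self._reader_params
        if params is None:
            raise ValueError("Reader parameters are not set.")
        return self._reader(params)


stream = BindOnceStream(reader=str.upper)
print(stream.bind_reader_params("abc").read())  # ABC
```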
clear
clear() -> Self
Clear the loaded data from the data stream.
Resets the data to an unset state, allowing for reloading.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
load
load() -> Self
Load data for this collection.
Overrides the base method to add validation that loaded data is a list of DataStreams.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If loaded data is not a list of DataStreams. |
collect_errors
collect_errors() -> List[ErrorOnLoad]
Collect all errors from this stream and its children.
Performs a depth-first traversal to gather all ErrorOnLoad instances.
Returns:
| Type | Description |
|---|---|
| List[ErrorOnLoad] | List of all errors raised on load encountered in the hierarchy. |
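A depth-first error collection of this kind can be sketched on a toy tree. `ToyStream` is a hypothetical stand-in, the pre-order visiting order (a stream before its children) is an assumption, and plain exceptions are used in place of `ErrorOnLoad`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyStream:
    """Hypothetical stream that may hold a load error and child streams."""

    name: str
    error: Optional[Exception] = None
    children: List["ToyStream"] = field(default_factory=list)

    def collect_errors(self) -> List[Tuple[str, Exception]]:
        errors: List[Tuple[str, Exception]] = []
        # Record this stream's own error first (pre-order).
        if self.error is not None:
            errors.append((self.name, self.error))
        # Then descend fully into each child before moving to the next one.
        for child in self.children:
            errors.extend(child.collect_errors())
        return errors


tree = ToyStream(
    "root",
    children=[
        ToyStream("a", children=[ToyStream("a1", error=FileNotFoundError("a1"))]),
        ToyStream("b", error=ValueError("b")),
    ],
)
for name, error in tree.collect_errors():
    print(f"{name}: {error!r}")
```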
load_all
Recursively load this data stream and all child streams.
Performs depth-first traversal to load all streams in the hierarchy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Self | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
| Type | Description |
|---|---|
| Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
iter_all
iter_all() -> Generator[DataStream, None, None]
Iterator for all child data streams, including nested collections.
Implements a depth-first traversal of the stream hierarchy.
Yields:
| Name | Type | Description |
|---|---|---|
| DataStream | DataStream | All recursively yielded child data streams. |
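The traversal order can be made concrete with a recursive generator. This is a sketch on a hypothetical `TreeNode` class; it assumes that the stream itself is not yielded, only its descendants, as the description above suggests.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class TreeNode:
    """Hypothetical stand-in for a stream that may contain child streams."""

    name: str
    children: List["TreeNode"] = field(default_factory=list)

    def iter_all(self) -> Iterator["TreeNode"]:
        for child in self.children:
            # Yield the child, then everything beneath it, before the next sibling.
            yield child
            yield from child.iter_all()


hierarchy = TreeNode(
    "root",
    children=[
        TreeNode("a", children=[TreeNode("a1"), TreeNode("a2")]),
        TreeNode("b"),
    ],
)
print([node.name for node in hierarchy.iter_all()])  # ['a', 'a1', 'a2', 'b']
```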