contract.mux
MapFromPathsParams (dataclass)
MapFromPathsParams(
paths: List[PathLike],
include_glob_pattern: List[str],
inner_data_stream: Type[_TDataStream],
inner_param_factory: Callable[[str], TReaderParams],
as_collection: bool = True,
exclude_glob_pattern: List[str] = list(),
inner_descriptions: dict[str, Optional[str]] = dict(),
)
Bases: Generic[_TDataStream]
Parameters for creating multiple data streams from file paths.
Defines parameters for locating files and creating data streams for each one.
Attributes:
Name | Type | Description |
---|---|---|
paths | List[PathLike] | List of directory paths to search for files. |
include_glob_pattern | List[str] | List of glob patterns to match files to include. |
inner_data_stream | Type[_TDataStream] | Type of DataStream to create for each matched file. |
inner_param_factory | Callable[[str], TReaderParams] | Function that creates reader params from file paths. |
as_collection | bool | Whether to return results as a collection. Defaults to True. |
exclude_glob_pattern | List[str] | List of glob patterns for files to exclude. |
inner_descriptions | dict[str, Optional[str]] | Dictionary mapping file stems to descriptions for streams. |
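The include/exclude attributes above can be pictured with a small standard-library sketch. This is not contraqctor's implementation: `find_files` is a hypothetical helper, and keying the result by file stem is an assumption drawn from the `inner_descriptions` attribute.

```python
from pathlib import Path
from typing import Dict, List


def find_files(
    paths: List[str],
    include_glob_pattern: List[str],
    exclude_glob_pattern: List[str],
) -> Dict[str, Path]:
    """Hypothetical helper: map file stems to the files that matched."""
    matched: Dict[str, Path] = {}
    for directory in paths:
        root = Path(directory)
        # Union of everything matched by any include pattern (files only).
        included = {
            f for pattern in include_glob_pattern for f in root.glob(pattern) if f.is_file()
        }
        # Union of everything matched by any exclude pattern.
        excluded = {f for pattern in exclude_glob_pattern for f in root.glob(pattern)}
        # Keep included files that were not excluded; key them by stem (assumption).
        for file in sorted(included - excluded):
            matched[file.stem] = file
    return matched
```

In this sketch a later directory overwrites an earlier one when two files share a stem; how the library resolves such collisions is not stated here.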
MapFromPaths
MapFromPaths(
name: str,
*,
description: Optional[str] = None,
reader_params: Optional[TReaderParams] = None,
**kwargs,
)
Bases: DataStreamCollectionBase[_TDataStream, MapFromPathsParams]
File path mapper data stream provider.
A data stream implementation for creating multiple child data streams by searching for files matching glob patterns and creating a stream for each.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
DataStreamCollectionBase | | Base class for data stream collection providers. | required |
Examples:
from contraqctor.contract import mux, text

# Define a factory function for TextParams
def create_text_params(file_path):
    return text.TextParams(path=file_path)

# Create and load a text file collection
params = mux.MapFromPathsParams(
    paths=["documents/"],
    include_glob_pattern=["*.txt"],
    inner_data_stream=text.Text,
    inner_param_factory=create_text_params,
)
docs = mux.MapFromPaths("documents", reader_params=params).load()
readme = docs["readme"].data
Source code in src/contraqctor/contract/base.py
name (property)
name: str
Get the name of the data stream.
Returns:
Name | Type | Description |
---|---|---|
str | str | Name identifier of the data stream. |
resolved_name (property)
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
Name | Type | Description |
---|---|---|
str | str | The fully resolved name including all parent names. |
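A minimal sketch of how a '::'-separated name could be built by walking up the parent chain. `NamedNode` is a hypothetical stand-in for a data stream, not a contraqctor class.

```python
from typing import Optional


class NamedNode:
    """Hypothetical stand-in for a data stream with a name and a parent."""

    def __init__(self, name: str, parent: Optional["NamedNode"] = None) -> None:
        self.name = name
        self.parent = parent

    @property
    def resolved_name(self) -> str:
        # A root stream resolves to its own name.
        if self.parent is None:
            return self.name
        # A child prepends its parent's resolved name, joined by "::".
        return f"{self.parent.resolved_name}::{self.name}"


root = NamedNode("session")
docs = NamedNode("documents", parent=root)
readme = NamedNode("readme", parent=docs)
print(readme.resolved_name)  # session::documents::readme
```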
description (property)
parent (property)
parent: Optional[DataStream]
Get the parent data stream.
Returns:
Type | Description |
---|---|
Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection (property)
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if this is a collection stream, False otherwise. |
reader_params (property)
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
Name | Type | Description |
---|---|---|
TReaderParams | TReaderParams | Parameters for the data reader. |
at (property)
at: _At[TDataStream]
Get the accessor for child data streams.
Returns:
Name | Type | Description |
---|---|---|
_At | _At[TDataStream] | Accessor object for retrieving child streams by name. |
has_data (property)
has_data: bool
Check if the data stream has loaded data.
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if data has been loaded, False otherwise. |
data (property)
data: TData
Get the loaded data.
Returns:
Name | Type | Description |
---|---|---|
TData | TData | The loaded data. |
Raises:
Type | Description |
---|---|
ValueError | If data has not been loaded yet. |
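The relationship between `has_data` and `data` can be sketched as a guarded property. `LazyValue` and the `_UNSET` sentinel are hypothetical; how the library tracks the unloaded state internally is an assumption.

```python
from typing import Any

# Sentinel so that None can still be a legitimate loaded value (assumption).
_UNSET = object()


class LazyValue:
    """Hypothetical stand-in for a stream whose data is loaded on demand."""

    def __init__(self) -> None:
        self._data: Any = _UNSET

    @property
    def has_data(self) -> bool:
        return self._data is not _UNSET

    @property
    def data(self) -> Any:
        # Accessing data before loading is an error rather than a silent None.
        if not self.has_data:
            raise ValueError("Data has not been loaded yet.")
        return self._data

    def load(self, value: Any) -> "LazyValue":
        # Store the value and return self so calls can be chained.
        self._data = value
        return self
```

Checking `has_data` first avoids wrapping every `data` access in a `try`/`except ValueError`.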
read
read(
reader_params: Optional[TReaderParams] = None,
) -> TData
Read data using the configured reader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
reader_params | Optional[TReaderParams] | Optional parameters to override the default reader parameters. | None |
Returns:
Name | Type | Description |
---|---|---|
TData | TData | Data read from the source. |
Raises:
Type | Description |
---|---|
ValueError | If reader parameters are not set. |
Source code in src/contraqctor/contract/base.py
bind_reader_params
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
params
|
TReaderParams
|
Parameters to bind to the data stream's reader. |
required |
Returns:
Name | Type | Description |
---|---|---|
Self |
Self
|
The data stream instance for method chaining. |
Raises:
Type | Description |
---|---|
ValueError
|
If reader parameters have already been set. |
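The documented contracts of `read` and `bind_reader_params` (bind once, chainable, per-call override, `ValueError` when nothing is set) can be sketched as follows. `SketchStream` is hypothetical and uses plain strings in place of real reader parameters.

```python
from typing import Optional


class SketchStream:
    """Hypothetical stand-in; reader params are plain strings here."""

    def __init__(self, reader_params: Optional[str] = None) -> None:
        self._reader_params = reader_params

    def bind_reader_params(self, params: str) -> "SketchStream":
        # Binding is allowed only once.
        if self._reader_params is not None:
            raise ValueError("Reader parameters have already been set.")
        self._reader_params = params
        return self  # returning self enables method chaining

    def read(self, reader_params: Optional[str] = None) -> str:
        # An explicit argument overrides the bound parameters for this call only.
        params = reader_params if reader_params is not None else self._reader_params
        if params is None:
            raise ValueError("Reader parameters are not set.")
        return f"read with {params}"


print(SketchStream().bind_reader_params("a.txt").read())  # read with a.txt
```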
Source code in src/contraqctor/contract/base.py
load
load()
Load data for this collection.
Overrides the base method to add validation that loaded data is a list of DataStreams.
Returns:
Name | Type | Description |
---|---|---|
Self | | The collection instance for method chaining. |
Raises:
Type | Description |
---|---|
ValueError | If loaded data is not a list of DataStreams. |
Source code in src/contraqctor/contract/base.py
load_all
load_all(
strict: bool = False,
) -> list[tuple[DataStream, Exception], None, None]
Recursively load this data stream and all child streams.
Performs depth-first traversal to load all streams in the hierarchy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
Name | Type | Description |
---|---|---|
list | list[tuple[DataStream, Exception], None, None] | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
Type | Description |
---|---|
Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
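The strict versus non-strict behaviour described for `load_all` can be sketched on a toy tree. `LoadableNode` and its `fail` flag are hypothetical, and unlike a real collection the children here exist before loading.

```python
from typing import List, Optional, Tuple


class LoadableNode:
    """Hypothetical tree node; `fail` simulates a stream whose load raises."""

    def __init__(
        self,
        name: str,
        children: Optional[List["LoadableNode"]] = None,
        fail: bool = False,
    ) -> None:
        self.name = name
        self.children = children or []
        self.fail = fail
        self.loaded = False

    def load(self) -> "LoadableNode":
        if self.fail:
            raise OSError(f"cannot read {self.name}")
        self.loaded = True
        return self

    def load_all(self, strict: bool = False) -> List[Tuple["LoadableNode", Exception]]:
        errors: List[Tuple["LoadableNode", Exception]] = []
        try:
            self.load()
        except Exception as exc:
            if strict:
                raise  # strict mode: propagate the first failure immediately
            errors.append((self, exc))  # lenient mode: record it and continue
        # Depth-first: fully process each child's subtree before the next child.
        for child in self.children:
            errors.extend(child.load_all(strict=strict))
        return errors
```

With `strict=False`, one failing stream does not prevent its siblings from loading, which suits exploratory work; `strict=True` suits pipelines that should stop at the first problem.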
Source code in src/contraqctor/contract/base.py
iter_all
iter_all() -> Generator[DataStream, None, None]
Iterator for all child data streams, including nested collections.
Implements a depth-first traversal of the stream hierarchy.
Yields:
Name | Type | Description |
---|---|---|
DataStream | DataStream | All recursively yielded child data streams. |
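A depth-first traversal of this kind is commonly written as a recursive generator. The sketch below uses a hypothetical `TreeNode` class and assumes pre-order (each child is yielded before its own descendants), which the description above does not spell out.

```python
from typing import Generator, List, Optional


class TreeNode:
    """Hypothetical stand-in for a stream that may contain child streams."""

    def __init__(self, name: str, children: Optional[List["TreeNode"]] = None) -> None:
        self.name = name
        self.children = children or []

    def iter_all(self) -> Generator["TreeNode", None, None]:
        for child in self.children:
            yield child  # the child itself first (pre-order, an assumption)
            yield from child.iter_all()  # then everything nested under it


tree = TreeNode("root", [TreeNode("a", [TreeNode("a1"), TreeNode("a2")]), TreeNode("b")])
print([node.name for node in tree.iter_all()])  # ['a', 'a1', 'a2', 'b']
```

The root itself is not yielded, matching the description of iterating over child streams only.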
Source code in src/contraqctor/contract/base.py