contract

Dataset

Dataset(
    name: str,
    data_streams: List[DataStream],
    *,
    version: str | Version = "0.0.0",
    description: Optional[str] = None,
)

Bases: DataStreamCollection

A version-tracked collection of data streams.

Extends DataStreamCollection by adding semantic versioning support.

Parameters:

Name Type Description Default
name str

Name identifier for the dataset.

required
data_streams List[DataStream]

List of data streams to include in the dataset.

required
version str | Version

Semantic version string or Version object. Defaults to "0.0.0".

'0.0.0'
description Optional[str]

Optional description of the dataset.

None

Examples:

from contraqctor.contract import text, csv, Dataset

# Create streams
text_stream = text.Text("notes", reader_params=text.TextParams(path="notes.txt"))
csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))

# Create a versioned dataset
dataset = Dataset(
    "experiment_results",
    [text_stream, csv_stream],
    version="1.2.3"
)

# Load the dataset
dataset.load_all(strict=True)

# Access streams
txt = dataset["notes"].data
csv_data = dataset["data"].data

print(f"Dataset version: {dataset.version}")

Initializes a Dataset with a version and a list of data streams.

Source code in src/contraqctor/contract/base.py (lines 829-844)
@override
def __init__(
    self,
    name: str,
    data_streams: List[DataStream],
    *,
    version: str | Version = "0.0.0",
    description: Optional[str] = None,
) -> None:
    """Initializes a Dataset with a version and a list of data streams."""
    super().__init__(
        name=name,
        data_streams=data_streams,
        description=description,
    )
    self._version = self._parse_semver(version)

name property

name: str

Get the name of the data stream.

Returns:

Name Type Description
str str

Name identifier of the data stream.

resolved_name property

resolved_name: str

Get the full hierarchical name of the data stream.

Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.

Returns:

Name Type Description
str str

The fully resolved name including all parent names.
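As an illustrative sketch (not the library's actual implementation), the resolution rule amounts to joining the chain of names from root to leaf with the reserved `'::'` separator:

```python
def resolve_name(chain):
    # Join parent-to-child names with the '::' separator, mirroring how
    # resolved_name walks up the parent chain. '::' is reserved for this
    # purpose, which is why it may not appear inside an individual name.
    return "::".join(chain)

print(resolve_name(["experiment_results", "data"]))  # experiment_results::data
```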

description property

description: Optional[str]

Get the description of the data stream.

Returns:

Type Description
Optional[str]

Optional[str]: Description of the data stream, or None if not provided.

parent property

parent: Optional[DataStream]

Get the parent data stream.

Returns:

Type Description
Optional[DataStream]

Optional[DataStream]: Parent data stream, or None if this is a root stream.

is_collection property

is_collection: bool

Check if this data stream is a collection of other streams.

Returns:

Name Type Description
bool bool

True if this is a collection stream, False otherwise.

reader_params property

reader_params: TReaderParams

Get the parameters for the data reader.

Returns:

Name Type Description
TReaderParams TReaderParams

Parameters for the data reader.

at property

at: _At[TDataStream]

Get the accessor for child data streams.

Returns:

Name Type Description
_At _At[TDataStream]

Accessor object for retrieving child streams by name.

has_data property

has_data: bool

Check if the data stream has loaded data.

Returns:

Name Type Description
bool bool

True if data has been loaded, False otherwise.

has_error property

has_error: bool

Check if the data stream encountered an error during loading.

Returns:

Name Type Description
bool bool

True if an error occurred, False otherwise.

data property

data: TData

Get the loaded data.

Returns:

Name Type Description
TData TData

The loaded data.

Raises:

Type Description
ValueError

If data has not been loaded yet.

version property

version: Version

Get the semantic version of the dataset.

Returns:

Name Type Description
Version Version

Semantic version object.
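The precedence rule behind semantic versions can be sketched as follows; this is a minimal stand-in for the parsing Dataset performs (the real class accepts either a string or a Version object), not the library's actual implementation:

```python
def parse_semver(version):
    # Split "MAJOR.MINOR.PATCH" into integers so that versions compare
    # numerically rather than lexicographically.
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

assert parse_semver("1.10.0") > parse_semver("1.9.9")  # numeric, not string, order
```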

set_parent

set_parent(parent: DataStream) -> None

Set the parent data stream.

Parameters:

Name Type Description Default
parent DataStream

The parent data stream to set.

required
Source code in src/contraqctor/contract/base.py (lines 164-170)
def set_parent(self, parent: "DataStream") -> None:
    """Set the parent data stream.

    Args:
        parent: The parent data stream to set.
    """
    self._parent = parent

read

read(*args, **kwargs) -> List[DataStream]

Read data from the collection.

Returns:

Type Description
List[DataStream]

List[DataStream]: The pre-set data streams.

Raises:

Type Description
ValueError

If data streams have not been set yet.

Source code in src/contraqctor/contract/base.py (lines 677-689)
@override
def read(self, *args, **kwargs) -> List[DataStream]:
    """Read data from the collection.

    Returns:
        List[DataStream]: The pre-set data streams.

    Raises:
        ValueError: If data streams have not been set yet.
    """
    if not self.has_data:
        raise ValueError("Data streams have not been read yet.")
    return self._data

bind_reader_params

bind_reader_params(params: TReaderParams) -> Self

Bind reader parameters to the data stream.

Parameters:

Name Type Description Default
params TReaderParams

Parameters to bind to the data stream's reader.

required

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Raises:

Type Description
ValueError

If reader parameters have already been set.
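The bind-once guard can be sketched with a sentinel object, in the spirit of the UnsetParams value used in the displayed source; this is an illustrative stand-in, not the library class:

```python
_UNSET = object()  # stand-in for the library's UnsetParams sentinel

class Stream:
    def __init__(self):
        self._reader_params = _UNSET

    def bind_reader_params(self, params):
        # Refuse to rebind: reader parameters may be set exactly once.
        if self._reader_params is not _UNSET:
            raise ValueError("Reader parameters are already set. Cannot bind again.")
        self._reader_params = params
        return self  # return self to enable method chaining

s = Stream().bind_reader_params({"path": "data.csv"})
```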

Source code in src/contraqctor/contract/base.py
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
def bind_reader_params(self, params: _typing.TReaderParams) -> Self:
    """Bind reader parameters to the data stream.

    Args:
        params: Parameters to bind to the data stream's reader.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        ValueError: If reader parameters have already been set.
    """
    if not _typing.is_unset(self._reader_params):
        raise ValueError("Reader parameters are already set. Cannot bind again.")
    self._reader_params = params
    return self

clear

clear() -> Self

Clear the loaded data from the data stream.

Resets the data to an unset state, allowing for reloading.

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Source code in src/contraqctor/contract/base.py (lines 313-322)
def clear(self) -> Self:
    """Clear the loaded data from the data stream.

    Resets the data to an unset state, allowing for reloading.

    Returns:
        Self: The data stream instance for method chaining.
    """
    self._data = _typing.UnsetData
    return self

load

load() -> Self

Load data for this collection.

Overrides the base method to add validation that loaded data is a list of DataStreams.

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
ValueError

If loaded data is not a list of DataStreams.

Source code in src/contraqctor/contract/base.py (lines 536-553)
@override
def load(self) -> Self:
    """Load data for this collection.

    Overrides the base method to add validation that loaded data is a list of DataStreams.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        ValueError: If loaded data is not a list of DataStreams.
    """
    super().load()
    if not isinstance(self._data, list):
        self._data = _typing.UnsetData
        raise ValueError("Data must be a list of DataStreams.")
    self._update_data_stream_mapping()
    return self

collect_errors

collect_errors() -> List[ErrorOnLoad]

Collect all errors from this stream and its children.

Performs a depth-first traversal to gather all ErrorOnLoad instances.

Returns:

Type Description
List[ErrorOnLoad]

List[ErrorOnLoad]: List of all errors raised on load encountered in the hierarchy.
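The depth-first gathering can be sketched on a toy tree; Node here is a hypothetical stand-in for a data stream, not a contraqctor type:

```python
class Node:
    # Toy stand-in for a data stream: a node may carry an error and children.
    def __init__(self, name, error=None, children=()):
        self.name = name
        self.error = error
        self.children = list(children)

def collect_errors(node):
    # Depth-first traversal: report this node's error first, then recurse
    # into each child, mirroring DataStream.collect_errors.
    errors = [(node.name, node.error)] if node.error is not None else []
    for child in node.children:
        errors.extend(collect_errors(child))
    return errors

tree = Node("root", children=[Node("a", error="boom"), Node("b")])
assert collect_errors(tree) == [("a", "boom")]
```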

Source code in src/contraqctor/contract/base.py (lines 379-394)
def collect_errors(self) -> List[_typing.ErrorOnLoad]:
    """Collect all errors from this stream and its children.

    Performs a depth-first traversal to gather all ErrorOnLoad instances.

    Returns:
        List[ErrorOnLoad]: List of all errors raised on load encountered in the hierarchy.
    """
    errors = []
    if self.has_error:
        errors.append(cast(_typing.ErrorOnLoad, self._data))
    for stream in self:
        if stream is None:
            continue
        errors.extend(stream.collect_errors())
    return errors

load_all

load_all(strict: bool = False) -> Self

Recursively load this data stream and all child streams.

Performs depth-first traversal to load all streams in the hierarchy.

Parameters:

Name Type Description Default
strict bool

If True, raises exceptions immediately; otherwise collects and returns them.

False

Returns:

Name Type Description
Self Self

The collection instance for method chaining. Load errors are stored on the failing streams and can be gathered with collect_errors().

Raises:

Type Description
Exception

If strict is True and an exception occurs during loading.

Examples:

# Load all streams, then inspect any errors
collection.load_all(strict=False)

for error in collection.collect_errors():
    print(f"Error loading: {error}")
Source code in src/contraqctor/contract/base.py (lines 396-427)
def load_all(self, strict: bool = False) -> Self:
    """Recursively load this data stream and all child streams.

    Performs depth-first traversal to load all streams in the hierarchy.

    Args:
        strict: If True, raises exceptions immediately; otherwise collects and returns them.

    Returns:
        Self: The collection instance for method chaining. Load errors are
            stored on the failing streams and can be gathered with
            collect_errors().

    Raises:
        Exception: If strict is True and an exception occurs during loading.

    Examples:
        ```python
        # Load all streams, then inspect any errors
        collection.load_all(strict=False)

        for error in collection.collect_errors():
            print(f"Error loading: {error}")
        ```
    """
    self.load()
    for stream in self:
        if stream is None:
            continue
        stream.load_all(strict=strict)
        if stream.has_error and strict:
            cast(_typing.ErrorOnLoad, stream.data).raise_from_error()
    return self

iter_all

iter_all() -> Generator[DataStream, None, None]

Iterator for all child data streams, including nested collections.

Implements a depth-first traversal of the stream hierarchy.

Yields:

Name Type Description
DataStream DataStream

Each child data stream, including streams nested in sub-collections.
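The traversal order can be sketched with nested lists standing in for nested collections; this is an illustrative analogue, not the library's code:

```python
def iter_all(items):
    # Depth-first walk: yield each child itself, then recurse into nested
    # lists, which here stand in for nested DataStreamCollection instances
    # (the real method also yields the collection nodes, as shown below).
    for value in items:
        yield value
        if isinstance(value, list):
            yield from iter_all(value)

assert list(iter_all(["a", ["b"], "c"])) == ["a", ["b"], "b", "c"]
```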

Source code in src/contraqctor/contract/base.py (lines 601-613)
def iter_all(self) -> Generator[DataStream, None, None]:
    """Iterator for all child data streams, including nested collections.

    Implements a depth-first traversal of the stream hierarchy.

    Yields:
        DataStream: Each child data stream, including streams nested in sub-collections.
    """
    for value in self:
        if isinstance(value, DataStream):
            yield value
        if isinstance(value, DataStreamCollectionBase):
            yield from value.iter_all()

parameters staticmethod

parameters(*args, **kwargs) -> UnsetParamsType

Parameters function to return UnsetParams.

Returns:

Name Type Description
UnsetParamsType UnsetParamsType

Special unset parameters value.

Source code in src/contraqctor/contract/base.py (lines 660-667)
@staticmethod
def parameters(*args, **kwargs) -> _typing.UnsetParamsType:
    """Parameters function to return UnsetParams.

    Returns:
        UnsetParamsType: Special unset parameters value.
    """
    return _typing.UnsetParams

bind_data_streams

bind_data_streams(data_streams: List[DataStream]) -> Self

Bind a list of data streams to the collection.

Parameters:

Name Type Description Default
data_streams List[DataStream]

List of data streams to include in the collection.

required

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
ValueError

If data streams have already been set.

Source code in src/contraqctor/contract/base.py (lines 691-707)
def bind_data_streams(self, data_streams: List[DataStream]) -> Self:
    """Bind a list of data streams to the collection.

    Args:
        data_streams: List of data streams to include in the collection.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        ValueError: If data streams have already been set.
    """
    if self.has_data:
        raise ValueError("Data streams are already set. Cannot bind again.")
    self._data = data_streams
    self._update_data_stream_mapping()
    return self

add_stream

add_stream(stream: DataStream) -> Self

Add a new data stream to the collection.

Parameters:

Name Type Description Default
stream DataStream

Data stream to add to the collection.

required

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
KeyError

If a stream with the same name already exists.

Examples:

from contraqctor.contract import json, DataStreamCollection

# Create an empty collection
collection = DataStreamCollection("api_data", [])

# Add streams
collection.add_stream(
    json.Json("config", reader_params=json.JsonParams(path="config.json"))
)

# Load the data
collection.load_all()
Source code in src/contraqctor/contract/base.py (lines 709-747)
def add_stream(self, stream: DataStream) -> Self:
    """Add a new data stream to the collection.

    Args:
        stream: Data stream to add to the collection.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        KeyError: If a stream with the same name already exists.

    Examples:
        ```python
        from contraqctor.contract import json, DataStreamCollection

        # Create an empty collection
        collection = DataStreamCollection("api_data", [])

        # Add streams
        collection.add_stream(
            json.Json("config", reader_params=json.JsonParams(path="config.json"))
        )

        # Load the data
        collection.load_all()
        ```
    """
    if not self.has_data:
        self._data = [stream]
        self._update_data_stream_mapping()
        return self

    if stream.name in self._data_stream_mapping:
        raise KeyError(f"Stream with name: '{stream.name}' already exists in data streams.")

    self._data.append(stream)
    self._update_data_stream_mapping()
    return self

remove_stream

remove_stream(name: str) -> None

Remove a data stream from the collection.

Parameters:

Name Type Description Default
name str

Name of the data stream to remove.

required

Raises:

Type Description
ValueError

If data streams have not been set yet.

KeyError

If no stream with the given name exists.
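The lookup-then-remove flow can be sketched with a plain list and a name-to-stream dict; the names and helper below are illustrative, not the library's internals:

```python
def remove_stream(streams, mapping, name):
    # Mirror the method's checks: require loaded data, then require the
    # name to exist in the name-to-stream mapping before removing.
    if streams is None:
        raise ValueError("Data streams have not been read yet.")
    if name not in mapping:
        raise KeyError(f"Data stream with name '{name}' not found.")
    streams.remove(mapping.pop(name))

streams = ["s1", "s2"]
mapping = {"one": "s1", "two": "s2"}
remove_stream(streams, mapping, "one")
assert streams == ["s2"]
```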

Source code in src/contraqctor/contract/base.py (lines 749-766)
def remove_stream(self, name: str) -> None:
    """Remove a data stream from the collection.

    Args:
        name: Name of the data stream to remove.

    Raises:
        ValueError: If data streams have not been set yet.
        KeyError: If no stream with the given name exists.
    """
    if not self.has_data:
        raise ValueError("Data streams have not been read yet. Cannot access data streams.")

    if name not in self._data_stream_mapping:
        raise KeyError(f"Data stream with name '{name}' not found in data streams.")
    self._data.remove(self._data_stream_mapping[name])
    self._update_data_stream_mapping()
    return

from_data_stream classmethod

from_data_stream(data_stream: DataStream) -> Self

Create a DataStreamCollection from a DataStream object.

Factory method to convert a single data stream or collection into a DataStreamCollection.

Parameters:

Name Type Description Default
data_stream DataStream

Source data stream to convert.

required

Returns:

Name Type Description
DataStreamCollection Self

New collection containing the source stream's data.

Raises:

Type Description
TypeError

If the source is not a DataStream.

ValueError

If the source has not been loaded yet.
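The wrapping rule applied by the factory can be sketched as follows; the function name is illustrative, mirroring the branch shown in the displayed source:

```python
def as_stream_list(data, is_collection):
    # A collection already holds a list of child streams and contributes
    # them directly; a single stream's data is wrapped in a one-element
    # list, mirroring the branch in from_data_stream.
    return data if is_collection else [data]

assert as_stream_list(["child_a", "child_b"], True) == ["child_a", "child_b"]
assert as_stream_list("payload", False) == ["payload"]
```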

Source code in src/contraqctor/contract/base.py (lines 768-789)
@classmethod
def from_data_stream(cls, data_stream: DataStream) -> Self:
    """Create a DataStreamCollection from a DataStream object.

    Factory method to convert a single data stream or collection into a DataStreamCollection.

    Args:
        data_stream: Source data stream to convert.

    Returns:
        DataStreamCollection: New collection containing the source stream's data.

    Raises:
        TypeError: If the source is not a DataStream.
        ValueError: If the source has not been loaded yet.
    """
    if not isinstance(data_stream, DataStream):
        raise TypeError("data_stream must be an instance of DataStream.")
    if not data_stream.has_data:
        raise ValueError("DataStream has not been loaded yet. Cannot create DataStreamCollection.")
    data = data_stream._data if data_stream.is_collection else [data_stream._data]
    return cls(name=data_stream.name, data_streams=data, description=data_stream.description)

DataStream

DataStream(
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: TReaderParams = UnsetParams,
    **kwargs,
)

Bases: ABC, Generic[TData, TReaderParams]

Abstract base class for all data streams.

Provides a generic interface for data reading operations with configurable parameters and hierarchical organization.

Parameters:

Name Type Description Default
name str

Name identifier for the data stream.

required
description Optional[str]

Optional description of the data stream.

None
reader_params TReaderParams

Optional parameters for the data reader.

UnsetParams
**kwargs

Additional keyword arguments.

{}

Attributes:

Name Type Description
_is_collection bool

Class variable indicating if this is a collection of data streams.

Raises:

Type Description
ValueError

If name contains '::' characters which are reserved for path resolution.

Source code in src/contraqctor/contract/base.py (lines 103-118)
def __init__(
    self: Self,
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: _typing.TReaderParams = _typing.UnsetParams,
    **kwargs,
) -> None:
    if "::" in name:
        raise ValueError("Name cannot contain '::' character.")
    self._name = name

    self._description = description
    self._reader_params = reader_params
    self._data = _typing.UnsetData
    self._parent: Optional["DataStream"] = None

name property

name: str

Get the name of the data stream.

Returns:

Name Type Description
str str

Name identifier of the data stream.

resolved_name property

resolved_name: str

Get the full hierarchical name of the data stream.

Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.

Returns:

Name Type Description
str str

The fully resolved name including all parent names.

description property

description: Optional[str]

Get the description of the data stream.

Returns:

Type Description
Optional[str]

Optional[str]: Description of the data stream, or None if not provided.

parent property

parent: Optional[DataStream]

Get the parent data stream.

Returns:

Type Description
Optional[DataStream]

Optional[DataStream]: Parent data stream, or None if this is a root stream.

is_collection property

is_collection: bool

Check if this data stream is a collection of other streams.

Returns:

Name Type Description
bool bool

True if this is a collection stream, False otherwise.

reader_params property

reader_params: TReaderParams

Get the parameters for the data reader.

Returns:

Name Type Description
TReaderParams TReaderParams

Parameters for the data reader.

at property

at: _AtProtocol

Get a child data stream by name.

Parameters:

Name Type Description Default
name

Name of the child data stream to retrieve.

required

Returns:

Name Type Description
DataStream _AtProtocol

The child data stream with the given name.

Raises:

Type Description
NotImplementedError

If the data stream does not support child access.

Examples:

# Access stream in a collection
collection = data_collection.load()
temp_stream = collection.at("temperature")

# Or using dictionary-style syntax
humidity_stream = collection["humidity"]

has_data property

has_data: bool

Check if the data stream has loaded data.

Returns:

Name Type Description
bool bool

True if data has been loaded, False otherwise.

has_error property

has_error: bool

Check if the data stream encountered an error during loading.

Returns:

Name Type Description
bool bool

True if an error occurred, False otherwise.

data property

data: TData

Get the loaded data.

Returns:

Name Type Description
TData TData

The loaded data.

Raises:

Type Description
ValueError

If data has not been loaded yet.

set_parent

set_parent(parent: DataStream) -> None

Set the parent data stream.

Parameters:

Name Type Description Default
parent DataStream

The parent data stream to set.

required
Source code in src/contraqctor/contract/base.py (lines 164-170)
def set_parent(self, parent: "DataStream") -> None:
    """Set the parent data stream.

    Args:
        parent: The parent data stream to set.
    """
    self._parent = parent

read

read(
    reader_params: Optional[TReaderParams] = None,
) -> TData

Read data using the configured reader.

Parameters:

Name Type Description Default
reader_params Optional[TReaderParams]

Optional parameters to override the default reader parameters.

None

Returns:

Name Type Description
TData TData

Data read from the source.

Raises:

Type Description
ValueError

If reader parameters are not set.

Source code in src/contraqctor/contract/base.py (lines 194-209)
def read(self, reader_params: Optional[_typing.TReaderParams] = None) -> _typing.TData:
    """Read data using the configured reader.

    Args:
        reader_params: Optional parameters to override the default reader parameters.

    Returns:
        TData: Data read from the source.

    Raises:
        ValueError: If reader parameters are not set.
    """
    reader_params = reader_params if reader_params is not None else self._reader_params
    if _typing.is_unset(reader_params):
        raise ValueError("Reader parameters are not set. Cannot read data.")
    return self._reader(reader_params)

bind_reader_params

bind_reader_params(params: TReaderParams) -> Self

Bind reader parameters to the data stream.

Parameters:

Name Type Description Default
params TReaderParams

Parameters to bind to the data stream's reader.

required

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Raises:

Type Description
ValueError

If reader parameters have already been set.

Source code in src/contraqctor/contract/base.py (lines 211-226)
def bind_reader_params(self, params: _typing.TReaderParams) -> Self:
    """Bind reader parameters to the data stream.

    Args:
        params: Parameters to bind to the data stream's reader.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        ValueError: If reader parameters have already been set.
    """
    if not _typing.is_unset(self._reader_params):
        raise ValueError("Reader parameters are already set. Cannot bind again.")
    self._reader_params = params
    return self

clear

clear() -> Self

Clear the loaded data from the data stream.

Resets the data to an unset state, allowing for reloading.

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Source code in src/contraqctor/contract/base.py (lines 313-322)
def clear(self) -> Self:
    """Clear the loaded data from the data stream.

    Resets the data to an unset state, allowing for reloading.

    Returns:
        Self: The data stream instance for method chaining.
    """
    self._data = _typing.UnsetData
    return self

load

load() -> Self

Load data into the data stream.

Reads data from the source and stores it in the data stream.

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Examples:

from contraqctor.contract import csv

# Create and load a CSV stream
params = csv.CsvParams(path="data/measurements.csv")
csv_stream = csv.Csv("measurements", reader_params=params)
csv_stream.load()

# Access the data
df = csv_stream.data
print(f"Loaded {len(df)} rows")
Source code in src/contraqctor/contract/base.py (lines 324-350)
def load(self) -> Self:
    """Load data into the data stream.

    Reads data from the source and stores it in the data stream.

    Returns:
        Self: The data stream instance for method chaining.

    Examples:
        ```python
        from contraqctor.contract import csv

        # Create and load a CSV stream
        params = csv.CsvParams(path="data/measurements.csv")
        csv_stream = csv.Csv("measurements", reader_params=params)
        csv_stream.load()

        # Access the data
        df = csv_stream.data
        print(f"Loaded {len(df)} rows")
        ```
    """
    try:
        self._data = self.read()
    except Exception as e:  # pylint: disable=broad-except
        self._data = _typing.ErrorOnLoad(self, exception=e)
    return self

collect_errors

collect_errors() -> List[ErrorOnLoad]

Collect all errors from this stream and its children.

Performs a depth-first traversal to gather all ErrorOnLoad instances.

Returns:

Type Description
List[ErrorOnLoad]

List[ErrorOnLoad]: List of all errors raised on load encountered in the hierarchy.

Source code in src/contraqctor/contract/base.py (lines 379-394)
def collect_errors(self) -> List[_typing.ErrorOnLoad]:
    """Collect all errors from this stream and its children.

    Performs a depth-first traversal to gather all ErrorOnLoad instances.

    Returns:
        List[ErrorOnLoad]: List of all errors raised on load encountered in the hierarchy.
    """
    errors = []
    if self.has_error:
        errors.append(cast(_typing.ErrorOnLoad, self._data))
    for stream in self:
        if stream is None:
            continue
        errors.extend(stream.collect_errors())
    return errors

load_all

load_all(strict: bool = False) -> Self

Recursively load this data stream and all child streams.

Performs depth-first traversal to load all streams in the hierarchy.

Parameters:

Name Type Description Default
strict bool

If True, raises exceptions immediately; otherwise collects and returns them.

False

Returns:

Name Type Description
Self Self

The data stream instance for method chaining. Load errors are stored on the failing streams and can be gathered with collect_errors().

Raises:

Type Description
Exception

If strict is True and an exception occurs during loading.

Examples:

# Load all streams, then inspect any errors
collection.load_all(strict=False)

for error in collection.collect_errors():
    print(f"Error loading: {error}")
Source code in src/contraqctor/contract/base.py (lines 396-427)
def load_all(self, strict: bool = False) -> Self:
    """Recursively load this data stream and all child streams.

    Performs depth-first traversal to load all streams in the hierarchy.

    Args:
        strict: If True, raises exceptions immediately; otherwise collects and returns them.

    Returns:
        Self: The data stream instance for method chaining. Load errors are
            stored on the failing streams and can be gathered with
            collect_errors().

    Raises:
        Exception: If strict is True and an exception occurs during loading.

    Examples:
        ```python
        # Load all streams, then inspect any errors
        collection.load_all(strict=False)

        for error in collection.collect_errors():
            print(f"Error loading: {error}")
        ```
    """
    self.load()
    for stream in self:
        if stream is None:
            continue
        stream.load_all(strict=strict)
        if stream.has_error and strict:
            cast(_typing.ErrorOnLoad, stream.data).raise_from_error()
    return self

DataStreamCollection

DataStreamCollection(
    name: str,
    data_streams: List[DataStream],
    *,
    description: Optional[str] = None,
)

Bases: DataStreamCollectionBase[DataStream, UnsetParamsType]

Collection of data streams with direct initialization.

A specialized collection where child streams are passed directly instead of being created by a reader function.

Parameters:

Name Type Description Default
name str

Name identifier for the collection.

required
data_streams List[DataStream]

List of child data streams to include.

required
description Optional[str]

Optional description of the collection.

None

Examples:

from contraqctor.contract import csv, text, DataStreamCollection

# Create streams
text_stream = text.Text("readme", reader_params=text.TextParams(path="README.md"))
csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))

# Create the collection
collection = DataStreamCollection("project_files", [text_stream, csv_stream])

# Load and use
collection.load_all()
readme_content = collection["readme"].data

Initializes a special DataStreamCollection whose data streams are passed directly, without a reader.

Source code in src/contraqctor/contract/base.py (lines 644-658)
@override
def __init__(
    self,
    name: str,
    data_streams: List[DataStream],
    *,
    description: Optional[str] = None,
) -> None:
    """Initializes a special DataStreamGroup where the data streams are passed directly, without a reader."""
    super().__init__(
        name=name,
        description=description,
        reader_params=_typing.UnsetParams,
    )
    self.bind_data_streams(data_streams)

name property

name: str

Get the name of the data stream.

Returns:

Name Type Description
str str

Name identifier of the data stream.

resolved_name property

resolved_name: str

Get the full hierarchical name of the data stream.

Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.

Returns:

Name Type Description
str str

The fully resolved name including all parent names.

description property

description: Optional[str]

Get the description of the data stream.

Returns:

Type Description
Optional[str]

Optional[str]: Description of the data stream, or None if not provided.

parent property

parent: Optional[DataStream]

Get the parent data stream.

Returns:

Type Description
Optional[DataStream]

Optional[DataStream]: Parent data stream, or None if this is a root stream.

is_collection property

is_collection: bool

Check if this data stream is a collection of other streams.

Returns:

Name Type Description
bool bool

True if this is a collection stream, False otherwise.

reader_params property

reader_params: TReaderParams

Get the parameters for the data reader.

Returns:

Name Type Description
TReaderParams TReaderParams

Parameters for the data reader.

at property

at: _At[TDataStream]

Get the accessor for child data streams.

Returns:

Name Type Description
_At _At[TDataStream]

Accessor object for retrieving child streams by name.

has_data property

has_data: bool

Check if the data stream has loaded data.

Returns:

Name Type Description
bool bool

True if data has been loaded, False otherwise.

has_error property

has_error: bool

Check if the data stream encountered an error during loading.

Returns:

Name Type Description
bool bool

True if an error occurred, False otherwise.

data property

data: TData

Get the loaded data.

Returns:

Name Type Description
TData TData

The loaded data.

Raises:

Type Description
ValueError

If data has not been loaded yet.

set_parent

set_parent(parent: DataStream) -> None

Set the parent data stream.

Parameters:

Name Type Description Default
parent DataStream

The parent data stream to set.

required
Source code in src/contraqctor/contract/base.py
164
165
166
167
168
169
170
def set_parent(self, parent: "DataStream") -> None:
    """Set the parent data stream.

    Args:
        parent: The parent data stream to set.
    """
    self._parent = parent

bind_reader_params

bind_reader_params(params: TReaderParams) -> Self

Bind reader parameters to the data stream.

Parameters:

Name Type Description Default
params TReaderParams

Parameters to bind to the data stream's reader.

required

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Raises:

Type Description
ValueError

If reader parameters have already been set.

Source code in src/contraqctor/contract/base.py
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
def bind_reader_params(self, params: _typing.TReaderParams) -> Self:
    """Bind reader parameters to the data stream.

    Args:
        params: Parameters to bind to the data stream's reader.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        ValueError: If reader parameters have already been set.
    """
    if not _typing.is_unset(self._reader_params):
        raise ValueError("Reader parameters are already set. Cannot bind again.")
    self._reader_params = params
    return self
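The bind-once behavior relies on an "unset" sentinel rather than `None`, so that `None` remains a valid parameter value. A minimal sketch of the same pattern — the `Stream` class and `_UNSET` sentinel here are illustrative, not contraqctor APIs:

```python
class _Unset:
    """Sentinel type marking parameters as 'not yet bound'."""


_UNSET = _Unset()


class Stream:
    def __init__(self) -> None:
        self._params = _UNSET

    def bind_params(self, params: dict) -> "Stream":
        # Binding is one-shot: rebinding a configured stream is an error.
        if self._params is not _UNSET:
            raise ValueError("Reader parameters are already set. Cannot bind again.")
        self._params = params
        return self  # returning self enables method chaining


stream = Stream().bind_params({"path": "data.csv"})
print(stream._params)  # {'path': 'data.csv'}
```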

clear

clear() -> Self

Clear the loaded data from the data stream.

Resets the data to an unset state, allowing for reloading.

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Source code in src/contraqctor/contract/base.py
313
314
315
316
317
318
319
320
321
322
def clear(self) -> Self:
    """Clear the loaded data from the data stream.

    Resets the data to an unset state, allowing for reloading.

    Returns:
        Self: The data stream instance for method chaining.
    """
    self._data = _typing.UnsetData
    return self

load

load() -> Self

Load data for this collection.

Overrides the base method to add validation that loaded data is a list of DataStreams.

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
ValueError

If loaded data is not a list of DataStreams.

Source code in src/contraqctor/contract/base.py
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
@override
def load(self) -> Self:
    """Load data for this collection.

    Overrides the base method to add validation that loaded data is a list of DataStreams.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        ValueError: If loaded data is not a list of DataStreams.
    """
    super().load()
    if not isinstance(self._data, list):
        self._data = _typing.UnsetData
        raise ValueError("Data must be a list of DataStreams.")
    self._update_data_stream_mapping()
    return self

collect_errors

collect_errors() -> List[ErrorOnLoad]

Collect all errors from this stream and its children.

Performs a depth-first traversal to gather all ErrorOnLoad instances.

Returns:

Type Description
List[ErrorOnLoad]

List[ErrorOnLoad]: List of all load errors encountered in the hierarchy.

Source code in src/contraqctor/contract/base.py
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
def collect_errors(self) -> List[_typing.ErrorOnLoad]:
    """Collect all errors from this stream and its children.

    Performs a depth-first traversal to gather all ErrorOnLoad instances.

    Returns:
        List[ErrorOnLoad]: List of all load errors encountered in the hierarchy.
    """
    errors = []
    if self.has_error:
        errors.append(cast(_typing.ErrorOnLoad, self._data))
    for stream in self:
        if stream is None:
            continue
        errors.extend(stream.collect_errors())
    return errors

load_all

load_all(strict: bool = False) -> Self

Recursively load this data stream and all child streams.

Performs depth-first traversal to load all streams in the hierarchy.

Parameters:

Name Type Description Default
strict bool

If True, raises exceptions immediately; otherwise collects and returns them.

False

Returns:

Name Type Description
Self Self

The data stream instance for method chaining. Errors gathered in non-strict mode can be retrieved with collect_errors().

Raises:

Type Description
Exception

If strict is True and an exception occurs during loading.

Examples:

# Load all streams without raising; errors are recorded per stream
collection.load_all(strict=False)

for error in collection.collect_errors():
    print(f"Error on load: {error}")
Source code in src/contraqctor/contract/base.py
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
def load_all(self, strict: bool = False) -> Self:
    """Recursively load this data stream and all child streams.

    Performs depth-first traversal to load all streams in the hierarchy.

    Args:
        strict: If True, raises exceptions immediately; otherwise collects and returns them.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        Exception: If strict is True and an exception occurs during loading.

    Examples:
        ```python
        # Load all streams without raising; errors are recorded per stream
        collection.load_all(strict=False)

        for error in collection.collect_errors():
            print(f"Error on load: {error}")
        ```
    """
    self.load()
    for stream in self:
        if stream is None:
            continue
        stream.load_all(strict=strict)
        if stream.has_error and strict:
            cast(_typing.ErrorOnLoad, stream.data).raise_from_error()
    return self
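The strict/non-strict contract can be illustrated with a toy hierarchy. `ToyStream` is a hypothetical stand-in for a data stream, not a contraqctor class:

```python
class ToyStream:
    """Toy node illustrating the load_all(strict=...) contract."""

    def __init__(self, name, children=(), fail=False):
        self.name = name
        self.children = list(children)
        self.fail = fail
        self.error = None

    def load_all(self, strict=False):
        if self.fail:
            self.error = RuntimeError(f"cannot load {self.name}")
            if strict:
                raise self.error  # strict: fail fast
        for child in self.children:
            child.load_all(strict=strict)
        return self  # method chaining, like the real API

    def collect_errors(self):
        # Depth-first collection of errors recorded during loading.
        errors = [self.error] if self.error else []
        for child in self.children:
            errors.extend(child.collect_errors())
        return errors


root = ToyStream("root", children=[ToyStream("ok"), ToyStream("broken", fail=True)])
root.load_all(strict=False)  # does not raise; errors are recorded per node
print([str(e) for e in root.collect_errors()])  # ['cannot load broken']
```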

iter_all

iter_all() -> Generator[DataStream, None, None]

Iterator for all child data streams, including nested collections.

Implements a depth-first traversal of the stream hierarchy.

Yields:

Name Type Description
DataStream DataStream

Each child data stream, visited depth-first.

Source code in src/contraqctor/contract/base.py
601
602
603
604
605
606
607
608
609
610
611
612
613
def iter_all(self) -> Generator[DataStream, None, None]:
    """Iterator for all child data streams, including nested collections.

    Implements a depth-first traversal of the stream hierarchy.

    Yields:
        DataStream: Each child data stream, visited depth-first.
    """
    for value in self:
        if isinstance(value, DataStream):
            yield value
        if isinstance(value, DataStreamCollectionBase):
            yield from value.iter_all()
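The depth-first traversal that iter_all performs can be sketched with a toy `Group` class (illustrative only; leaves here are plain strings rather than streams):

```python
from typing import Iterator, List, Union


class Group:
    """Toy collection node; leaves are plain strings."""

    def __init__(self, name: str, children: List) -> None:
        self.name = name
        self.children = children

    def iter_all(self) -> Iterator[Union[str, "Group"]]:
        # Depth-first: yield each child, then recurse into sub-groups.
        for child in self.children:
            yield child
            if isinstance(child, Group):
                yield from child.iter_all()


tree = Group("root", ["a", Group("inner", ["b", "c"]), "d"])
names = [c.name if isinstance(c, Group) else c for c in tree.iter_all()]
print(names)  # ['a', 'inner', 'b', 'c', 'd']
```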

parameters staticmethod

parameters(*args, **kwargs) -> UnsetParamsType

Parameters function to return UnsetParams.

Returns:

Name Type Description
UnsetParamsType UnsetParamsType

Special unset parameters value.

Source code in src/contraqctor/contract/base.py
660
661
662
663
664
665
666
667
@staticmethod
def parameters(*args, **kwargs) -> _typing.UnsetParamsType:
    """Parameters function to return UnsetParams.

    Returns:
        UnsetParamsType: Special unset parameters value.
    """
    return _typing.UnsetParams

read

read(*args, **kwargs) -> List[DataStream]

Read data from the collection.

Returns:

Type Description
List[DataStream]

List[DataStream]: The pre-set data streams.

Raises:

Type Description
ValueError

If data streams have not been set yet.

Source code in src/contraqctor/contract/base.py
677
678
679
680
681
682
683
684
685
686
687
688
689
@override
def read(self, *args, **kwargs) -> List[DataStream]:
    """Read data from the collection.

    Returns:
        List[DataStream]: The pre-set data streams.

    Raises:
        ValueError: If data streams have not been set yet.
    """
    if not self.has_data:
        raise ValueError("Data streams have not been read yet.")
    return self._data

bind_data_streams

bind_data_streams(data_streams: List[DataStream]) -> Self

Bind a list of data streams to the collection.

Parameters:

Name Type Description Default
data_streams List[DataStream]

List of data streams to include in the collection.

required

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
ValueError

If data streams have already been set.

Source code in src/contraqctor/contract/base.py
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
def bind_data_streams(self, data_streams: List[DataStream]) -> Self:
    """Bind a list of data streams to the collection.

    Args:
        data_streams: List of data streams to include in the collection.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        ValueError: If data streams have already been set.
    """
    if self.has_data:
        raise ValueError("Data streams are already set. Cannot bind again.")
    self._data = data_streams
    self._update_data_stream_mapping()
    return self

add_stream

add_stream(stream: DataStream) -> Self

Add a new data stream to the collection.

Parameters:

Name Type Description Default
stream DataStream

Data stream to add to the collection.

required

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
KeyError

If a stream with the same name already exists.

Examples:

from contraqctor.contract import json, DataStreamCollection

# Create an empty collection
collection = DataStreamCollection("api_data", [])

# Add streams
collection.add_stream(
    json.Json("config", reader_params=json.JsonParams(path="config.json"))
)

# Load the data
collection.load_all()
Source code in src/contraqctor/contract/base.py
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
def add_stream(self, stream: DataStream) -> Self:
    """Add a new data stream to the collection.

    Args:
        stream: Data stream to add to the collection.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        KeyError: If a stream with the same name already exists.

    Examples:
        ```python
        from contraqctor.contract import json, DataStreamCollection

        # Create an empty collection
        collection = DataStreamCollection("api_data", [])

        # Add streams
        collection.add_stream(
            json.Json("config", reader_params=json.JsonParams(path="config.json"))
        )

        # Load the data
        collection.load_all()
        ```
    """
    if not self.has_data:
        self._data = [stream]
        self._update_data_stream_mapping()
        return self

    if stream.name in self._data_stream_mapping:
        raise KeyError(f"Stream with name: '{stream.name}' already exists in data streams.")

    self._data.append(stream)
    self._update_data_stream_mapping()
    return self

remove_stream

remove_stream(name: str) -> None

Remove a data stream from the collection.

Parameters:

Name Type Description Default
name str

Name of the data stream to remove.

required

Raises:

Type Description
ValueError

If data streams have not been set yet.

KeyError

If no stream with the given name exists.

Source code in src/contraqctor/contract/base.py
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
def remove_stream(self, name: str) -> None:
    """Remove a data stream from the collection.

    Args:
        name: Name of the data stream to remove.

    Raises:
        ValueError: If data streams have not been set yet.
        KeyError: If no stream with the given name exists.
    """
    if not self.has_data:
        raise ValueError("Data streams have not been read yet. Cannot access data streams.")

    if name not in self._data_stream_mapping:
        raise KeyError(f"Data stream with name '{name}' not found in data streams.")
    self._data.remove(self._data_stream_mapping[name])
    self._update_data_stream_mapping()
    return

from_data_stream classmethod

from_data_stream(data_stream: DataStream) -> Self

Create a DataStreamCollection from a DataStream object.

Factory method to convert a single data stream or collection into a DataStreamCollection.

Parameters:

Name Type Description Default
data_stream DataStream

Source data stream to convert.

required

Returns:

Name Type Description
DataStreamCollection Self

New collection containing the source stream's data.

Raises:

Type Description
TypeError

If the source is not a DataStream.

ValueError

If the source has not been loaded yet.

Source code in src/contraqctor/contract/base.py
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
@classmethod
def from_data_stream(cls, data_stream: DataStream) -> Self:
    """Create a DataStreamCollection from a DataStream object.

    Factory method to convert a single data stream or collection into a DataStreamCollection.

    Args:
        data_stream: Source data stream to convert.

    Returns:
        DataStreamCollection: New collection containing the source stream's data.

    Raises:
        TypeError: If the source is not a DataStream.
        ValueError: If the source has not been loaded yet.
    """
    if not isinstance(data_stream, DataStream):
        raise TypeError("data_stream must be an instance of DataStream.")
    if not data_stream.has_data:
        raise ValueError("DataStream has not been loaded yet. Cannot create DataStreamCollection.")
    # A collection contributes its child streams; a single stream is wrapped as-is.
    data = data_stream._data if data_stream.is_collection else [data_stream]
    return cls(name=data_stream.name, data_streams=data, description=data_stream.description)

DataStreamCollectionBase

DataStreamCollectionBase(
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: Optional[TReaderParams] = None,
    **kwargs,
)

Bases: DataStream[List[TDataStream], TReaderParams], Generic[TDataStream, TReaderParams]

Base class for collections of data streams.

Provides functionality for managing and accessing multiple child data streams.

Parameters:

Name Type Description Default
name str

Name identifier for the collection.

required
description Optional[str]

Optional description of the collection.

None
reader_params Optional[TReaderParams]

Optional parameters for the reader.

None
**kwargs

Additional keyword arguments.

{}
Source code in src/contraqctor/contract/base.py
488
489
490
491
492
493
494
495
496
497
498
499
def __init__(
    self: Self,
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: Optional[_typing.TReaderParams] = None,
    **kwargs,
) -> None:
    super().__init__(name=name, description=description, reader_params=reader_params, **kwargs)
    self._data_stream_mapping: Dict[str, TDataStream] = {}
    self._update_data_stream_mapping()
    self._at = _At(self)

name property

name: str

Get the name of the data stream.

Returns:

Name Type Description
str str

Name identifier of the data stream.

resolved_name property

resolved_name: str

Get the full hierarchical name of the data stream.

Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.

Returns:

Name Type Description
str str

The fully resolved name including all parent names.

description property

description: Optional[str]

Get the description of the data stream.

Returns:

Type Description
Optional[str]

Optional[str]: Description of the data stream, or None if not provided.

parent property

Get the parent data stream.

Returns:

Type Description
Optional[DataStream]

Optional[DataStream]: Parent data stream, or None if this is a root stream.

is_collection property

is_collection: bool

Check if this data stream is a collection of other streams.

Returns:

Name Type Description
bool bool

True if this is a collection stream, False otherwise.

reader_params property

reader_params: TReaderParams

Get the parameters for the data reader.

Returns:

Name Type Description
TReaderParams TReaderParams

Parameters for the data reader.

has_data property

has_data: bool

Check if the data stream has loaded data.

Returns:

Name Type Description
bool bool

True if data has been loaded, False otherwise.

has_error property

has_error: bool

Check if the data stream encountered an error during loading.

Returns:

Name Type Description
bool bool

True if an error occurred, False otherwise.

data property

data: TData

Get the loaded data.

Returns:

Name Type Description
TData TData

The loaded data.

Raises:

Type Description
ValueError

If data has not been loaded yet.

at property

at: _At[TDataStream]

Get the accessor for child data streams.

Returns:

Name Type Description
_At _At[TDataStream]

Accessor object for retrieving child streams by name.

set_parent

set_parent(parent: DataStream) -> None

Set the parent data stream.

Parameters:

Name Type Description Default
parent DataStream

The parent data stream to set.

required
Source code in src/contraqctor/contract/base.py
164
165
166
167
168
169
170
def set_parent(self, parent: "DataStream") -> None:
    """Set the parent data stream.

    Args:
        parent: The parent data stream to set.
    """
    self._parent = parent

read

read(
    reader_params: Optional[TReaderParams] = None,
) -> TData

Read data using the configured reader.

Parameters:

Name Type Description Default
reader_params Optional[TReaderParams]

Optional parameters to override the default reader parameters.

None

Returns:

Name Type Description
TData TData

Data read from the source.

Raises:

Type Description
ValueError

If reader parameters are not set.

Source code in src/contraqctor/contract/base.py
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
def read(self, reader_params: Optional[_typing.TReaderParams] = None) -> _typing.TData:
    """Read data using the configured reader.

    Args:
        reader_params: Optional parameters to override the default reader parameters.

    Returns:
        TData: Data read from the source.

    Raises:
        ValueError: If reader parameters are not set.
    """
    reader_params = reader_params if reader_params is not None else self._reader_params
    if _typing.is_unset(reader_params):
        raise ValueError("Reader parameters are not set. Cannot read data.")
    return self._reader(reader_params)

bind_reader_params

bind_reader_params(params: TReaderParams) -> Self

Bind reader parameters to the data stream.

Parameters:

Name Type Description Default
params TReaderParams

Parameters to bind to the data stream's reader.

required

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Raises:

Type Description
ValueError

If reader parameters have already been set.

Source code in src/contraqctor/contract/base.py
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
def bind_reader_params(self, params: _typing.TReaderParams) -> Self:
    """Bind reader parameters to the data stream.

    Args:
        params: Parameters to bind to the data stream's reader.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        ValueError: If reader parameters have already been set.
    """
    if not _typing.is_unset(self._reader_params):
        raise ValueError("Reader parameters are already set. Cannot bind again.")
    self._reader_params = params
    return self

clear

clear() -> Self

Clear the loaded data from the data stream.

Resets the data to an unset state, allowing for reloading.

Returns:

Name Type Description
Self Self

The data stream instance for method chaining.

Source code in src/contraqctor/contract/base.py
313
314
315
316
317
318
319
320
321
322
def clear(self) -> Self:
    """Clear the loaded data from the data stream.

    Resets the data to an unset state, allowing for reloading.

    Returns:
        Self: The data stream instance for method chaining.
    """
    self._data = _typing.UnsetData
    return self

collect_errors

collect_errors() -> List[ErrorOnLoad]

Collect all errors from this stream and its children.

Performs a depth-first traversal to gather all ErrorOnLoad instances.

Returns:

Type Description
List[ErrorOnLoad]

List[ErrorOnLoad]: List of all load errors encountered in the hierarchy.

Source code in src/contraqctor/contract/base.py
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
def collect_errors(self) -> List[_typing.ErrorOnLoad]:
    """Collect all errors from this stream and its children.

    Performs a depth-first traversal to gather all ErrorOnLoad instances.

    Returns:
        List[ErrorOnLoad]: List of all load errors encountered in the hierarchy.
    """
    errors = []
    if self.has_error:
        errors.append(cast(_typing.ErrorOnLoad, self._data))
    for stream in self:
        if stream is None:
            continue
        errors.extend(stream.collect_errors())
    return errors

load_all

load_all(strict: bool = False) -> Self

Recursively load this data stream and all child streams.

Performs depth-first traversal to load all streams in the hierarchy.

Parameters:

Name Type Description Default
strict bool

If True, raises exceptions immediately; otherwise collects and returns them.

False

Returns:

Name Type Description
Self Self

The data stream instance for method chaining. Errors gathered in non-strict mode can be retrieved with collect_errors().

Raises:

Type Description
Exception

If strict is True and an exception occurs during loading.

Examples:

# Load all streams without raising; errors are recorded per stream
collection.load_all(strict=False)

for error in collection.collect_errors():
    print(f"Error on load: {error}")
Source code in src/contraqctor/contract/base.py
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
def load_all(self, strict: bool = False) -> Self:
    """Recursively load this data stream and all child streams.

    Performs depth-first traversal to load all streams in the hierarchy.

    Args:
        strict: If True, raises exceptions immediately; otherwise collects and returns them.

    Returns:
        Self: The data stream instance for method chaining.

    Raises:
        Exception: If strict is True and an exception occurs during loading.

    Examples:
        ```python
        # Load all streams without raising; errors are recorded per stream
        collection.load_all(strict=False)

        for error in collection.collect_errors():
            print(f"Error on load: {error}")
        ```
    """
    self.load()
    for stream in self:
        if stream is None:
            continue
        stream.load_all(strict=strict)
        if stream.has_error and strict:
            cast(_typing.ErrorOnLoad, stream.data).raise_from_error()
    return self

load

load() -> Self

Load data for this collection.

Overrides the base method to add validation that loaded data is a list of DataStreams.

Returns:

Name Type Description
Self Self

The collection instance for method chaining.

Raises:

Type Description
ValueError

If loaded data is not a list of DataStreams.

Source code in src/contraqctor/contract/base.py
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
@override
def load(self) -> Self:
    """Load data for this collection.

    Overrides the base method to add validation that loaded data is a list of DataStreams.

    Returns:
        Self: The collection instance for method chaining.

    Raises:
        ValueError: If loaded data is not a list of DataStreams.
    """
    super().load()
    if not isinstance(self._data, list):
        self._data = _typing.UnsetData
        raise ValueError("Data must be a list of DataStreams.")
    self._update_data_stream_mapping()
    return self

iter_all

iter_all() -> Generator[DataStream, None, None]

Iterator for all child data streams, including nested collections.

Implements a depth-first traversal of the stream hierarchy.

Yields:

Name Type Description
DataStream DataStream

Each child data stream, visited depth-first.

Source code in src/contraqctor/contract/base.py
601
602
603
604
605
606
607
608
609
610
611
612
613
def iter_all(self) -> Generator[DataStream, None, None]:
    """Iterator for all child data streams, including nested collections.

    Implements a depth-first traversal of the stream hierarchy.

    Yields:
        DataStream: Each child data stream, visited depth-first.
    """
    for value in self:
        if isinstance(value, DataStream):
            yield value
        if isinstance(value, DataStreamCollectionBase):
            yield from value.iter_all()

FilePathBaseParam dataclass

FilePathBaseParam(path: PathLike)

Bases: ABC

Abstract base class for file-based reader parameters.

Base parameter class for readers that access files by path.

Attributes:

Name Type Description
path PathLike

Path to the file or directory to read from.

implicit_loading

implicit_loading(value: bool = True)

Context manager to control whether streams automatically load data on access.

When enabled, data streams automatically load their data on access. When disabled, accessing a data stream without prior loading raises an error; call load() explicitly instead.

Parameters:

Name Type Description Default
value bool

True to enable auto-loading, False to disable. Default is True.

True

Examples:

# Assume you have nested collections already created
# collection.at("sensors").at("temperature") -> temperature sensor data
# collection.at("sensors").at("humidity") -> humidity sensor data
# collection.at("logs").at("error_log") -> error log file

# With implicit loading enabled (default behavior)
with implicit_loading(True):
    # Data loads automatically on access
    temp_data = collection.at("sensors").at("temperature").data
    humidity_data = collection.at("sensors").at("humidity").data

# With implicit loading disabled - requires explicit loading
with implicit_loading(False):
    # This would raise ValueError: "Data has not been loaded yet"
    try:
        temp_data = collection.at("sensors").at("temperature").data
    except ValueError:
        # Must load explicitly first
        collection.load_all()
        temp_data = collection.at("sensors").at("temperature").data
Source code in src/contraqctor/contract/base.py
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
@contextmanager
def implicit_loading(value: bool = True):
    """Context manager to control whether streams automatically load data on access.

    When enabled, data streams automatically load their data on access. When disabled,
    accessing a data stream without prior loading raises an error; call `load()`
    explicitly instead.

    Args:
        value: True to enable auto-loading, False to disable. Default is True.

    Examples:
        ```python
        # Assume you have nested collections already created
        # collection.at("sensors").at("temperature") -> temperature sensor data
        # collection.at("sensors").at("humidity") -> humidity sensor data
        # collection.at("logs").at("error_log") -> error log file

        # With implicit loading enabled (default behavior)
        with implicit_loading(True):
            # Data loads automatically on access
            temp_data = collection.at("sensors").at("temperature").data
            humidity_data = collection.at("sensors").at("humidity").data

        # With implicit loading disabled - requires explicit loading
        with implicit_loading(False):
            # This would raise ValueError: "Data has not been loaded yet"
            try:
                temp_data = collection.at("sensors").at("temperature").data
            except ValueError:
                # Must load explicitly first
                collection.load_all()
                temp_data = collection.at("sensors").at("temperature").data
        ```
    """
    token = _implicit_loading.set(value)
    try:
        yield
    finally:
        _implicit_loading.reset(token)
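The context manager above is built on `contextvars.ContextVar`, which makes the flag safe to nest and async-friendly: the token returned by `set()` restores the previous value on exit. A self-contained sketch of the same pattern (the names `_auto_load` and `auto_loading` are illustrative, not contraqctor's internals):

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Module-level flag, analogous to the internal _implicit_loading variable.
_auto_load: ContextVar[bool] = ContextVar("_auto_load", default=True)


@contextmanager
def auto_loading(value: bool = True):
    """Temporarily override the flag; the token restores the prior value on exit."""
    token = _auto_load.set(value)
    try:
        yield
    finally:
        _auto_load.reset(token)


with auto_loading(False):
    inside = _auto_load.get()  # False while the context is active
after = _auto_load.get()  # back to the default once the context exits
print(inside, after)  # False True
```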

print_data_stream_tree

print_data_stream_tree(
    node: DataStream,
    prefix: str = "",
    is_last: bool = True,
    parents: list[bool] = [],
    show_params: bool = False,
    show_type: bool = False,
    show_missing_indicator: bool = True,
) -> str

Generates a tree representation of a data stream hierarchy.

Creates a formatted string displaying the hierarchical structure of a data stream and its children as a tree with branch indicators and icons.

Parameters:

Name Type Description Default
node DataStream

The data stream node to start printing from.

required
prefix str

Prefix string to prepend to each line, used for indentation.

''
is_last bool

Whether this node is the last child of its parent.

True
parents list[bool]

List tracking whether each ancestor was a last child, used for drawing branches.

[]
show_params bool

Whether to render parameters of the datastream.

False
show_type bool

Whether to render the class name of the datastream.

False
show_missing_indicator bool

Whether to render the missing data indicator.

True

Returns:

Name Type Description
str str

A formatted string representing the data stream tree.

Examples:

from contraqctor.contract import Dataset, csv, json
from contraqctor.contract.utils import print_data_stream_tree

csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))
json_stream = json.Json("config", reader_params=json.JsonParams(path="config.json"))
dataset = Dataset("experiment", [csv_stream, json_stream], version="1.0.0")

tree = print_data_stream_tree(dataset)
print(tree)
Source code in src/contraqctor/contract/utils.py
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
def print_data_stream_tree(
    node: DataStream,
    prefix: str = "",
    is_last: bool = True,
    parents: list[bool] = [],
    show_params: bool = False,
    show_type: bool = False,
    show_missing_indicator: bool = True,
) -> str:
    """Generates a tree representation of a data stream hierarchy.

    Creates a formatted string displaying the hierarchical structure of a data stream
    and its children as a tree with branch indicators and icons.

    Args:
        node: The data stream node to start printing from.
        prefix: Prefix string to prepend to each line, used for indentation.
        is_last: Whether this node is the last child of its parent.
        parents: List tracking whether each ancestor was a last child, used for drawing branches.
        show_params: Whether to render parameters of the datastream.
        show_type: Whether to render the class name of the datastream.
        show_missing_indicator: Whether to render the missing data indicator.

    Returns:
        str: A formatted string representing the data stream tree.

    Examples:
        ```python
        from contraqctor.contract import Dataset, csv, json
        from contraqctor.contract.utils import print_data_stream_tree

        csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))
        json_stream = json.Json("config", reader_params=json.JsonParams(path="config.json"))
        dataset = Dataset("experiment", [csv_stream, json_stream], version="1.0.0")

        tree = print_data_stream_tree(dataset)
        print(tree)
        ```
    """
    node_icon = _get_node_icon(node, show_missing_indicator)
    line_prefix = _build_line_prefix(parents, is_last)
    node_label = _build_node_label(node, show_type, show_params)

    tree_representation = f"{line_prefix}{node_icon} {node_label}\n"

    if node.is_collection and node.has_data:
        for i, child in enumerate(node.data):
            child_is_last = i == len(node.data) - 1
            tree_representation += print_data_stream_tree(
                child,
                prefix="",
                is_last=child_is_last,
                parents=parents + [is_last],
                show_params=show_params,
                show_type=show_type,
                show_missing_indicator=show_missing_indicator,
            )

    return tree_representation
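The branch-drawing logic that `_build_line_prefix` encapsulates follows the usual box-drawing recursion: a last child gets `└──` and blank padding below it, any other child gets `├──` and a vertical bar. A minimal sketch over `(name, children)` tuples (a hypothetical structure, not the DataStream API):

```python
def render_tree(node, prefix: str = "", is_last: bool = True) -> str:
    """Render a (name, children) tuple tree with box-drawing branches."""
    name, children = node
    out = prefix + ("└── " if is_last else "├── ") + name + "\n"
    # Children under a last node get blank padding; otherwise a vertical bar.
    child_prefix = prefix + ("    " if is_last else "│   ")
    for i, child in enumerate(children):
        out += render_tree(child, child_prefix, i == len(children) - 1)
    return out


tree = ("dataset", [("data", []), ("config", [("nested", [])])])
print(render_tree(tree))
# └── dataset
#     ├── data
#     └── config
#         └── nested
```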