contract
Dataset
Dataset(
    name: str,
    data_streams: List[DataStream],
    *,
    version: str | Version = "0.0.0",
    description: Optional[str] = None,
)
Bases: DataStreamCollection
A version-tracked collection of data streams.
Extends DataStreamCollection by adding semantic versioning support.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier for the dataset. | required |
| data_streams | List[DataStream] | List of data streams to include in the dataset. | required |
| version | str \| Version | Semantic version string or Version object. Defaults to "0.0.0". | '0.0.0' |
| description | Optional[str] | Optional description of the dataset. | None |
Examples:
from contraqctor.contract import text, csv, Dataset
# Create streams
text_stream = text.Text("notes", reader_params=text.TextParams(path="notes.txt"))
csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))
# Create a versioned dataset
dataset = Dataset(
    "experiment_results",
    [text_stream, csv_stream],
    version="1.2.3"
)
# Load the dataset
dataset.load_all(strict=True)
# Access streams
txt = dataset["notes"].data
csv_data = dataset["data"].data
print(f"Dataset version: {dataset.version}")
Initializes a Dataset with a version and a list of data streams.
Source code in src/contraqctor/contract/base.py
name property
name: str
Get the name of the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Name identifier of the data stream. |
resolved_name property
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The fully resolved name including all parent names. |
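As an illustration of how a '::'-separated name can be assembled from a parent chain, here is a minimal, self-contained sketch. The `Node` class and the stream names are hypothetical stand-ins and are not part of contraqctor; the library's actual implementation may differ.

```python
from typing import List, Optional

SEPARATOR = "::"  # separator named in the description above


class Node:
    """Hypothetical stand-in for a data stream (not a contraqctor class)."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent

    @property
    def resolved_name(self) -> str:
        # Walk from this node up to the root, collecting names on the way.
        parts: List[str] = []
        node: Optional["Node"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        # Names were collected leaf-first, so reverse before joining.
        return SEPARATOR.join(reversed(parts))


session = Node("session")
camera = Node("camera", parent=session)
frames = Node("frames", parent=camera)

print(frames.resolved_name)   # session::camera::frames
print(session.resolved_name)  # session
```

A root node has no parent, so its resolved name is simply its own name.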
description property
parent property
parent: Optional[DataStream]
Get the parent data stream.
Returns:
| Type | Description |
|---|---|
| Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection property
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if this is a collection stream, False otherwise. |
reader_params property
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
| Name | Type | Description |
|---|---|---|
| TReaderParams | TReaderParams | Parameters for the data reader. |
at property
at: _At[TDataStream]
Get the accessor for child data streams.
Returns:
| Name | Type | Description |
|---|---|---|
| _At | _At[TDataStream] | Accessor object for retrieving child streams by name. |
has_data property
has_data: bool
Check if the data stream has loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data has been loaded, False otherwise. |
has_error property
has_error: bool
Check if the data stream encountered an error during loading.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if an error occurred, False otherwise. |
data property
data: TData
Get the loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | The loaded data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data has not been loaded yet. |
version property
version: Version
Get the semantic version of the dataset.
Returns:
| Name | Type | Description |
|---|---|---|
| Version | Version | Semantic version object. |
set_parent
set_parent(parent: DataStream) -> None
Set the parent data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | DataStream | The parent data stream to set. | required |
Source code in src/contraqctor/contract/base.py
read
read(*args, **kwargs) -> List[DataStream]
Read data from the collection.
Returns:
| Type | Description |
|---|---|
| List[DataStream] | The pre-set data streams. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have not been set yet. |
Source code in src/contraqctor/contract/base.py
bind_reader_params
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | TReaderParams | Parameters to bind to the data stream's reader. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters have already been set. |
Source code in src/contraqctor/contract/base.py
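The bind-once behavior described for bind_reader_params (return the instance for chaining, raise ValueError on a second bind) can be sketched as follows. This is a self-contained illustration under stated assumptions: `Stream`, `_UnsetType`, and `UNSET` are hypothetical names, and the sketch does not reproduce contraqctor's own code.

```python
from typing import Any


class _UnsetType:
    """Sentinel type meaning 'no reader parameters bound yet' (hypothetical)."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _UnsetType()


class Stream:
    """Hypothetical stand-in for a data stream."""

    def __init__(self, name: str, reader_params: Any = UNSET) -> None:
        self.name = name
        self.reader_params = reader_params

    def bind_reader_params(self, params: Any) -> "Stream":
        # Refuse to overwrite parameters that were already bound.
        if self.reader_params is not UNSET:
            raise ValueError(f"Reader parameters already set for {self.name!r}.")
        self.reader_params = params
        return self  # returning self is what allows chaining


stream = Stream("measurements").bind_reader_params({"path": "data.csv"})
print(stream.reader_params)  # {'path': 'data.csv'}

try:
    stream.bind_reader_params({"path": "other.csv"})
except ValueError as exc:
    print(exc)
```

Using a dedicated sentinel rather than None keeps None available as a legitimate parameter value.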
clear
clear() -> Self
Clear the loaded data from the data stream.
Resets the data to an unset state, allowing for reloading.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Source code in src/contraqctor/contract/base.py
load
load() -> Self
Load data for this collection.
Overrides the base method to add validation that loaded data is a list of DataStreams.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If loaded data is not a list of DataStreams. |
Source code in src/contraqctor/contract/base.py
collect_errors
collect_errors() -> List[ErrorOnLoad]
Collect all errors from this stream and its children.
Performs a depth-first traversal to gather all ErrorOnLoad instances.
Returns:
| Type | Description |
|---|---|
| List[ErrorOnLoad] | List of all errors raised on load encountered in the hierarchy. |
Source code in src/contraqctor/contract/base.py
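The depth-first gathering of load errors described above can be illustrated with a small standalone sketch. The `Node` dataclass and the `(name, error)` tuple shape are assumptions made for this example; they are not contraqctor's ErrorOnLoad type or its implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stream node; `error` holds the exception raised on load, if any."""

    name: str
    error: Optional[Exception] = None
    children: List["Node"] = field(default_factory=list)


def collect_errors(node: Node) -> List[Tuple[str, Exception]]:
    """Gather (name, error) pairs from `node` and all of its descendants."""
    found: List[Tuple[str, Exception]] = []
    if node.error is not None:
        found.append((node.name, node.error))
    for child in node.children:
        # Finish each child's subtree before moving to the next sibling.
        found.extend(collect_errors(child))
    return found


tree = Node("root", children=[
    Node("a", error=FileNotFoundError("a.csv"), children=[
        Node("a1", error=ValueError("bad header")),
    ]),
    Node("b"),
])

for name, error in collect_errors(tree):
    print(f"{name}: {error!r}")
```

Because the traversal is depth-first, the error on `a1` is reported directly after the error on its parent `a`, before any sibling of `a` is visited.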
load_all
Recursively load this data stream and all child streams.
Performs a depth-first traversal to load all streams in the hierarchy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Self | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
| Type | Description |
|---|---|
| Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
Source code in src/contraqctor/contract/base.py
iter_all
iter_all() -> Generator[DataStream, None, None]
Iterator for all child data streams, including nested collections.
Implements a depth-first traversal of the stream hierarchy.
Yields:
| Name | Type | Description |
|---|---|---|
| DataStream | DataStream | All recursively yielded child data streams. |
Source code in src/contraqctor/contract/base.py
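A depth-first generator over nested children can be written compactly with `yield from`. The sketch below is self-contained and uses a hypothetical `Node` class with made-up stream names; the visiting order shown (each child before its own descendants) is an assumption of this example rather than a documented guarantee of the library.

```python
from typing import Generator, List, Optional


class Node:
    """Hypothetical stand-in for a stream that may contain child streams."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.children = children or []

    def iter_all(self) -> Generator["Node", None, None]:
        for child in self.children:
            yield child                  # visit the child first...
            yield from child.iter_all()  # ...then everything beneath it


root = Node("root", [
    Node("behavior", [Node("licks"), Node("rewards")]),
    Node("notes"),
])

print([node.name for node in root.iter_all()])
# ['behavior', 'licks', 'rewards', 'notes']
```

The root itself is not yielded, matching the description of iterating over child streams only.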
parameters staticmethod
parameters(*args, **kwargs) -> UnsetParamsType
Parameters function to return UnsetParams.
Returns:
| Name | Type | Description |
|---|---|---|
| UnsetParamsType | UnsetParamsType | Special unset parameters value. |
Source code in src/contraqctor/contract/base.py
bind_data_streams
bind_data_streams(data_streams: List[DataStream]) -> Self
Bind a list of data streams to the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_streams | List[DataStream] | List of data streams to include in the collection. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have already been set. |
Source code in src/contraqctor/contract/base.py
add_stream
add_stream(stream: DataStream) -> Self
Add a new data stream to the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stream | DataStream | Data stream to add to the collection. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| KeyError | If a stream with the same name already exists. |
Examples:
from contraqctor.contract import json, DataStreamCollection
# Create an empty collection
collection = DataStreamCollection("api_data", [])
# Add streams
collection.add_stream(
    json.Json("config", reader_params=json.JsonParams(path="config.json"))
)
# Load the data
collection.load_all()
Source code in src/contraqctor/contract/base.py
remove_stream
remove_stream(name: str) -> None
Remove a data stream from the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the data stream to remove. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have not been set yet. |
| KeyError | If no stream with the given name exists. |
Source code in src/contraqctor/contract/base.py
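The name-uniqueness rules behind add_stream and remove_stream (KeyError on a duplicate name, KeyError on a missing name) can be illustrated with a dictionary keyed by stream name. The `Stream` and `Collection` classes below are hypothetical and only model those two rules; they say nothing about how contraqctor stores its streams internally.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Stream:
    """Hypothetical stand-in for a data stream; only the name matters here."""

    name: str


class Collection:
    """Hypothetical collection that keeps stream names unique."""

    def __init__(self, name: str, streams: Iterable[Stream] = ()) -> None:
        self.name = name
        self._streams: Dict[str, Stream] = {}
        for stream in streams:
            self.add_stream(stream)

    def add_stream(self, stream: Stream) -> "Collection":
        if stream.name in self._streams:
            raise KeyError(f"Stream {stream.name!r} already exists in {self.name!r}.")
        self._streams[stream.name] = stream
        return self  # allow chained add_stream calls

    def remove_stream(self, name: str) -> None:
        if name not in self._streams:
            raise KeyError(f"No stream named {name!r} in {self.name!r}.")
        del self._streams[name]

    def names(self) -> list:
        return list(self._streams)


collection = Collection("api_data")
collection.add_stream(Stream("config")).add_stream(Stream("results"))
collection.remove_stream("config")
print(collection.names())  # ['results']
```

Adding a second stream named "results", or removing "config" again, would raise KeyError in this sketch.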
from_data_stream classmethod
from_data_stream(data_stream: DataStream) -> Self
Create a DataStreamCollection from a DataStream object.
Factory method to convert a single data stream or collection into a DataStreamCollection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_stream | DataStream | Source data stream to convert. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| DataStreamCollection | Self | New collection containing the source stream's data. |
Raises:
| Type | Description |
|---|---|
| TypeError | If the source is not a DataStream. |
| ValueError | If the source has not been loaded yet. |
Source code in src/contraqctor/contract/base.py
DataStream
DataStream(
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: TReaderParams = UnsetParams,
    **kwargs,
)
Bases: ABC, Generic[TData, TReaderParams]
Abstract base class for all data streams.
Provides a generic interface for data reading operations with configurable parameters and hierarchical organization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier for the data stream. | required |
| description | Optional[str] | Optional description of the data stream. | None |
| reader_params | TReaderParams | Optional parameters for the data reader. | UnsetParams |
| **kwargs | | Additional keyword arguments. | {} |
Attributes:
| Name | Type | Description |
|---|---|---|
| _is_collection | bool | Class variable indicating if this is a collection of data streams. |
Raises:
| Type | Description |
|---|---|
| ValueError | If name contains '::' characters which are reserved for path resolution. |
Source code in src/contraqctor/contract/base.py
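The reason '::' is reserved becomes clear when a resolved path is split back into its components: a name containing the separator would make the split ambiguous. The following sketch shows one way such a check could look. The helper `validate_stream_name` and the error message are hypothetical and are not taken from contraqctor.

```python
SEPARATOR = "::"


def validate_stream_name(name: str) -> str:
    """Return `name` unchanged, or raise if it contains the reserved separator."""
    if SEPARATOR in name:
        raise ValueError(
            f"Invalid name {name!r}: {SEPARATOR!r} is reserved for path resolution."
        )
    return name


# A resolved path splits cleanly only if no component contains the separator.
print("session::camera::frames".split(SEPARATOR))  # ['session', 'camera', 'frames']

print(validate_stream_name("camera"))  # camera

try:
    validate_stream_name("camera::left")
except ValueError as exc:
    print(exc)
```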
name property
name: str
Get the name of the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Name identifier of the data stream. |
resolved_name property
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The fully resolved name including all parent names. |
description property
parent property
parent: Optional[DataStream]
Get the parent data stream.
Returns:
| Type | Description |
|---|---|
| Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection property
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if this is a collection stream, False otherwise. |
reader_params property
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
| Name | Type | Description |
|---|---|---|
| TReaderParams | TReaderParams | Parameters for the data reader. |
at property
at: _AtProtocol
Get a child data stream by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | Name of the child data stream to retrieve. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| DataStream | _AtProtocol | The child data stream with the given name. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | If the data stream does not support child access. |
Examples:
# Access stream in a collection
collection = data_collection.load()
temp_stream = collection.at("temperature")
# Or using dictionary-style syntax
humidity_stream = collection["humidity"]
has_data property
has_data: bool
Check if the data stream has loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data has been loaded, False otherwise. |
has_error property
has_error: bool
Check if the data stream encountered an error during loading.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if an error occurred, False otherwise. |
data property
data: TData
Get the loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | The loaded data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data has not been loaded yet. |
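The relationship between has_data, data, load, and clear can be pictured as a small state machine around an "unset" sentinel. The sketch below is a self-contained illustration with a hypothetical `Stream` class and a callable `reader` argument; both are assumptions for this example and do not mirror contraqctor's constructor or internals.

```python
from typing import Any, Callable

_UNSET = object()  # sentinel: distinguishes "not loaded" from a loaded None


class Stream:
    """Hypothetical stand-in for a data stream."""

    def __init__(self, name: str, reader: Callable[[], Any]) -> None:
        self.name = name
        self._reader = reader
        self._data: Any = _UNSET

    @property
    def has_data(self) -> bool:
        return self._data is not _UNSET

    @property
    def data(self) -> Any:
        # Accessing data before load() is an error rather than a silent None.
        if not self.has_data:
            raise ValueError(f"Data for {self.name!r} has not been loaded yet.")
        return self._data

    def load(self) -> "Stream":
        self._data = self._reader()
        return self

    def clear(self) -> "Stream":
        self._data = _UNSET  # back to the unset state, ready for reloading
        return self


stream = Stream("notes", reader=lambda: "hello")
print(stream.has_data)          # False
print(stream.load().data)       # hello
print(stream.clear().has_data)  # False
```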
set_parent
set_parent(parent: DataStream) -> None
Set the parent data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | DataStream | The parent data stream to set. | required |
Source code in src/contraqctor/contract/base.py
read
read(
    reader_params: Optional[TReaderParams] = None,
) -> TData
Read data using the configured reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reader_params | Optional[TReaderParams] | Optional parameters to override the default reader parameters. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | Data read from the source. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters are not set. |
Source code in src/contraqctor/contract/base.py
bind_reader_params
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | TReaderParams | Parameters to bind to the data stream's reader. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters have already been set. |
Source code in src/contraqctor/contract/base.py
clear
clear() -> Self
Clear the loaded data from the data stream.
Resets the data to an unset state, allowing for reloading.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Source code in src/contraqctor/contract/base.py
load
load() -> Self
Load data into the data stream.
Reads data from the source and stores it in the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Examples:
from contraqctor.contract import csv
# Create and load a CSV stream
params = csv.CsvParams(path="data/measurements.csv")
csv_stream = csv.Csv("measurements", reader_params=params)
csv_stream.load()
# Access the data
df = csv_stream.data
print(f"Loaded {len(df)} rows")
Source code in src/contraqctor/contract/base.py
collect_errors
collect_errors() -> List[ErrorOnLoad]
Collect all errors from this stream and its children.
Performs a depth-first traversal to gather all ErrorOnLoad instances.
Returns:
| Type | Description |
|---|---|
| List[ErrorOnLoad] | List of all errors raised on load encountered in the hierarchy. |
Source code in src/contraqctor/contract/base.py
load_all
Recursively load this data stream and all child streams.
Performs a depth-first traversal to load all streams in the hierarchy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Self | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
| Type | Description |
|---|---|
| Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
Source code in src/contraqctor/contract/base.py
DataStreamCollection
DataStreamCollection(
    name: str,
    data_streams: List[DataStream],
    *,
    description: Optional[str] = None,
)
Bases: DataStreamCollectionBase[DataStream, UnsetParamsType]
Collection of data streams with direct initialization.
A specialized collection where child streams are passed directly instead of being created by a reader function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier for the collection. | required |
| data_streams | List[DataStream] | List of child data streams to include. | required |
| description | Optional[str] | Optional description of the collection. | None |
Examples:
from contraqctor.contract import csv, text, DataStreamCollection
# Create streams
text_stream = text.Text("readme", reader_params=text.TextParams(path="README.md"))
csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))
# Create the collection
collection = DataStreamCollection("project_files", [text_stream, csv_stream])
# Load and use
collection.load_all()
readme_content = collection["readme"].data
Initializes a DataStreamCollection whose data streams are passed directly, without a reader.
Source code in src/contraqctor/contract/base.py
name property
name: str
Get the name of the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Name identifier of the data stream. |
resolved_name property
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The fully resolved name including all parent names. |
description property
parent property
parent: Optional[DataStream]
Get the parent data stream.
Returns:
| Type | Description |
|---|---|
| Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection property
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if this is a collection stream, False otherwise. |
reader_params property
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
| Name | Type | Description |
|---|---|---|
| TReaderParams | TReaderParams | Parameters for the data reader. |
at property
at: _At[TDataStream]
Get the accessor for child data streams.
Returns:
| Name | Type | Description |
|---|---|---|
| _At | _At[TDataStream] | Accessor object for retrieving child streams by name. |
has_data property
has_data: bool
Check if the data stream has loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data has been loaded, False otherwise. |
has_error property
has_error: bool
Check if the data stream encountered an error during loading.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if an error occurred, False otherwise. |
data property
data: TData
Get the loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | The loaded data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data has not been loaded yet. |
set_parent
set_parent(parent: DataStream) -> None
Set the parent data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | DataStream | The parent data stream to set. | required |
Source code in src/contraqctor/contract/base.py
bind_reader_params
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | TReaderParams | Parameters to bind to the data stream's reader. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters have already been set. |
Source code in src/contraqctor/contract/base.py
clear
clear() -> Self
Clear the loaded data from the data stream.
Resets the data to an unset state, allowing for reloading.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Source code in src/contraqctor/contract/base.py
load
load() -> Self
Load data for this collection.
Overrides the base method to add validation that loaded data is a list of DataStreams.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If loaded data is not a list of DataStreams. |
Source code in src/contraqctor/contract/base.py
collect_errors
collect_errors() -> List[ErrorOnLoad]
Collect all errors from this stream and its children.
Performs a depth-first traversal to gather all ErrorOnLoad instances.
Returns:
| Type | Description |
|---|---|
| List[ErrorOnLoad] | List of all errors raised on load encountered in the hierarchy. |
Source code in src/contraqctor/contract/base.py
load_all
Recursively load this data stream and all child streams.
Performs a depth-first traversal to load all streams in the hierarchy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Self | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
| Type | Description |
|---|---|
| Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
Source code in src/contraqctor/contract/base.py
iter_all
iter_all() -> Generator[DataStream, None, None]
Iterator for all child data streams, including nested collections.
Implements a depth-first traversal of the stream hierarchy.
Yields:
| Name | Type | Description |
|---|---|---|
| DataStream | DataStream | All recursively yielded child data streams. |
Source code in src/contraqctor/contract/base.py
parameters staticmethod
parameters(*args, **kwargs) -> UnsetParamsType
Parameters function to return UnsetParams.
Returns:
| Name | Type | Description |
|---|---|---|
| UnsetParamsType | UnsetParamsType | Special unset parameters value. |
Source code in src/contraqctor/contract/base.py
read
read(*args, **kwargs) -> List[DataStream]
Read data from the collection.
Returns:
| Type | Description |
|---|---|
| List[DataStream] | The pre-set data streams. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have not been set yet. |
Source code in src/contraqctor/contract/base.py
bind_data_streams
bind_data_streams(data_streams: List[DataStream]) -> Self
Bind a list of data streams to the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_streams | List[DataStream] | List of data streams to include in the collection. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have already been set. |
Source code in src/contraqctor/contract/base.py
add_stream
add_stream(stream: DataStream) -> Self
Add a new data stream to the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stream | DataStream | Data stream to add to the collection. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| KeyError | If a stream with the same name already exists. |
Examples:
from contraqctor.contract import json, DataStreamCollection
# Create an empty collection
collection = DataStreamCollection("api_data", [])
# Add streams
collection.add_stream(
    json.Json("config", reader_params=json.JsonParams(path="config.json"))
)
# Load the data
collection.load_all()
Source code in src/contraqctor/contract/base.py
remove_stream
remove_stream(name: str) -> None
Remove a data stream from the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the data stream to remove. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If data streams have not been set yet. |
| KeyError | If no stream with the given name exists. |
Source code in src/contraqctor/contract/base.py
from_data_stream classmethod
from_data_stream(data_stream: DataStream) -> Self
Create a DataStreamCollection from a DataStream object.
Factory method to convert a single data stream or collection into a DataStreamCollection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_stream | DataStream | Source data stream to convert. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| DataStreamCollection | Self | New collection containing the source stream's data. |
Raises:
| Type | Description |
|---|---|
| TypeError | If the source is not a DataStream. |
| ValueError | If the source has not been loaded yet. |
Source code in src/contraqctor/contract/base.py
DataStreamCollectionBase
DataStreamCollectionBase(
    name: str,
    *,
    description: Optional[str] = None,
    reader_params: Optional[TReaderParams] = None,
    **kwargs,
)
Bases: DataStream[List[TDataStream], TReaderParams], Generic[TDataStream, TReaderParams]
Base class for collections of data streams.
Provides functionality for managing and accessing multiple child data streams.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier for the collection. | required |
| description | Optional[str] | Optional description of the collection. | None |
| reader_params | Optional[TReaderParams] | Optional parameters for the reader. | None |
| **kwargs | | Additional keyword arguments. | {} |
Source code in src/contraqctor/contract/base.py
name property
name: str
Get the name of the data stream.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Name identifier of the data stream. |
resolved_name property
resolved_name: str
Get the full hierarchical name of the data stream.
Generates a path-like name showing the stream's position in the hierarchy, using '::' as a separator between parent and child names.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The fully resolved name including all parent names. |
description property
parent property
parent: Optional[DataStream]
Get the parent data stream.
Returns:
| Type | Description |
|---|---|
| Optional[DataStream] | Parent data stream, or None if this is a root stream. |
is_collection property
is_collection: bool
Check if this data stream is a collection of other streams.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if this is a collection stream, False otherwise. |
reader_params property
reader_params: TReaderParams
Get the parameters for the data reader.
Returns:
| Name | Type | Description |
|---|---|---|
| TReaderParams | TReaderParams | Parameters for the data reader. |
has_data property
has_data: bool
Check if the data stream has loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data has been loaded, False otherwise. |
has_error property
has_error: bool
Check if the data stream encountered an error during loading.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if an error occurred, False otherwise. |
data property
data: TData
Get the loaded data.
Returns:
| Name | Type | Description |
|---|---|---|
| TData | TData | The loaded data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If data has not been loaded yet. |
at property
at: _At[TDataStream]
Get the accessor for child data streams.
Returns:
| Name | Type | Description |
|---|---|---|
| _At | _At[TDataStream] | Accessor object for retrieving child streams by name. |
set_parent
set_parent(parent: DataStream) -> None
Set the parent data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | DataStream | The parent data stream to set. | required |
Source code in src/contraqctor/contract/base.py
read ¶
read(
reader_params: Optional[TReaderParams] = None,
) -> TData
Read data using the configured reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
reader_params
|
Optional[TReaderParams]
|
Optional parameters to override the default reader parameters. |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
TData |
TData
|
Data read from the source. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters are not set. |
Source code in src/contraqctor/contract/base.py
bind_reader_params ¶
bind_reader_params(params: TReaderParams) -> Self
Bind reader parameters to the data stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | TReaderParams | Parameters to bind to the data stream's reader. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If reader parameters have already been set. |
Source code in src/contraqctor/contract/base.py
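The interplay between late binding, per-call overrides, and the two ValueError cases documented for read and bind_reader_params can be sketched as follows. `MiniStream` and `MiniParams` are hypothetical stand-ins written for this example, and the reader is faked with a formatted string instead of real file I/O.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MiniParams:
    """Hypothetical reader parameters: just a path."""

    path: str


class MiniStream:
    """Hypothetical stand-in for a data stream; not the contraqctor class."""

    def __init__(self, name: str, reader_params: Optional[MiniParams] = None) -> None:
        self.name = name
        self._reader_params = reader_params

    def bind_reader_params(self, params: MiniParams) -> "MiniStream":
        # Binding is allowed only once in this sketch.
        if self._reader_params is not None:
            raise ValueError("Reader parameters are already set.")
        self._reader_params = params
        return self  # returning self enables method chaining

    def read(self, reader_params: Optional[MiniParams] = None) -> str:
        # A per-call argument overrides the bound parameters.
        params = reader_params if reader_params is not None else self._reader_params
        if params is None:
            raise ValueError("Reader parameters are not set.")
        return f"contents of {params.path}"  # placeholder for real file I/O


stream = MiniStream("notes")
first_read = stream.bind_reader_params(MiniParams("notes.txt")).read()
override_read = stream.read(MiniParams("other.txt"))

try:
    stream.bind_reader_params(MiniParams("again.txt"))
    rebind_error = None
except ValueError as exc:
    rebind_error = str(exc)
```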
clear ¶
clear() -> Self
Clear the loaded data from the data stream.
Resets the data to an unset state, allowing for reloading.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The data stream instance for method chaining. |
Source code in src/contraqctor/contract/base.py
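The load, clear, and reload life cycle can be illustrated with a minimal sketch. `LifecycleStream` and the `_UNSET` sentinel are hypothetical names for this example only. The sketch assumes that an unloaded stream raises ValueError on `data` access, as documented for the `data` property.

```python
from typing import Any, Callable


class _Unset:
    """Sentinel type marking 'no data loaded yet' (hypothetical)."""


_UNSET = _Unset()


class LifecycleStream:
    """Hypothetical stand-in for a data stream; not the contraqctor class."""

    def __init__(self, name: str, reader: Callable[[], Any]) -> None:
        self.name = name
        self._reader = reader
        self._data: Any = _UNSET

    @property
    def has_data(self) -> bool:
        return not isinstance(self._data, _Unset)

    @property
    def data(self) -> Any:
        if not self.has_data:
            raise ValueError("Data has not been loaded yet.")
        return self._data

    def load(self) -> "LifecycleStream":
        self._data = self._reader()
        return self

    def clear(self) -> "LifecycleStream":
        # Back to the unset state, so a later load() starts fresh.
        self._data = _UNSET
        return self


stream = LifecycleStream("notes", reader=lambda: "hello")
loaded_value = stream.load().data
has_data_after_clear = stream.clear().has_data
reloaded_value = stream.load().data
```

Using a dedicated sentinel instead of None lets a stream legitimately hold None as loaded data.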
collect_errors ¶
collect_errors() -> List[ErrorOnLoad]
Collect all errors from this stream and its children.
Performs a depth-first traversal to gather all ErrorOnLoad instances.
Returns:
| Type | Description |
|---|---|
| List[ErrorOnLoad] | All load errors encountered in the hierarchy. |
Source code in src/contraqctor/contract/base.py
load_all ¶
Recursively load this data stream and all child streams.
Performs depth-first traversal to load all streams in the hierarchy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strict | bool | If True, raises exceptions immediately; otherwise collects and returns them. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Self | List of tuples containing streams and exceptions that occurred during loading. |
Raises:
| Type | Description |
|---|---|
| Exception | If strict is True and an exception occurs during loading. |
Examples:
# Load all streams and handle errors
errors = collection.load_all(strict=False)
if errors:
    for stream, error in errors:
        print(f"Error loading {stream.name}: {error}")
Source code in src/contraqctor/contract/base.py
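The difference between strict and non-strict loading can be sketched with plain callables. `load_each` is a hypothetical helper, not the library's load_all. It follows the description of strict given above and returns the collected pairs; the actual return value of load_all may differ.

```python
from typing import Callable, List, Tuple

Loader = Callable[[], object]


def load_each(
    loaders: List[Tuple[str, Loader]], strict: bool = False
) -> List[Tuple[str, Exception]]:
    """Run every loader; re-raise on first failure if strict, else collect failures."""
    failures: List[Tuple[str, Exception]] = []
    for name, loader in loaders:
        try:
            loader()
        except Exception as exc:
            if strict:
                raise  # fail fast: the first error aborts the whole load
            failures.append((name, exc))  # lenient: remember it and keep going
    return failures


def missing_file() -> object:
    raise FileNotFoundError("data.csv")


loaders = [("notes", lambda: "ok"), ("data", missing_file), ("config", lambda: {})]

failures = load_each(loaders, strict=False)
failed_names = [name for name, _ in failures]

try:
    load_each(loaders, strict=True)
    strict_raised = False
except FileNotFoundError:
    strict_raised = True
```

Non-strict mode suits exploratory sessions where partial data is still useful; strict mode suits pipelines that should stop on the first missing input.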
load ¶
load() -> Self
Load data for this collection.
Overrides the base method to add validation that loaded data is a list of DataStreams.
Returns:
| Name | Type | Description |
|---|---|---|
| Self | Self | The collection instance for method chaining. |
Raises:
| Type | Description |
|---|---|
| ValueError | If loaded data is not a list of DataStreams. |
Source code in src/contraqctor/contract/base.py
iter_all ¶
iter_all() -> Generator[DataStream, None, None]
Iterator for all child data streams, including nested collections.
Implements a depth-first traversal of the stream hierarchy.
Yields:
| Name | Type | Description |
|---|---|---|
| DataStream | DataStream | All recursively yielded child data streams. |
Source code in src/contraqctor/contract/base.py
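A depth-first generator over a stream hierarchy can be sketched as below. `Node` is a hypothetical stand-in for a stream. The visiting order shown (each child before its own descendants, siblings left to right) and the choice to yield nested collections themselves are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Generator, List


@dataclass
class Node:
    """Hypothetical stand-in for a stream with optional children."""

    name: str
    children: List["Node"] = field(default_factory=list)


def iter_all(node: Node) -> Generator[Node, None, None]:
    """Yield every descendant of `node`, depth first."""
    for child in node.children:
        yield child
        # Descend fully into this child before moving on to its siblings.
        yield from iter_all(child)


root = Node("root", [
    Node("sensors", [Node("temperature"), Node("humidity")]),
    Node("logs", [Node("error_log")]),
])

visited = [node.name for node in iter_all(root)]
```

Because the traversal is a generator, callers can stop early (for example with `next` or a `break`) without walking the whole hierarchy.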
implicit_loading ¶
implicit_loading(value: bool = True)
Context manager to control whether streams automatically load data on access.
When enabled, a data stream loads its data automatically the first time it is accessed. When disabled,
accessing a stream that has not been loaded raises an error, so call load() explicitly
first.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | bool | True to enable auto-loading, False to disable. Default is True. | True |
Examples:
# Assume you have nested collections already created
# collection.at("sensors").at("temperature") -> temperature sensor data
# collection.at("sensors").at("humidity") -> humidity sensor data
# collection.at("logs").at("error_log") -> error log file

# With implicit loading enabled (default behavior)
with implicit_loading(True):
    # Data loads automatically on access
    temp_data = collection.at("sensors").at("temperature").data
    humidity_data = collection.at("sensors").at("humidity").data

# With implicit loading disabled - requires explicit loading
with implicit_loading(False):
    # This would raise ValueError: "Data has not been loaded yet"
    try:
        temp_data = collection.at("sensors").at("temperature").data
    except ValueError:
        # Must load explicitly first
        collection.load_all()
        temp_data = collection.at("sensors").at("temperature").data
Source code in src/contraqctor/contract/base.py
print_data_stream_tree ¶
print_data_stream_tree(
node: DataStream,
prefix: str = "",
is_last: bool = True,
parents: list[bool] = [],
show_params: bool = False,
show_type: bool = False,
show_missing_indicator: bool = True,
) -> str
Generates a tree representation of a data stream hierarchy.
Creates a formatted string displaying the hierarchical structure of a data stream and its children as a tree with branch indicators and icons.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node | DataStream | The data stream node to start printing from. | required |
| prefix | str | Prefix string to prepend to each line, used for indentation. | '' |
| is_last | bool | Whether this node is the last child of its parent. | True |
| parents | list[bool] | List tracking whether each ancestor was a last child, used for drawing branches. | [] |
| show_params | bool | Whether to render parameters of the datastream. | False |
| show_type | bool | Whether to render the class name of the datastream. | False |
| show_missing_indicator | bool | Whether to render the missing data indicator. | True |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | A formatted string representing the data stream tree. |
Examples:
from contraqctor.contract import Dataset, csv, json
from contraqctor.contract.utils import print_data_stream_tree
csv_stream = csv.Csv("data", reader_params=csv.CsvParams(path="data.csv"))
json_stream = json.Json("config", reader_params=json.JsonParams(path="config.json"))
dataset = Dataset("experiment", [csv_stream, json_stream], version="1.0.0")
tree = print_data_stream_tree(dataset)
print(tree)
Source code in src/contraqctor/contract/utils.py
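The branch-drawing idea behind a tree printer of this kind can be sketched in a few lines. `render_tree` and `Node` are hypothetical names. The glyphs and layout are assumptions, and the sketch omits the icons, parameters, type names, and missing-data indicator that the real function can render.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a stream with optional children."""

    name: str
    children: List["Node"] = field(default_factory=list)


def render_tree(node: Node, prefix: str = "", is_last: bool = True, is_root: bool = True) -> str:
    """Return a multi-line string with one line per node and branch connectors."""
    connector = "" if is_root else ("└── " if is_last else "├── ")
    lines = [prefix + connector + node.name]
    # Below a last child the column stays blank; otherwise a vertical bar continues.
    child_prefix = prefix if is_root else prefix + ("    " if is_last else "│   ")
    for index, child in enumerate(node.children):
        last = index == len(node.children) - 1
        lines.append(render_tree(child, child_prefix, last, is_root=False))
    return "\n".join(lines)


experiment = Node("experiment", [
    Node("sensors", [Node("temperature"), Node("humidity")]),
    Node("config"),
])

tree_text = render_tree(experiment)
print(tree_text)
```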