Harp¶
Version¶
0.1.0-draft
Introduction¶
While Harp data is largely used for behavior experiments, it is not limited to this modality. As a result, the current standard is scoped to the logging at the level of single Harp devices. The reason for this decision will hopefully become clear as we describe the format and some of the rationale behind it.
Most of the Harp-related concepts mentioned here will be expanded in the documentation of the protocol.
We will strictly follow the logging standards defined by Harp. This decision is justified by the following reasons:
Affords the ability to not only use the same data format within our organization but also to share data with other groups that use Harp;
Affords the ability to reuse common data ingesting tools maintained by others. Since it is an open-source community standard, we also have the ability to contribute to the development of these tools if we so wish;
Affords re-usability of data acquisition, QC and processing pipelines that can be centralized and validated by us and potentially used by others.
Format specification¶
One of the main advantages of using a standardized binary communication protocol is that logging data from Harp devices can be largely generalized. In theory, we could simply dump the binary data from the device into a single file and call it a day. However, this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor or pin), then the single-file approach would require a post-processing step to filter out the messages of interest.
Furthermore, each address, as per the Harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it very tedious to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
This analysis could be entirely eliminated if we knew that all messages in the binary file had the same format. For any Harp device, the payload stored in a specific register will have a fixed type and length. This means that, to guarantee this simplifying assumption, it is enough to save the messages from each register into a separate file (a de-multiplexing strategy).
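The de-multiplexing strategy can be sketched in a few lines of Python. This is only an illustration, not the reference logger. It assumes the byte layout described in the Harp protocol documentation: message type, length (the number of bytes that follow the length field), address, port, payload type, payload, and a one-byte checksum. The helper names `demultiplex` and `make_message` are hypothetical.

```python
from collections import defaultdict


def demultiplex(stream: bytes) -> dict:
    """Split a concatenated stream of Harp messages into one buffer per register address."""
    per_register = defaultdict(bytearray)
    offset = 0
    while offset + 2 <= len(stream):
        length = stream[offset + 1]      # bytes that follow the length field
        total = length + 2               # add the message type and length bytes
        message = stream[offset:offset + total]
        if len(message) < total:
            break                        # truncated trailing message, stop here
        address = message[2]             # register address is the third byte
        per_register[address] += message
        offset += total
    return {address: bytes(buffer) for address, buffer in per_register.items()}


def make_message(address: int, payload_type: int, payload: bytes, message_type: int = 3) -> bytes:
    """Build a synthetic message for the example (message_type 3 is assumed to be Event)."""
    body = bytes([address, 255, payload_type]) + payload
    frame = bytes([message_type, len(body) + 1]) + body   # +1 accounts for the checksum
    return frame + bytes([sum(frame) & 0xFF])


# Two U8 messages from register 32 interleaved with one U16 message from register 33.
mixed = (
    make_message(32, 0x01, b"\x07")
    + make_message(33, 0x02, b"\x34\x12")
    + make_message(32, 0x01, b"\x08")
)
files = demultiplex(mixed)
for address, buffer in sorted(files.items()):
    print(address, len(buffer))
```

Note that the loop has to read the header of every message to find the next one. Once the stream is split, each per-register buffer contains messages of a single, fixed size.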
Thus, for each device, the container of all data will be a single directory with the .harp extension (<Device>.harp). This directory will contain the following files:
📦<Device>.harp
┣ 📜<DeviceName>_0.bin
┣ 📜<DeviceName>_1.bin
┣ ...
┣ 📜<DeviceName>_<Reg>.bin
┗ 📜device.yml (Optional)
where:
<DeviceName> will be derived from the device.yml metadata file that fully defines the device and can be found in the repository of each device (e.g.). This file can be seen as the “ground truth” specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools;
<Device> should match the name of the device in the rig.json schema file;
<Reg> is the register number that is logged in the binary file.
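Because every message in a <DeviceName>_<Reg>.bin file has the same size, the file can be read with a fixed stride taken from the first message. The sketch below is a minimal illustration under assumptions drawn from the protocol documentation: a payload type with bit 0x10 set carries a timestamp (a 32-bit seconds field followed by a 16-bit counter in units of 32 microseconds), and all fields are little-endian. The names `read_register_file` and `timestamped_u16` are hypothetical.

```python
import struct

TIMESTAMP_FLAG = 0x10   # assumed: payload-type bit that marks a timestamped message
TICK_SECONDS = 32e-6    # assumed: resolution of the sub-second counter


def read_register_file(data: bytes):
    """Return (timestamps, raw_payloads) from the contents of one register file."""
    if not data:
        return [], []
    stride = data[1] + 2                 # length field + message type and length bytes
    payload_type = data[4]
    if len(data) % stride != 0:
        raise ValueError("file size is not a whole number of messages")
    has_timestamp = bool(payload_type & TIMESTAMP_FLAG)
    payload_start = 11 if has_timestamp else 5
    timestamps, payloads = [], []
    for offset in range(0, len(data), stride):
        message = data[offset:offset + stride]
        if has_timestamp:
            seconds, ticks = struct.unpack_from("<IH", message, 5)
            timestamps.append(seconds + ticks * TICK_SECONDS)
        payloads.append(message[payload_start:-1])   # drop the trailing checksum
    return timestamps, payloads


def timestamped_u16(address: int, seconds: int, ticks: int, value: int) -> bytes:
    """Build a synthetic timestamped U16 event message for the example."""
    body = bytes([address, 255, 0x12]) + struct.pack("<IHH", seconds, ticks, value)
    frame = bytes([3, len(body) + 1]) + body
    return frame + bytes([sum(frame) & 0xFF])


# Stand-in for the bytes of a hypothetical Behavior_44.bin file.
data = timestamped_u16(44, 10, 0, 512) + timestamped_u16(44, 10, 15625, 513)
timestamps, payloads = read_register_file(data)
values = [struct.unpack("<H", payload)[0] for payload in payloads]
print(timestamps, values)
```

In practice the payload type and length for each register would come from device.yml rather than being inferred from the first message.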
The optional device.yml file¶
Including the device.yml file that corresponds to the device interface used to log the device’s data is recommended. Currently, this is not mandatory, but as ecosystem adoption progresses and tools improve, it will likely become a standard requirement. Note that while the device.yml file specifies the targeted hardware, firmware, and core versions, it does not guarantee that the device from which data was acquired is running those versions. This metadata should instead be queried directly from the corresponding device’s registers (see protocol core registers).
Optional logging of commands¶
A critical aspect of using the Harp protocol is that for each Write message received by the device from the PC host, the device will echo back a Write message timestamped by the embedded device. Relying on this echo alone assumes that all messages issued by the host are received by the device and not lost in transmission. However, this is not guaranteed. If a message is lost, the host will not receive the echo and cannot confirm that the message was received by the device.
To perform post-hoc quality control, we recommend also logging Commands. Since Commands are also HarpMessage types, they can be logged in the same format as data from devices. We also recommend appending a software timestamp to each Command message to facilitate pairing requests with responses post-hoc.
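As a rough illustration of this post-hoc check, the sketch below pairs logged Commands with logged replies. It assumes both logs have already been parsed into (message type, register address) tuples, and that message type 2 denotes Write. The function name `unmatched_commands` is hypothetical, and a real check would likely also use the software timestamps to constrain the pairing in time.

```python
from collections import Counter


def unmatched_commands(commands, replies):
    """Return the commands that have no remaining reply of the same type and address."""
    remaining = Counter(replies)
    missing = []
    for command in commands:
        if remaining[command] > 0:
            remaining[command] -= 1      # consume one matching reply
        else:
            missing.append(command)      # no echo found for this command
    return missing


# Three Write commands were sent, but only two echoes were logged.
commands = [(2, 10), (2, 34), (2, 34)]
replies = [(2, 10), (2, 34)]
missing = unmatched_commands(commands, replies)
print(missing)
```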
Application notes¶
All Harp devices, regardless of their specific application, are expected to be logged according to the following standards:
Clock Synchronization¶
We will only consider two operational modes for devices used in the AIND: Standalone and Synchronized.
In Standalone mode, the Harp device is not subordinate to a distributed clock, and all messages are timestamped by the device’s internal clock. This mode can be used if only a single Harp device is used during experiment acquisition.
In Synchronized mode, the Harp device is subordinate to a distributed clock, and all messages are timestamped by the distributed clock. This mode should be used when multiple Harp devices are used during experiment acquisition. In each experiment, there shall be a single clock generator source (e.g. WhiteRabbit device) to which all devices are connected. This source device is logged like any other Harp device.
Interfacing with Bonsai¶
The following points describe recommendations and recipes for logging data from Harp devices using the Bonsai programming language. We assume a basic understanding of Bonsai and of how to interface with Harp devices from it.
Instructions on how to log data from a Harp device using Bonsai can be found in the Harp Bonsai interface docs.
Warning
In your experiments, always validate that your logging routine has fully initialized before requesting a reading dump from the device. Failure to do so may result in missing data.
Note
In the future we will update these recipes to also provide AIND specific examples.
It is critical that the messages logged from the device are sufficient to reconstruct its state history. For that to be true, we need to know the initial state of all registers. This can be requested via a special register in the protocol core: OperationControl. This register has a single bit that, when set, will trigger the device to send a dump of the values of all its registers.
Starting from the previous example, in a different branch:
Add a Timer operator with its DueTime property set to 2 seconds. This will mimic the delayed start of an experiment.
Add a CreateMessage (Bonsai.Harp) operator after the Timer.
Select OperationControlPayload under Payload. Depending on your use case, you might want to change some of the settings, but we recommend:
- DumpRegisters set to True (Required for the dump)
- Heartbeat set to True (Useful to know the device is still alive)
- MuteReplies set to False
- OperationLed set to True
- OperationMode set to Active
- VisualIndicators set to On
Add a Multicast operator to send the message to the device.
Bonsai example workflow
<?xml version="1.0" encoding="utf-8"?>
<WorkflowBuilder Version="2.8.1"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:rx="clr-namespace:Bonsai.Reactive;assembly=Bonsai.Core"
                 xmlns:harp="clr-namespace:Bonsai.Harp;assembly=Bonsai.Harp"
                 xmlns="https://bonsai-rx.org/2018/workflow">
  <Workflow>
    <Nodes>
      <Expression xsi:type="Combinator">
        <Combinator xsi:type="rx:Timer">
          <rx:DueTime>PT2S</rx:DueTime>
          <rx:Period>PT0S</rx:Period>
        </Combinator>
      </Expression>
      <Expression xsi:type="harp:CreateMessage">
        <harp:MessageType>Write</harp:MessageType>
        <harp:Payload xsi:type="harp:CreateOperationControlPayload">
          <harp:OperationMode>Active</harp:OperationMode>
          <harp:DumpRegisters>true</harp:DumpRegisters>
          <harp:MuteReplies>false</harp:MuteReplies>
          <harp:VisualIndicators>On</harp:VisualIndicators>
          <harp:OperationLed>On</harp:OperationLed>
          <harp:Heartbeat>Enabled</harp:Heartbeat>
        </harp:Payload>
      </Expression>
      <Expression xsi:type="MulticastSubject">
        <Name>BehaviorCommands</Name>
      </Expression>
    </Nodes>
    <Edges>
      <Edge From="0" To="1" Label="Source1" />
      <Edge From="1" To="2" Label="Source1" />
    </Edges>
  </Workflow>
</WorkflowBuilder>
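For readers who want to see what such a request looks like on the wire, the following Python sketch assembles the equivalent Write message by hand. Everything specific here is an assumption to be verified against the protocol core registers documentation: the OperationControl address (10), the U8 payload type (0x01), and in particular the bit positions of each field. The function name `operation_control_write` is hypothetical.

```python
# Assumed bit layout of the OperationControl register; check the core register table.
OP_MODE_ACTIVE = 0x01     # bits 1:0, operation mode
DUMP_REGISTERS = 0x08     # bit 3, request a dump of all registers
MUTE_REPLIES = 0x10       # bit 4, left cleared in this example
VISUAL_INDICATORS = 0x20  # bit 5
OPERATION_LED = 0x40      # bit 6
HEARTBEAT = 0x80          # bit 7

OPERATION_CONTROL_ADDRESS = 10  # assumed address of the register
WRITE = 2                       # assumed message type for Write
PAYLOAD_U8 = 0x01               # assumed payload type for an unsigned byte


def operation_control_write(value: int) -> bytes:
    """Assemble a Write message carrying one U8 payload byte plus its checksum."""
    frame = bytes([WRITE, 5, OPERATION_CONTROL_ADDRESS, 255, PAYLOAD_U8, value])
    return frame + bytes([sum(frame) & 0xFF])   # checksum: byte sum modulo 256


value = OP_MODE_ACTIVE | DUMP_REGISTERS | VISUAL_INDICATORS | OPERATION_LED | HEARTBEAT
message = operation_control_write(value)
print(message.hex(" "))
```

In normal use the Bonsai operators build this message for you. The sketch is only meant to make the role of the DumpRegisters bit concrete.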
Finally, commands to the device can be logged in the exact same way as replies. However, in order to facilitate post-hoc quality control, we recommend appending a software timestamp to each Command message. This can be done by “injecting” a timestamp into the message payload before logging. We recommend using high-frequency events from a single device as a source of “the latest timestamp” to be used in the Command message. We should stress that these timestamps should not be used for analyses that require precise and accurate synchronization, as they are not synchronized with the distributed clock.
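The “latest timestamp” idea can be sketched as a small piece of state that is updated by the high-frequency event stream and sampled whenever a Command is about to be logged. This is a simplified illustration in Python rather than a Bonsai recipe. The class `CommandStamper` and its methods are hypothetical, and the timestamp is stored alongside the command instead of being written into the message payload.

```python
class CommandStamper:
    """Tag outgoing commands with the most recent device timestamp seen so far."""

    def __init__(self):
        self.latest = None               # no device event observed yet

    def on_event(self, device_seconds: float) -> None:
        """Call for every high-frequency event received from the device."""
        self.latest = device_seconds

    def stamp(self, command: bytes):
        """Return the command paired with the latest known timestamp."""
        return (self.latest, command)


stamper = CommandStamper()
first = stamper.stamp(b"cmd-a")          # sent before any event arrived
stamper.on_event(12.5)
second = stamper.stamp(b"cmd-b")         # tagged with the last event time
print(first, second)
```

The resolution of these tags is bounded by the rate of the event stream, which is one reason they are only suitable for coarse pairing of requests and replies.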
Relationship to aind-data-schema¶
Most fields tracked in rig.json can be easily extracted from the device’s read-dump. It is likely that helper methods will be provided in the future to automate this conversion. For now, refer to the protocol’s core registers to extract the necessary information.
File Quality Assurances¶
By virtue of implementing the Harp communication and synchronization protocol, the following should be true:
Each data set should have at most one device as the source of the synchronized clock.
All messages from the device to the computer host should be logged. Once a message is successfully parsed, no more processing and/or filtering of the data stream will be done prior to logging.
All data from a single device will include the initial state of all registers. This can be achieved by setting the DumpRegisters bit in the OperationControl register. Given that this is true, inside the container folder, one file per register of the device is expected to be found with a minimum of one message in each file.
If Commands are logged, for each message sent to the device, a corresponding message should exist in the logged data from the Harp device. The type of the message in the Command will match the type of the reply from the device.
If multiple devices are used, all data is assumed to be synchronized at acquisition time.
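Some of these assurances can be checked mechanically. The sketch below is one possible check, not an official validator. It verifies that a container folder uses the .harp extension, that it contains register files named <DeviceName>_<Reg>.bin, and that each file holds at least one message and a whole number of messages. The function name `check_container` and the example folder and file names are hypothetical, and the stride calculation assumes the length field is the second byte of each message.

```python
import re
import tempfile
from pathlib import Path

REGISTER_FILE = re.compile(r"^(?P<device>.+)_(?P<reg>\d+)\.bin$")


def check_container(folder) -> list:
    """Return a list of human-readable problems found in a <Device>.harp folder."""
    folder = Path(folder)
    problems = []
    if folder.suffix != ".harp":
        problems.append(f"{folder.name}: missing .harp extension")
    register_files = [p for p in sorted(folder.glob("*.bin")) if REGISTER_FILE.match(p.name)]
    if not register_files:
        problems.append("no register files found")
    for path in register_files:
        data = path.read_bytes()
        if len(data) < 2:
            problems.append(f"{path.name}: no messages")
            continue
        stride = data[1] + 2             # fixed message size within one register file
        if len(data) % stride != 0:
            problems.append(f"{path.name}: size is not a whole number of messages")
    return problems


# Build a throwaway container with one valid file and one empty file.
with tempfile.TemporaryDirectory() as tmp:
    container = Path(tmp) / "Behavior.harp"
    container.mkdir()
    frame = bytes([1, 5, 0, 255, 0x01, 0x2A])
    (container / "Behavior_0.bin").write_bytes(frame + bytes([sum(frame) & 0xFF]))
    (container / "Behavior_32.bin").write_bytes(b"")
    problems = check_container(container)
print(problems)
```

A fuller check would compare the set of register numbers found against the registers declared in device.yml.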