Understanding Time-Series DB Structure
This page describes the data structures that Wizata uses and supports inside the connected time-series databases.
Data on Wizata is stored either in the Time-Series DB or, for metadata, in the Backend (see Architecture key components and hosting choices). Data inside the time-series database must respect certain principles and formats to be compatible with the Data Hub.
Internal data structure
This section concerns data inside Wizata. If your source data format differs, you may need to apply a transformation in your connect logic to fit the format supported by Wizata.
General Principles
Any data within the Time-Series DB is composed of at least:
- A timestamp - single point in time
- A hardware ID - unique string identifier of your time-series across all systems
- A value - the value of your data, by default a float
Integer and Boolean values are automatically stored as floats to improve query-time performance for AI systems.
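The three minimum components and the float rule can be sketched in Python as follows. This is only an illustration: `DataPoint` and `make_data_point` are hypothetical names and are not part of any Wizata library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical container for the minimal time-series record."""
    timestamp: datetime  # single point in time
    hardware_id: str     # unique string identifier of the time-series
    value: float         # float by default


def make_data_point(timestamp, hardware_id, value):
    """Build a record, coercing boolean and integer values to float."""
    if not isinstance(value, (bool, int, float)):
        raise TypeError("value must be boolean, integer or float")
    return DataPoint(timestamp, hardware_id, float(value))


# A boolean reading is kept as the float 1.0.
point = make_data_point(
    datetime(2023, 10, 1, 9, 0, 58, tzinfo=timezone.utc),
    "my_datapoint_id",
    True,
)
```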
Since version 11.0.2, Wizata supports additional formats inside time-series: event data (e.g. anomalies, alerts, tracking IDs, ...) and string data.
Fields and Tags
Data inside the Time-Series DB is identified by fields and tags. Fields are generally used to store values that evolve over time, and tags to store identifiers and dimensions.
The hardware ID is stored inside a sensorId tag by default and the value is stored inside a value field.
Buckets
Depending on your platform architecture, your system might use a different solution or version for the Time-Series DB. The most common ones are:
- InfluxDB 1.x
- InfluxDB 2.x
See https://www.influxdata.com/ for more information.
InfluxDB 2.x and above support multiple buckets. The default bucket configured on the system is used unless stated otherwise; Data Points stored in a different bucket are identified by brackets within the hardware_id:
- e.g. on the default bucket, my_datapoint_id refers to a time-series where the tag sensorId = my_datapoint_id and _field = value
- e.g. on another bucket, [bucket1][measurement1]my_datapoint_id refers to a time-series inside the specified bucket bucket1 and measurement measurement1 where _field = my_datapoint_id
! The default bucket and other buckets use different formats !
In summary, the default bucket uses an extra tag sensorId to identify the datapoint, while other buckets store the datapoint hardware ID directly inside the default tag (_field).
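The bracket convention can be illustrated with a small parser. This is a minimal sketch, not Wizata's own implementation: the function name `parse_hardware_id` and the shape of the returned dictionary are assumptions made for the example.

```python
import re

# Matches "[bucket][measurement]datapoint_id"; anything else is treated as a
# plain ID on the default bucket.
_BRACKETED = re.compile(
    r"^\[(?P<bucket>[^\[\]]+)\]\[(?P<measurement>[^\[\]]+)\](?P<datapoint>.+)$"
)


def parse_hardware_id(hardware_id):
    """Describe where a hardware ID points to (hypothetical helper)."""
    match = _BRACKETED.match(hardware_id)
    if match is None:
        # Default bucket: the ID lives in the sensorId tag, the value in `value`.
        return {
            "bucket": None,
            "measurement": None,
            "tags": {"sensorId": hardware_id},
            "field": "value",
        }
    # Other bucket: the ID itself is used as _field.
    return {
        "bucket": match["bucket"],
        "measurement": match["measurement"],
        "tags": {},
        "field": match["datapoint"],
    }


default_location = parse_hardware_id("my_datapoint_id")
other_location = parse_hardware_id("[bucket1][measurement1]my_datapoint_id")
```

Here `default_location` carries the ID in the `sensorId` tag, while `other_location` carries it as the field name, which mirrors the difference between the two formats.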
Business Type
The desired format must be declared in the business type field of your Data Points. Every type identifies the datapoint ID either through the extra tag sensorId or through the default tag.
Business Type | Fields | Tags |
---|---|---|
Telemetry | value: float | _field/sensorId: hardware_id |
Set Points | value: float | _field/sensorId: hardware_id |
Logical | value: float | _field/sensorId: hardware_id |
Measurements | value: float | _field/sensorId: hardware_id |
Event * | value: float, eventId: string | _field/sensorId: hardware_id, eventStatus: event_status (**) |
Text * | valueStr: string | _field/sensorId: hardware_id |
(*) Not supported on InfluxDB 1.x
(**) Status can contain "On" or "Off" to indicate whether the event is starting or stopping.
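As a rough illustration, the table can be transcribed into a lookup and used to check a record before writing it. The names `BUSINESS_TYPE_FIELDS`, `UNSUPPORTED_ON_INFLUXDB_1X` and `check_record` are hypothetical, and the business types are written as plain strings rather than as values of any Wizata enumeration.

```python
# Field names and Python types per business type, transcribed from the table.
BUSINESS_TYPE_FIELDS = {
    "Telemetry": {"value": float},
    "Set Points": {"value": float},
    "Logical": {"value": float},
    "Measurements": {"value": float},
    "Event": {"value": float, "eventId": str},
    "Text": {"valueStr": str},
}

# Types marked with (*) in the table.
UNSUPPORTED_ON_INFLUXDB_1X = {"Event", "Text"}


def check_record(business_type, fields, influxdb_major=2):
    """Return a list of problems; an empty list means the record fits the table."""
    problems = []
    if influxdb_major < 2 and business_type in UNSUPPORTED_ON_INFLUXDB_1X:
        problems.append(f"{business_type} is not supported on InfluxDB 1.x")
    for name, kind in BUSINESS_TYPE_FIELDS[business_type].items():
        if name not in fields:
            problems.append(f"missing field {name}")
        elif not isinstance(fields[name], kind):
            problems.append(f"field {name} should be {kind.__name__}")
    return problems


telemetry_problems = check_record("Telemetry", {"value": 15.5412})
event_problems = check_record("Event", {"value": 1.0})
```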
Message Formats
Use the right business type
The Business Type has significant implications for how the system interprets your values in queries and in the explorer.
It can be determined automatically from the data you send, and it can be changed manually from Data Points. Make sure the right one is set, or you might experience unexpected behaviours.
Default
This format applies to the Telemetry, Set Points, Logical and Measurements business types. On automatic data point creation, this format creates a Telemetry data point.
- Timestamp must be ISO formatted and always expressed in the UTC time zone.
- SensorValue should be numerical: boolean, integer or float.
- HardwareId is the tag identifier of your data point.
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"SensorValue": 15.5412
}
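A message like the one above could be assembled in Python as sketched here. The helper name `build_default_message` is hypothetical; the sketch only assumes that local timestamps must be converted to UTC before being formatted with a trailing `Z`.

```python
import json
from datetime import datetime, timedelta, timezone


def build_default_message(hardware_id, value, when):
    """Build a default-format message from a timezone-aware datetime."""
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    if not isinstance(value, (bool, int, float)):
        raise TypeError("SensorValue must be boolean, integer or float")
    # Convert to UTC first, then format as ISO 8601 with a trailing Z.
    timestamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"Timestamp": timestamp, "HardwareId": hardware_id, "SensorValue": value}


# 11:00:58 at UTC+02:00 is the same instant as 09:00:58 UTC.
local_time = datetime(2023, 10, 1, 11, 0, 58, tzinfo=timezone(timedelta(hours=2)))
message = build_default_message("your_tag_id", 15.5412, local_time)
payload = json.dumps(message)
```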
Events
EventId is mandatory
EventId is required for all event messages. The presence of that field triggers the creation of an event data point if it does not exist yet (in that case a group system is not required at creation, but users will need to set it through the interface or the DS API). The presence of EventId also implies that the system is dealing with event data and applies the logic below.
There are two kinds of events: with a status inside the message and without.
EventStatus in message
EventStatus is included in the message as a string; SensorValue is optional.
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"EventId": "Anomaly_A01",
"EventStatus": "On"
}
If SensorValue is not present in the message, 0.0 is automatically assigned for the "Off" status and 1.0 for "On". EventStatus must be On or Off.
If SensorValue is transmitted, it is used as is. It must be between 0.0 and 1.0 and can therefore express a probability.
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"EventId": "Anomaly_A01",
"EventStatus": "On",
"SensorValue": 0.85
}
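The value rules for events that carry an EventStatus can be sketched as a small function. This mirrors the rules as described on this page and is not the platform's actual code; `resolve_event_value` is a hypothetical name.

```python
def resolve_event_value(message):
    """Return the value stored for an event message that carries EventStatus."""
    if "EventId" not in message:
        raise ValueError("EventId is mandatory for event messages")
    status = message.get("EventStatus")
    if status not in ("On", "Off"):
        raise ValueError("EventStatus must be On or Off")
    if "SensorValue" in message:
        # A transmitted value is used as is, e.g. as a probability.
        value = float(message["SensorValue"])
        if not 0.0 <= value <= 1.0:
            raise ValueError("SensorValue must be between 0.0 and 1.0")
        return value
    # No value transmitted: derive it from the status.
    return 1.0 if status == "On" else 0.0


implicit = resolve_event_value({"EventId": "Anomaly_A01", "EventStatus": "On"})
explicit = resolve_event_value(
    {"EventId": "Anomaly_A01", "EventStatus": "On", "SensorValue": 0.85}
)
```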
Without EventStatus
If a value is transmitted without a status, EventStatus is not set. This type of event is used for events without a starting and stopping point.
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"EventId": "Anomaly_A01",
"SensorValue": 1.0
}
It is also possible to send a message with only the EventId:
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"EventId": "Anomaly_A01"
}
Text
Text data points are identified by the TextValue field, which is inserted as the valueStr field in the time-series. The Text data point is created if it does not exist yet.
{
"Timestamp": "2023-10-01T09:00:58Z",
"HardwareId": "your_tag_id",
"TextValue": "CATEGORY_A"
}
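Taken together, the message formats on this page are distinguished by which fields are present. The sketch below shows one way to express that; `detect_format` is a hypothetical helper, and checking EventId before TextValue is an assumption based on the statement that the presence of EventId implies event data.

```python
def detect_format(message):
    """Guess the business type that a message would create on first sight."""
    if "EventId" in message:
        return "Event"      # presence of EventId implies event data
    if "TextValue" in message:
        return "Text"       # TextValue is stored as valueStr
    if "SensorValue" in message:
        return "Telemetry"  # default format creates Telemetry
    raise ValueError("message has no EventId, TextValue or SensorValue")


samples = [
    {"Timestamp": "2023-10-01T09:00:58Z", "HardwareId": "your_tag_id", "SensorValue": 15.5412},
    {"Timestamp": "2023-10-01T09:00:58Z", "HardwareId": "your_tag_id", "EventId": "Anomaly_A01"},
    {"Timestamp": "2023-10-01T09:00:58Z", "HardwareId": "your_tag_id", "TextValue": "CATEGORY_A"},
]
detected = [detect_format(sample) for sample in samples]
```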