Data Items

UBStore's basic form of data input, output, and requests is the so-called data item, defined in the IDataItem interface. It aims to be a unified representation for all kinds of data, regardless of their original form. The goal is to be able to map everything into it: key-value data, (semi-)relational data, binary blobs, you name it.

The data item also doubles as the "query language": it is used to request data items from UBStore's persistence layer, the IDataAccess. You can find more details on this in the data access documentation.

Structure

The data item consists of five components: the traditional key and value, a type, metadata, and temporal information.

Key

The data item key is a string that intuitively serves as a unique identifier (more precisely, it is unique only in combination with the type, but more on that later).

Value

The value, as the name suggests, holds the actual payload of a data item. Because it must be able to hold data of arbitrary size and form, the value is abstracted into an interface called IValue. IValue offers several methods for accessing the payload in different forms: as a string, by copying the contents into a byte array, or through a Java InputStream.
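The three access forms described above could look roughly like the following sketch. The interface name ValueSketch, its method names, and the in-memory implementation are assumptions made for illustration; they are not taken from UBStore's actual IValue definition.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for IValue; the real method names may differ.
interface ValueSketch {
    // Returns the payload decoded as text.
    String asString();

    // Copies as many payload bytes as fit into target; returns the number copied.
    int copyTo(byte[] target);

    // Exposes the payload as a stream, useful for large values.
    InputStream asStream();
}

// Simple implementation that keeps the whole payload in memory (an assumption).
final class InMemoryValue implements ValueSketch {
    private final byte[] payload;

    InMemoryValue(String text) {
        this.payload = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String asString() {
        return new String(payload, StandardCharsets.UTF_8);
    }

    @Override
    public int copyTo(byte[] target) {
        int count = Math.min(payload.length, target.length);
        System.arraycopy(payload, 0, target, 0, count);
        return count;
    }

    @Override
    public InputStream asStream() {
        return new ByteArrayInputStream(payload);
    }
}
```

A file-backed or network-backed implementation could serve the same interface without loading the whole payload, which is presumably why the stream form exists alongside the string and byte-array forms.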

Type

The type string serves as a way of providing a namespace, scope, or schema for UBStore's data. Although we defined the key to be a unique identifier, a key is actually unique only within the set of all data items with the same type. As a result, the combination of key and type is globally unique in a UBStore system.

In its simplest form, the type resembles Amazon S3's bucket concept.

However, the type does not have to be flat; it can carry a hierarchical structure. UBStore defines a separator (the colon character :) that divides the type into several parts. For example, the type string "ubstore:users" indicates a hierarchy: a set of users that belongs to an entity called ubstore. Mapped to a relational database, this could be realized, for example, as a table users in the schema ubstore. Together with the data item key, a unique record can then be identified and found in such a database.
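As a minimal sketch of the two ideas above (the colon-separated type hierarchy, and identity defined by type plus key), consider the following. The class DataItemId and its methods are hypothetical and only illustrate the concept; UBStore's own classes may be organized differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical identifier: a data item is identified by (type, key), not by key alone.
final class DataItemId {
    // The colon separator is taken from the text above.
    static final String TYPE_SEPARATOR = ":";

    private final String type;
    private final String key;

    DataItemId(String type, String key) {
        this.type = Objects.requireNonNull(type);
        this.key = Objects.requireNonNull(key);
    }

    // Splits a hierarchical type such as "ubstore:users" into its parts.
    List<String> typeParts() {
        return Arrays.asList(type.split(TYPE_SEPARATOR));
    }

    // Two identifiers are equal only if both type and key match.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof DataItemId)) {
            return false;
        }
        DataItemId that = (DataItemId) other;
        return type.equals(that.type) && key.equals(that.key);
    }

    @Override
    public int hashCode() {
        return Objects.hash(type, key);
    }
}
```

With this sketch, new DataItemId("ubstore:users", "42") and new DataItemId("ubstore:groups", "42") are different identifiers even though they share a key, and typeParts() on the first yields the parts "ubstore" and "users", which a relational backend could map to a schema and a table.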

Metadata

Arbitrary metadata can be attached to each data item in the form of a set of (String) key-value pairs. Possibilities include data attributes such as file size, encoding, or format, but also information needed for custom replication strategies, e.g. the redundancy level.
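For illustration, string key-value metadata of this kind could be modeled with a plain map, as in the sketch below. The metadata keys ("size", "encoding", "redundancy") and the helper names are invented for this example and are not defined by UBStore.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class MetadataSketch {
    // Builds example metadata; all keys and values here are hypothetical.
    static Map<String, String> exampleMetadata() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("size", "2048");
        metadata.put("encoding", "UTF-8");
        metadata.put("redundancy", "3");
        return Collections.unmodifiableMap(metadata);
    }

    // Reads the redundancy level a replication strategy might use,
    // falling back to a default when the entry is absent.
    static int redundancyLevel(Map<String, String> metadata, int defaultLevel) {
        String raw = metadata.get("redundancy");
        return raw == null ? defaultLevel : Integer.parseInt(raw);
    }
}
```

Because all values are strings, consumers such as a replication strategy have to parse and validate the entries they care about themselves.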

Temporal Information

The IDataItem supports the bitemporal data model, i.e. each data item contains two time intervals: the valid time and the transaction time.

Implementation-wise, points in time are represented as long values, and an interval is a tuple of two points in time that represent its left and right bounds. Each bound can be open (exclusive) or closed (inclusive).
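An interval of this kind could be sketched as follows: two long bounds, each with a flag saying whether the bound itself belongs to the interval. The class name TimeInterval, its fields, and the contains method are assumptions for illustration and do not reflect UBStore's actual types.

```java
// Hypothetical interval over long timestamps with independently
// inclusive or exclusive bounds.
final class TimeInterval {
    private final long left;
    private final long right;
    private final boolean leftInclusive;
    private final boolean rightInclusive;

    TimeInterval(long left, boolean leftInclusive, long right, boolean rightInclusive) {
        if (left > right) {
            throw new IllegalArgumentException("left bound must not exceed right bound");
        }
        this.left = left;
        this.leftInclusive = leftInclusive;
        this.right = right;
        this.rightInclusive = rightInclusive;
    }

    // Returns true if the given point in time lies within the interval,
    // honoring the inclusiveness of each bound.
    boolean contains(long pointInTime) {
        boolean afterLeft = leftInclusive ? pointInTime >= left : pointInTime > left;
        boolean beforeRight = rightInclusive ? pointInTime <= right : pointInTime < right;
        return afterLeft && beforeRight;
    }
}
```

For example, new TimeInterval(10L, true, 20L, false) contains 10 and 15 but not 20. In a bitemporal data item, one such interval would describe the valid time and a second one the transaction time.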