Dataset Compression
- class ptp.compression.Codec(ds={}, filename='', compressed=False)
Dataset compression coder/decoder (codec)
- Parameters
ds – (dictionary) dataset
filename – (string) dataset file name
compressed – (bool) whether the supplied dataset is compressed
- compress()
Reorganize the dataset into a more storage-efficient layout before writing it to a file.
The data['data'] member of the dataset holds a list of dictionaries, each containing several metrics. This is very inefficient for storage, since the keys (strings) are repeated in every dictionary.
Some of the metrics are present in all dictionaries, so they can be stored directly as lists. Other metrics are missing from some dictionaries; these should be stored as a pair of lists, one holding the actual time series and the other holding the indexes of the dictionaries in which the metric is present.
- Parameters
data – Dataset dictionary formatted as {'metadata': x, 'data': y}, i.e., a dictionary containing the metadata and data keys.
- Returns
(dict) The compressed dataset
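The reorganization described for compress() can be illustrated with a minimal sketch. It is not the ptp.compression implementation: the function name compress_records and the output keys 'dense', 'sparse', 'idx' and 'values' are hypothetical, chosen only to show how always-present metrics become plain lists while partially present metrics become a pair of lists (values plus indexes).

```python
def compress_records(records):
    """Illustrative only: turn a list of dicts into a column-oriented dict.

    The output layout ('dense' / 'sparse' / 'idx' / 'values') is an
    assumption made for this sketch, not the format used by ptp.
    """
    n = len(records)

    # Collect every metric name, in first-seen order.
    keys = []
    for rec in records:
        for key in rec:
            if key not in keys:
                keys.append(key)

    dense = {}   # metrics present in every record: one list each
    sparse = {}  # metrics missing from some records: values + indexes
    for key in keys:
        idx = [i for i, rec in enumerate(records) if key in rec]
        values = [records[i][key] for i in idx]
        if len(idx) == n:
            dense[key] = values
        else:
            sparse[key] = {"idx": idx, "values": values}

    return {"dense": dense, "sparse": sparse}


# Hypothetical input: 'x' appears in every record, 'y' only in some.
records = [
    {"x": 1.0, "y": 10},
    {"x": 2.0},
    {"x": 3.0, "y": 30},
]
compressed = compress_records(records)
```

With this layout each key string is stored once rather than once per record, and the index list is what allows the original list of dictionaries to be rebuilt when decompressing.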