**SZ: Fast Error-Bounded Lossy HPC Data Compressor**

Today’s HPC applications produce extremely large amounts of data, so efficient compression is necessary before the data are stored on parallel file systems.

We developed SZ, an error-bounded lossy compressor for HPC data, based on a novel compression method that works very effectively on large-scale HPC data sets.

The compression method starts by linearizing multi-dimensional snapshot data. The key idea is to fit/predict each successive data point with the best-fit choice among several curve-fitting models. Data points that can be predicted precisely are replaced by the code of the corresponding curve-fitting model. Unpredictable data points, which cannot be approximated by any of the curve-fitting models, are compressed with an optimized lossy method based on binary representation analysis.
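The prediction step can be illustrated with a minimal Python sketch. It is not SZ's actual implementation: the three candidate models (preceding neighbor, linear, quadratic), the numeric codes 0 to 3, and the function name `predict_codes` are assumptions made for illustration, "predicted precisely" is interpreted as "within the error bound", and unpredictable values are simply stored as-is instead of going through the binary representation analysis.

```python
def predict_codes(data, err_bound):
    """Label each point of a linearized array with a best-fit model code.

    Hypothetical codes: 0 = unpredictable, 1 = preceding neighbor,
    2 = linear extrapolation, 3 = quadratic extrapolation.
    """
    recon = []          # values a decompressor would reconstruct
    codes = []          # one model code per data point
    unpredictable = []  # values that no model could approximate

    for i, x in enumerate(data):
        # Predict from already reconstructed values so that a
        # decompressor could repeat exactly the same predictions.
        candidates = {}
        if i >= 1:
            candidates[1] = recon[i - 1]
        if i >= 2:
            candidates[2] = 2 * recon[i - 1] - recon[i - 2]
        if i >= 3:
            candidates[3] = 3 * recon[i - 1] - 3 * recon[i - 2] + recon[i - 3]

        # Best-fit selection: the model with the smallest prediction error.
        best = min(candidates.items(), key=lambda kv: abs(kv[1] - x), default=None)

        if best is not None and abs(best[1] - x) <= err_bound:
            codes.append(best[0])
            recon.append(best[1])
        else:
            # Assumption: kept verbatim here; SZ compresses these lossily.
            codes.append(0)
            recon.append(x)
            unpredictable.append(x)

    return codes, unpredictable, recon


codes, unpredictable, recon = predict_codes([1.0, 2.0, 3.0, 4.0, 10.0], 0.1)
print(codes, unpredictable)
```

In this sketch only the model codes and the unpredictable values would need to be stored, which is where the size reduction comes from when most points are predictable.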

The key features of SZ are listed below.

1. Input: a data set (or a floating-point array with any number of dimensions); Output: the compressed byte stream.

2. SZ supports C, Fortran, and Java.

3. SZ supports two types of error bounds. Users can set either an *absolute error bound* or a *relative error bound*.

The absolute error bound (denoted δ) is a constant, such as 1E-6. That is, the decompressed data Di′ must be in the range [Di − δ, Di + δ], where Di′ is the decompressed value and Di is the original data value. The relative error bound is a linear function of the global data value range size, i.e., ∆ = λr, where λ (∈ (0,1)) and r refer to the *error bound ratio* and the range size, respectively.

For example, given a set of data, the range size r is equal to $\max_{i=1\ldots M}(D_i) - \min_{i=1\ldots M}(D_i)$, and the error bound can be written as $\lambda\left(\max_{i=1\ldots M}(D_i) - \min_{i=1\ldots M}(D_i)\right)$.

The relative error bound ensures that the compression error for any data point is no greater than λ×100 percent of the global data value range size.
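As a small worked illustration of the two error-bound definitions, the following Python sketch derives the bound implied by a relative error bound ratio and checks decompressed values against it. The function names `relative_error_bound` and `within_bound` are hypothetical helpers, not part of the SZ API.

```python
def relative_error_bound(data, ratio):
    """Return the error bound ratio * (max(data) - min(data))."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("error bound ratio must lie in (0, 1)")
    value_range = max(data) - min(data)  # global data value range size r
    return ratio * value_range


def within_bound(original, decompressed, bound):
    """Check that every decompressed value lies in [Di - bound, Di + bound]."""
    return all(abs(d2 - d) <= bound for d, d2 in zip(original, decompressed))


original = [0.0, 5.0, 10.0]
bound = relative_error_bound(original, 0.01)  # 1 percent of the range size 10.0

print(within_bound(original, [0.05, 4.95, 10.0], bound))  # deviations of at most 0.05
print(within_bound(original, [0.5, 5.0, 10.0], bound))    # first value deviates by 0.5
```

The same `within_bound` check applies to an absolute error bound: pass the constant δ directly as `bound` instead of deriving it from the value range.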