**Keywords: floating-point data compressor, lossy compressor, error-bounded compression**

__Key developers__: Sheng Di, Dingwen Tao, Xin Liang; __Other contributors__: Jiannan Tian (GPU, FPGA), Sian Jin (compression for DNN), Xiangyu Zou (accelerating PWR compression), Ali M. Gok (Pastri version), Sihuan Li (time-based compression for HACC simulation)

*Supervisor: Franck Cappello*

Today's HPC applications produce extremely large amounts of data, so efficient compression is necessary before storing them to parallel file systems.

We developed SZ, an error-bounded HPC data compressor, by proposing a novel compression method that works very effectively on large-scale HPC data sets.

The key features of SZ are listed below.

1. **Usage:**

Compression: input: a data set (a floating-point array with any number of dimensions); output: the compressed byte stream.

Decompression: input: the compressed byte stream; output: the reconstructed data set, with the compression error of each data point being within a pre-specified error bound ∆.

2. **Environment**: SZ supports C, Fortran, and Java. It has been tested on Linux and Mac, with different architectures (x86, x64, ppc, etc.).

3. **Error control**: SZ supports many types of error bounds. Users can set an *absolute error bound*, a *value-range-based relative error bound*, or a combination of the two (with operator *AND* or *OR*). Users can also set the error bound mode to PSNR-fixed, point-wise relative error bound, etc. More details can be found in the configuration file (sz.config).

- The absolute error bound (denoted δ) is a constant, such as 1E-6. That is, the decompressed value Di′ must be in the range [Di − δ, Di + δ], where Di′ is referred to as the decompressed value and Di is the original data value.
- The relative error bound is a linear function of the global data value range size, i.e., ∆ = λr, where λ (∈ (0,1)) and r refer to the *error bound ratio* and the range size, respectively. For example, given a set of data, the range size r is equal to max(Di) − min(Di), and the error bound can be written as λ(max(Di) − min(Di)). The relative error bound guarantees that the compression error for any data point is no greater than λ×100 percent of the global data value range size.
- PSNR-fixed compression allows users to set a target PSNR value, based on which the compressor will compress the data.

4. **Parallelism**: An OpenMP version is included in the package. We implemented an OpenCL version based on the OpenMP code, but it is a deprecated version for GPU. An optimized GPU version is under development, to be released later.

5. SZ supports two compression modes (similar to Gzip): SZ_BEST_SPEED and SZ_BEST_COMPRESSION. SZ_BEST_SPEED results in the fastest compression. The best compression factor is reached when using SZ_BEST_COMPRESSION together with ZSTD_FAST_SPEED. The default setting is SZ_BEST_COMPRESSION + Zstd.
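These modes, together with the error-bound settings in item 3, are selected in sz.config. A minimal sketch follows; the key names are taken from the sample sz.config shipped in the package, but treat them as illustrative and check the file in your version:

```
szMode = SZ_BEST_COMPRESSION    # or SZ_BEST_SPEED / SZ_DEFAULT_COMPRESSION
errBoundMode = ABS              # ABS, REL, ABS_AND_REL, ABS_OR_REL, PW_REL, PSNR
absErrBound = 1E-4              # used when errBoundMode = ABS
relBoundRatio = 1E-4            # value-range-based relative bound ratio
pw_relBoundRatio = 1E-3         # used when errBoundMode = PW_REL
psnr = 80                       # target PSNR when errBoundMode = PSNR
```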

6. **User guide:** More detailed usage and examples can be found in doc/user-guide.pdf and under the example/ directory, respectively, in the package.

7. **Citations**: If you mention SZ in your paper, please cite the following references.

**SZ 0.1-1.0**: Sheng Di, Franck Cappello, "Fast Error-bounded Lossy HPC Data Compression with SZ," in International Parallel and Distributed Processing Symposium (IEEE/ACM IPDPS 2016), 2016.

**SZ 1.2-1.4.13**: Dingwen Tao, Sheng Di, Franck Cappello, "A Novel Algorithm for Significantly Improving Lossy Compression of Scientific Data Sets," in International Parallel and Distributed Processing Symposium (IEEE/ACM IPDPS 2017), Orlando, Florida, 2017.

**SZ 2.0+**: Xin Liang, Sheng Di, Dingwen Tao, Zizhong Chen, Franck Cappello, "Error-Controlled Lossy Compression Optimized for High Compression Ratios of Scientific Datasets," in IEEE BigData 2018, 2018.

- As for the point-wise relative error bound mode (i.e., PW_REL), our CLUSTER18 paper describes the key design: Xin Liang, Sheng Di, Dingwen Tao, Zizhong Chen, Franck Cappello, "Efficient Transformation Scheme for Lossy Data Compression with Point-wise Relative Error Bound," in IEEE CLUSTER 2018. (best paper)

8. **Download**

**Version SZ 2.1.8.3**

**-->>> Package Download (including everything) <<<--**


-->>> __User Guide__, __hands-on-document__ <<<--

(__Contact: disheng222@gmail.com__ or __sdi1@anl.gov__)

If you download the code, please let us know who you are. We are very keen to help you use the SZ library.

9. **Publications:**

Sheng Di, Franck Cappello, "Fast Error-bounded Lossy HPC Data C= ompression with SZ," to appear in International Parallel and Distributed Pr= ocessing Symposium (IEEE/ACM

**IPDPS 2016**), 2016. [= download]=Dingwen Tao, Sheng Di, Franck Cappello, "A Novel Algorithm for = Significantly Improving Lossy Compression of Scientific Data Sets, " to app= ear in International Parallel and Distributed Processing Symposium (IEEE/ACM

**IPDPS 2017**), Orlando, Florida, 2017. [= download]Dingwen Tao, Sheng Di, Zizhong Chen, and Franck Capello, "Explo= ration of Pattern-Matching Techniques for Lossy Compression on Cosmology Si= mulation Data Sets ", Proceedings of the 1st International Workshop on Data= Reduction for Big Scientific Data (

**DRBSD1**) in Conjunction with&nb= sp;ISC'17, Frankfurt, Germany, June 22, 2017.Ian T. Foster, Mark Ainsworth, Bryce Allen, Julie Bes= sac, Franck Cappello, Jong Youl Choi, Emil M. Constantinescu= , Philip E. Davis, Sheng Di, et al., "Computing Just What Yo= u Need: Online Data Analysis and Reduction at Extreme Scales", in 23rd Inte= rnational European Conference on Parallel and Distributed Computing (

), 2017. pp. 3-19.Euro-Par 2017 Sheng Di, Franck Cappello, "Optimization of Error-Bounded Lossy= Compression for Hard-to-Compress HPC Data," in IEEE Transactions on P= arallel and Distributed Systems (IEEE

**TPDS<= /strong>****), 2017.**Ali Murat Gok, Dingwen Tao, Sheng Di, Vladimir Mironov, Yuri Al= exeev, Franck Cappello, "PaSTRI: A Novel Data Compression Algorithm for Two= -Electron Integrals in Quantum Chemistry", in IEEE/ACM 29th The Internation= al Conference for High Performance computing, Networking, Storage and Analy= sis (

). [poster]__SC201____7__Dingwen Tao, Sheng Di, Zizhong Chen, and Franck Cappello, "In-D= epth Exploration of Single-Snapshot Lossy Compression Techniques for N-Body= Simulations", Proceedings of the 2017 IEEE International Conference on Big= Data (

), Boston, MA, USA, December 11 -= 14, 2017. [short paper]__BigData2017__Dingwen Tao, Sheng Di, Hanqi Guo, Zizhong Chen, and Franck Capp= ello, "Z-checker: A Framework for Assessing Lossy Compression of Scientific= Data", in The International Journal of High Performance Computing Applicat= ions (

**IJHPCA**), 2017. [d= ownload]**Sheng Di, Dingwen Tao, Xin Liang, and Franck Cappello, "Efficient Lossy= Compression for Scientific Data based on Pointwise Relative Error Bound", = in IEEE Transactions on Parallel and Distributed Systems (IEEE**), 2018.**TPDS**-
Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen and Franck Cappello,= "Optimization of Fault Tolerance for Iterative Methods with Lossy Che= ckpointing", in 27th ACM Symposium on High-Performance Parallel and Distrib= uted Computing (

__ACM__**HPDC2018****), 2018.** Ali Murat Gok, Sheng Di, Yuri Alexeev, Dingwen Tao, V. Mironov, Xin = Liang, Franck Cappello, "PaSTRI: Error-bounded Lossy Compression for Two-El= ectron Integrals in Quantum Chemistry", in IEEE

**CLUSTER2018, 2018. [best paper award <= span style=3D"color: rgb(0,51,102);">(in the application, algorithms and libraries track)]****Xin Liang, Sheng Di, Dingwen Tao, Zizhong Chen, and Franck Cappello,= "Efficient Transformation Scheme for Lossy Data Compression with Point-wis= e Relative Error Bound", in IEEE**. [best= paper award (in the Data, Stora= ge, and Visualization track)]~~CLUSTER2018~~Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen, and F. Cappello, "Fi= xed-PSNR Lossy Compression for Scientific Data", in IEEE CLUSTER 2018. (short paper)

- Xin Liang, Sheng Di, Dingwen Tao, Zizhong Chen, Franck Cappello, "Error=
-Controlled Lossy Compression Optimized for High Compression Ratios of Scie=
ntific Datasets", in IEEE
**Bigdata2018**, 2018. - Sihuan Li, Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello, "Optimiz=
ing Lossy Compression with Adjacent Snapshots for N-body Simulation", in IE=
EE
**Bigdata2018**, 2018. - Xin Liang, Sheng Di, Dingwen Tao, Sihuan Li, Zizhong Chen, Franck Cappe=
llo, "Improving In-situ Lossy Compression with Spatio-Temporal Decimation b=
ased on SZ Model", in Proceedings of the 4th International Workshop on Data=
Reduction for Big Scientific Data (
**DRBSD-4**), in conjuncti= on with IEEE/ACM 29th The International Conference for High Performance com= puting, Networking, Storage and Analysis (**SC2018**). - Xin-Chuan Wu, Sheng Di, Franck Cappello, Hal Finkel, Yuri Alexeev, Fred=
eric T. Chong, "Amplitude-Aware Lossy Compression for Quantum Circuit Simul=
ation", in Proceedings of the 4th International Workshop on Data Reduction =
for Big Scientific Data (
**DRBSD-4**), in conjunction with IEE= E/ACM 29th The International Conference for High Performance computing, Net= working, Storage and Analysis (**SC2018**). - Xin-Chuan Wu, Sheng Di, Franck Cappello, Hal Finkel, Yuri Alexeev , Fre=
deric T. Chong, "Memory-Efficient Quantum Circuit Simulation by Using Lossy=
Data Compression", The 3rd International Workshop on Post-Moore Era Superc=
omputing (
**PME**) in conjunction with IEEE/ACM 29th The Inter= national Conference for High Performance computing, Networking, Storage and= Analysis (**SC2018**). - Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello, "Optim=
izing Lossy Compression Rate-Distortion from Automatic Online Selection bet=
ween SZ and ZFP", in IEEE Transactions on Parallel and Distributed Systems =
(IEEE
**TPDS**), 2019. - XiangYu Zou, Tao Lu, Wen Xia, Xuan Wang, Weizhe Zhang, Sheng Di, Dingwe=
n Tao, Franck Cappello, "Accelerating Relative-error Bounded Lossy Compress=
ion for HPC datasets with Precomputation-Based Mechanisms", in Proceedings =
of the 35th International Conference on Massive Storage Systems and Technol=
ogy (
**MSST19**), 2019. **XiangYu Zou, Tao Lu, Sheng Di, Dingwen Tao, Wen Xia, Xuan Wang, Weizhe = Zhang, Qing Liao, "Accelerating Lossy Compression on HPC datasets via Parti= tioning Computation for Parallel Processing", in The 21st IEEE Internationa= l Conference on High Performance Computing and Communications (IEEE**), 2019.**HPCC19****Sian Jin, Sheng Di, Xin Liang, Jiannan Tian, Dingwen Tao, Franck Cappel= lo, "DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Er= ror-Bounded Lossy Compression", Proceedings of the 28th ACM International S= ymposium on High-Performance Parallel and Distributed Computing (ACM**), Phoenix, AZ, USA, June 24 - 28, 2019.HPDC19 - Xin-Chuan Wu, Sheng Di, Emma Maitreyee Dasgupta, Franck Cappello, Yuri =
Alexeev, Hal Finkel, Frederic T. Chong, "Full State Quantum Circuit Simulat=
ion by Using Data Compression", in IEEE/ACM 30th The International Conferen=
ce for High Performance computing, Networking, Storage and Analysis (IEEE/A=
CM
**SC2019**), 2019. - Xin Liang, Sheng Di, Sihuan Li, Dingwen Tao, Bogdan Nicolae, Zizhong Ch=
en, Franck Cappello, "Significantly Improving Lossy Compression Quality bas=
ed on An Optimized Hybrid Prediction Model", in IEEE/ACM 30th The Internati=
onal Conference for High Performance computing, Networking, Storage and Ana=
lysis (IEEE/ACM
**SC2019**), 2019. - Xin Liang, Sheng Di, Dingwen Tao, Sihuan Li, Bogdan Nicolae, Zizhong Ch=
en, Franck Cappello, "Improving Performance of Data Dumping with Lossy Comp=
ression for Scientific Simulation," in IEEE
**CLUSTER2019**, 2= 019. - Franck Cappello, Sheng Di, Sihuan Li, Xin Liang, Ali M. Gok, Dingwen Ta=
o, Chun Hong Yoon , Xin-Chuan Wu, Yuri Alexeev, Federic T. Chong, "Use case=
s of lossy compression for floating-point data in scientific datasets", in =
The International Journal of High Performance Computing Applications (
IJHPCA), 2019. - Tasmia Reza, Kristopher Keipert, Sheng Di, Xin Liang, Jon C. Calho= un, Franck Cappello, "Analyzing the Performance and Accuracy of LossyCheckp= ointing on Sub-iteration of NWChem", in Proceedings of the 5th Interna= tional Workshop on Data Reduction for Big Scientific Data (DRBSD-5), in con= junction with IEEE/ACM 29th The International Conference for High Performan= ce computing, Networking, Storage and Analysis (IEEE/ACMSC2019)
- Xiangyu Zou, Tao Lu, Wen Xia, Xuan Wang, Weizhe Zhang, Haijun Zhang,&nb=
sp;Sheng Di, Dingwen Tao, and Franck Cappello, "Performance Optimization fo=
r Relative-Error-Bounded Lossy Compression on Scientific Data", IEEE Transa=
ctions on Parallel and Distributed Systems (
**IEEE TPDS**), 20= 20. - Xin Liang, Hanqi Guo, Sheng Di, Franck Cappello, Mukund Raj, Chunh=
ui Liu, Kenji Ono, Zizhong Chen and Tom Peterka, "Towards Feature Preservin=
g 2D and 3D Vector Field Compression", in the 13rd IEEE Pacific Visualizati=
on Symposium (
**IEEE PacificVis2020**), Tianjin, China, Apr 14= -17, 2020. - Jiannan Tian, Sheng Di, Chengming Zhang, Xin Liang, Sian Jin, Dazh=
ao Cheng, Dingwen Tao, and Franck Cappello, "waveSZ: A Hardware-Algorithm C=
o-Design of Efficient Lossy Compression for Scientific Data", Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practi=
ce of Parallel Programming (
**ACM PPoPP2020**), San= Diego, California, USA, February 22-26, 2020. - Robert Underwood, Sheng Di, Jon Calhoun, Franck Cappello, "FRaZ: A=
Generic High-Fidelity Fixed-Ratio Lossy Compression Framework for Scientif=
ic Floating-point Data", in Proceedings of the 34th IEEE International Para=
llel and Distributed Symposium (
**IEEE IPDPS2020), New Orleans, LA, May 18-22, 2020.**

10. **Version history**: We recommend the latest version.

- SZ 2.1: Significantly improves the compression and decompression rates for point-wise relative error bounded compression. See our MSST19 paper for details.
- SZ 2.0.1.0: Significantly improves rate-distortion (compression ratio vs. PSNR) for many datasets in high-compression cases.
- SZ 1.4.13: (1) Supports an OpenMP version for both single-precision and double-precision floating-point data compression. (2) Supports the Pastri algorithm customized for GAMESS data (two-electron integral data).
- SZ 1.4.12.1: Fixes a bug where a segmentation fault may happen when the error bound is greater than the value range size.
- SZ 1.4.12: (1) Supports a thresholding-based strategy for 1D data compression based on a point-wise relative error bound. (To test it, select errBoundMode = PW_REL and set the point-wise relative error bound using the parameter pw_relBoundRatio in sz.config.) For other data dimensions, point-wise relative error based compression uses a block-based strategy (see our DRBSD-2 paper for details). (2) Fixes a bug in callZlib.c (previously, a segmentation fault might happen when using the best_compression mode). (3) Fixes a small bug that happened when the data size is extremely large (nbEle > 4G) and the compression mode is SZ_BEST_COMPRESSION; specifically, the previous call to zlib functions had a potential bug that could lead to a segmentation fault, which has been fixed.
- SZ 1.4.11-beta: (1) Supports HDF5 (using HDF5 filter id 32017); (2) Supports integer data compression (see testint_compress.c in example/ for details).
- SZ 1.4.10-beta: (1) Supports direct sub-block data compression; (2) Supports compression of large data files directly (i.e., the number of data points can be as large as LONG size, unlike previous versions that could only compress 2^{32} data points at a time); (3) Separates the internal functions from sz.h.
- SZ 1.4.9-beta: Allows users to switch the Fortran compilation on/off on demand (Fortran compilation is off by default). Supports lossy compression with a point-wise relative error bound ratio. For example, given a relative error bound ratio (such as 0.001), SZ ensures the compression error for each data point is limited within {the relative error ratio}*{the data point's value} (e.g., err_bound = 0.001*{data_value}). For details, set errBoundMode to PW_REL in the configuration file.
- SZ 1.4.8-beta: Increases the maximum number of quantization intervals (from 65536 to 2^30), which leads to a better compression ratio for high-precision data. This version also allows users to specify the maximum number of quantization intervals in the configuration file. Fixes the issue of possible non-identical compression output across multiple runs.
- SZ 1.4.7-beta: Fixes some memory leaks (related to Huffman encoding). Fixes memory crashes and segmentation faults occurring when the number of data points is very large. Fixes a segmentation fault occurring when the data size is very small. Fixes the issue that decompressed data may be largely skewed from the original data in some special cases (especially when the data are not smooth at all). - Dec. 17th, 2016.
- SZ 1.4.6-beta: The compression ratio and speed are further improved over SZ 1.3 in most cases. We also provide three compression modes: SZ_BEST_SPEED, SZ_DEFAULT_COMPRESSION, and SZ_BEST_COMPRESSION. Please read the user guide for details.
- SZ 1.3: The compression ratio and speed are further improved over SZ 1.2.
- SZ 1.2: The compression ratio is improved significantly compared with SZ 1.1.
- SZ 1.1: This version improved the compression performance by 50% compared to SZ 1.0. A few bugs that could make the compression disrespect the error bound are also fixed.
- SZ 1.0: This version is coded in the C programming language, unlike the previous versions coded in Java. It also allows setting the endianType for the data to compress.
- SZ 0.5.14: Fixed a design bug, which improves the compression ratio further.
- SZ 0.5.13: Improves compression performance by replacing the class-based implementation with primitive data types.
- SZ 0.5.12: Allows users to set the "offset" parameter in the configuration file sz.config. The value of the offset is an integer in [1,7]. Generally, we recommend offset = 2 or 3, though some other settings (such as offset = 7) may lead to better compression ratios in some cases. How to automate/optimize the selection of the offset value is future work. In addition, the compression speed is improved by replacing the Java List with an array implementation in the code.
- SZ 0.5.11: Improved SZ 0.5.10 on the level of guaranteeing user-specified error bounds. In very few cases, SZ 0.5.10 could not guarantee the error bounds to a certain user-specified level. For example, when the absolute error bound = 1E-6, the maximum decompression error may be 0.01 (>>1E-6) because of the huge value range even in the optimized segments, such that the normalized data cannot reach the required precision even when storing all of the 64 or 32 mantissa bits. SZ 0.5.11 fixed the problem, with the compression ratio degraded by less than 1% in that case.
- SZ 0.5.10: Optimizes the offset by using an optimized formula for computing the median_value based on an optimized right-shifting method. This version improves the compression ratio a lot for hard-to-compress datasets. (Hard-to-compress datasets refer to cases whose compression ratios are usually very limited.)
- SZ 0.5.9: Optimizes the offset by using the simple right-shifting method. Experiments show that this cannot actually improve the compression ratio, because simple right-shifting makes each value be multiplied by 2^{-k}, where k is the number of right-shifted bits. The pro is saving bits because of more leading-zero bytes, but the con is many more bits required elsewhere. See SZ 0.5.10 for the better solution to this issue.
- SZ 0.5.8: Refines the leading-zero granularity (changed from bytes to bits based on the distribution). For example, in SZ 0.5.7, the leading-zero count is always in whole bytes: 0, 1, 2, or 3. In SZ 0.5.8, the leading-zero part could be xxxx xxxx xx xx xx xx xxxx xxxx (where each x means a bit in the leading-zero part).
- SZ 0.5.7: Improves the decompression speed in some cases.
- SZ 0.5.6: Improves the compression ratio in some cases (when the values in some segment are all the same, this segment is merged forward).
- SZ 0.5.5: Runtime memory is shrunk (by changing int xxx to byte xxx in the code). The bug that writing decompressed data may encounter exceptions is fixed. A memory-leak bug for the ppc architecture is fixed.
- SZ 0.5.4: Gzip_mode: default --> fast_mode; supports reserved values.
- SZ 0.5.3: Integrates the dynamic segmentation support.
- SZ 0.5.2: Finer compression granularity for unpredictable data; also removes redundant Java storage bytes.
- SZ 0.5.1: Supports version checking.
- SZ 0.2-0.4: The compression ratio is the same as SZ 0.5; the key difference is the implementation, such that SZ 0.5 is much faster than SZ 0.2-0.4.

11. Other versions are available upon request.