On the Rate-Distortion-Perception Tradeoff for Lossy Compression

Qian, Jingjing

Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/28976

Title:	On the Rate-Distortion-Perception Tradeoff for Lossy Compression
Authors:	Qian, Jingjing
Advisor:	Chen, Jun
Department:	Electrical and Computer Engineering
Keywords:	Lossy Compression; Information Theory; Rate-Distortion-Perception Tradeoff
Publication Date:	2023
Abstract:	Deep generative models, when used for lossy image compression, can reconstruct realistic-looking outputs even at extremely low bit rates, whereas traditional compression methods often exhibit noticeable artifacts under similar conditions. As a result, there has been a surge of interest in both the information-theoretic aspects and the practical architectures of deep-learning-based image compression. This thesis contributes to the emerging framework of rate-distortion-perception theory. The main results are summarized as follows:
1. We investigate the tradeoff among rate, distortion, and perception for binary sources. Distortion is measured by the Hamming distortion and perceptual quality by the total variation distance. We first derive a closed-form expression for the rate-distortion-perception tradeoff in the one-shot setting, followed by a complete characterization of the achievable distortion-perception region for a general representation. We then consider the universal setting, in which a single one-size-fits-all encoder is used, and derive upper and lower bounds on the minimum rate penalty. Finally, we study successive refinement for both point-wise and set-wise versions of perception-constrained lossy compression. We provide a necessary and sufficient condition for point-wise successive refinement and a sufficient condition for the successive refinability of universal representations.
2. We characterize the rate-distortion-perception function of vector Gaussian sources, extending the known result for the scalar case. We show that in the high-perceptual-quality regime, each component of the reconstruction (including high-frequency components) is strictly correlated with the corresponding component of the source, in contrast to the traditional water-filling solution. This result is obtained by optimizing over all possible encoder-decoder pairs subject to the distortion and perception constraints. We then consider the notion of universal representation, where the encoder is fixed and the decoder is adapted to achieve different distortion-perception pairs. We characterize the achievable distortion-perception region for a fixed representation and demonstrate that the corresponding distortion-perception tradeoff is approximately optimal.
These findings enrich the nascent rate-distortion-perception theory and provide a theoretical foundation for learned image compression.
URI:	http://hdl.handle.net/11375/28976
Appears in Collections:	Open Access Dissertations and Theses

Files in This Item:

File	Description	Size	Format
Qian_Jingjing_202309_PhD.pdf Open Access		2.13 MB	Adobe PDF
