Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/32258
Title: | Prior-guided Neural Compression of Visual Data |
Authors: | Xu, Hao |
Advisor: | Wu, Xiaolin |
Department: | Electrical and Computer Engineering |
Publication Date: | 2025 |
Abstract: | In recent years, deep learning methods have been widely applied to visual data compression, and some of them have delivered the best rate-distortion performance reported to date. The target domain of neural data compression began with still images and then rapidly expanded to other modalities, including videos, point clouds, and the emerging 3D Gaussian Splatting (3DGS) representations. Despite the steady progress of neural compression research, some technical challenges remain. One of them is the high computational cost of current neural compression models. Although more complex neural network architectures often yield improved compression performance, the marginal benefit of blindly increasing model complexity is diminishing. For most applications, the rapidly increasing cost cannot be justified by ever smaller coding gains. Unless a better cost-performance ratio is achieved, neural visual data compression methods are unlikely to replace the traditional compression standards that are deeply entrenched in the real world. To address this issue, we advocate incorporating known priors, such as signal sparsity and efficient space-covering structures in quantization, into the design of neural compression architectures. Compared with the mainstream, purely data-driven black-box learning approach, our prior-guided design approach can lead to significant coding gains with little or no overhead in either model size or computational complexity, improving cost-effectiveness in practical deployment. With these motivations, we design, implement, and experiment with a series of neural compression models for three visual data modalities: images, 3D point clouds, and 3DGS scene representations. Extensive experimental results demonstrate that our prior-guided neural compression models can deliver rate-distortion performance comparable to state-of-the-art methods at a much lower resource cost. 
In future research, we plan to further optimize the proposed paradigm to discover even more efficient and practical neural compression models, and in parallel we will expand its applications to other data modalities, such as volumetric video and panoramic video. |
URI: | http://hdl.handle.net/11375/32258 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Xu_Hao_202508_PhD.pdf | | 75.59 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.