Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/29430
Title: Improving Robustness of Deep Learning Based Image Restoration
Authors: Guo, Yanhui
Advisor: Wu, Xiaolin
Department: Electrical and Computer Engineering
Publication Date: 2024
Abstract: Image restoration is a fundamental task in image processing that aims to restore degraded images to their underlying artifact-free counterparts. Recent years have seen significant advances in image restoration through deep learning techniques. However, such black-box, data-driven methods still face challenges in reliability and robustness, particularly when the training and testing data come from mismatched domains. Even minor deviations between the two data domains can result in significant differences in the outcome, adversely affecting the performance of image restoration networks. This adverse effect becomes more pronounced when deep learning based restoration models are applied to extremely low-quality images. To address these limitations of existing methods, we have developed three novel techniques that improve the robustness of image restoration networks from the perspectives of dataset acquisition, representation learning, and side information guidance. Specifically, the proposed techniques are 1) a low-cost and efficient automatic system for gathering real-world training datasets, offering a practical way of tailoring a super-resolution model to each given camera; 2) a degradation-independent representation learning technique that improves the robustness of blind image restoration; and 3) a video restoration network that integrates multi-modality information to robustly restore extremely compressed videos of talking heads.

Our first work is presented in Chapter 2, where we address the scarcity of real-world datasets and the data domain mismatches between the training and testing phases of the super-resolution task. For this purpose, we propose a data acquisition system that enables the highly efficient collection of a large number of real-world low-resolution and high-resolution image pairs. These data pairs are obtained by capturing real-world scenes displayed on an ultra-high-quality screen. In this setting, our approach allows not only cost-effective data collection but also alignment of the captured image pairs with sub-pixel accuracy through our proposed spatial-frequency dual-domain registration method.

In Chapter 3, we approach the same data domain mismatches from the viewpoint of representation learning and verify the effectiveness of the proposed technique on the task of restoring complex and unknown image degradations caused by camera image signal processing (ISP) pipelines. In a digital camera, the ISP pipeline is the module responsible for converting raw Bayer sensor data into RGB images. However, images generated by the ISP often exhibit imperfections due to a variety of factors, including erroneous ISP settings and coupled degradations composed of sensor noise, demosaicing, and compression. These imperfections can be broadly classified as ISP degradations; they not only degrade perceptual quality but also obstruct the successful application of ISP-generated images in subsequent computer vision tasks. To address these ISP degradations, we propose a degradation-independent representation learning method that learns robust image representations from degraded images in a self-supervised manner through multivariate mutual information maximization. Moreover, we devise a representation alignment network to enhance the self-supervised representations for targeted end tasks, including image restoration, object detection, and image segmentation.

In Chapter 4, we address the challenging problem of robustly restoring very low-bit-rate videos of talking heads. We propose a novel deep multi-modality neural network that leverages three modalities: the emotional state, the audio, and the video of the speaker. The first modality, the emotion of the speaker, can be detected using existing emotion detection algorithms and transmitted to the decoder at almost no cost, as the emotion information is very compact. This emotion modality can help restore facial expressions. The audio signal, as the second modality, is strongly correlated with the motion of the facial muscles, particularly those of the lips, which shape the sound and air stream into speech. The strong correlations between these two modalities and the face video motivate a multi-modality soft-decoding neural network for joint compression artifact removal and super-resolution of downsampled face videos. The network acts as a video post-processor that substantially enhances the perceptual quality of heavily compressed talking head videos while remaining fully compatible with existing video compression standards.

Finally, in Chapter 5, we summarize the proposed techniques and discuss future work.
URI: http://hdl.handle.net/11375/29430
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: Guo_Yanhui_Finalsubmission202401_PhD.pdf
Description: Open Access
Size: 104.2 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
