Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/31892
Title: | Improving classification performance for the Functional Map of the World Using Siamese and LiteDenseNet Architectures |
Authors: | Turchenko, Andrii |
Advisor: | McNicholas, Paul D. |
Department: | Computational Engineering and Science |
Keywords: | Convolutional Neural Networks; Functional Map of the World; Satellite Imagery; Siamese architecture; DenseNet; LiteDenseNet; Computational Complexity; High Performance Computing; Machine Learning; Protection of environment; Natural crisis management; Agricultural research; Logistics; Smart urban planning; Cancer research |
Publication Date: | 2025 |
Abstract: | This thesis presents improvements to classification performance on the Functional Map of the World (fMoW) dataset, a large-scale and complex classification problem. The version of the dataset used herein contains 426,994 training images and 76,988 testing images. It comprises 63 highly imbalanced classes of real-world satellite images geographically covering 207 ISO Alpha-3 country codes, collected by several state-of-the-art satellites, including QuickBird-2, GeoEye-1, WorldView-2, and WorldView-3. We propose to improve classification performance by applying the well-known semi-supervised Siamese architecture to such regular deep convolutional neural networks as resnet50, densenet161, efficientnetb4, and regnet_y_1_6gf. Our results show that the developed Siamese architectures learn their internal representations differently from their regular counterparts: Siamese representations are wider, richer, more context-based, and semantically connected. Siamese architectures can also show better attention to detail, as they focus more on smaller surrounding informative objects that complement the main class-discriminative areas found in images. Although the semi-supervised approach does not train the model directly as a classifier, we find that, in some cases, the performance of an individual Siamese architecture is even higher than that of the corresponding regular classification model. We also find that Siamese architectures are very good candidates for ensembling with their regular counterparts. Our ensemble results show that using Siamese architectures improves classification performance by up to 2 percentage points (%pt) compared with the appropriate baselines. The 10-model ensemble with Siamese architectures achieves an F1 score of 0.746, outperforming a baseline ensemble of only the corresponding regular convolutional neural network (CNN) models by 0.94%pt. Such an ensemble also improves on the result of any single regular CNN model by at least 5%pt.
This thesis also presents the development of the LiteDenseNet architecture, a simplified version of the well-known DenseNet architecture. Our architectural changes are twofold: (i) we slightly complicate the dense layer by adding a Squeeze-and-Excitation layer, and (ii) we simplify the Dense block so that the concatenation inside each dense layer combines its internally processed features with the output of only one preceding dense layer, rather than the outputs of all preceding layers in the hierarchy, as in regular DenseNet. Our results show that the LiteDenseNet architecture allows for a considerable decrease in computational complexity compared to the regular DenseNet architecture without severe losses in classification performance. For example, one of the developed models, ldn268_00_32_32, provides even better classification performance (an F1 score of 0.592, higher by 1.1%pt) than the densenet161 model (an F1 score of 0.581) on the fMoW dataset. In comparison with densenet161, ldn268_00_32_32 requires (i) 2.4 times less memory to store all of the model's trainable parameters, (ii) 6.9 times less memory for the data and computations involved during the model's operation, and (iii) 4.3 times fewer memory operations for reading and writing data during the model's computation.
We also find that the representations learned by the LiteDenseNet architecture differ from those of the regular DenseNet architecture: they are more context-based and show better attention to detail by focusing on smaller surrounding "informative" objects that complement the main class-discriminative areas identifying the class in question. |
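To make the Siamese setup described in the abstract concrete, here is a minimal PyTorch sketch, assuming a shared-weight resnet50 backbone and an illustrative 128-dimensional embedding; the class name, pairing scheme, and loss choice are assumptions for illustration, not the thesis's exact training code.

```python
# Hypothetical sketch of a Siamese pair over a regular CNN backbone:
# two branches share ONE set of weights and embed an image pair.
import torch
import torch.nn as nn
import torchvision.models as models

class SiameseNet(nn.Module):
    def __init__(self, embedding_dim=128):
        super().__init__()
        backbone = models.resnet50(weights=None)  # any of the four backbones works
        # replace the classifier head with an embedding projection
        backbone.fc = nn.Linear(backbone.fc.in_features, embedding_dim)
        self.backbone = backbone  # a single module, so both branches share weights

    def forward(self, x1, x2):
        # the same network processes both images of the pair
        return self.backbone(x1), self.backbone(x2)

net = SiameseNet()
e1, e2 = net(torch.randn(4, 3, 224, 224), torch.randn(4, 3, 224, 224))
# a contrastive- or triplet-style objective would act on this distance
distance = nn.functional.pairwise_distance(e1, e2)  # shape: (4,)
```

Training would typically pull same-class pairs together and push different-class pairs apart via a loss on this distance, which matches the abstract's point that the model is not trained directly as a classifier.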
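Similarly, the two LiteDenseNet changes can be sketched as follows, assuming a standard DenseNet-BC bottleneck layer as the starting point; the class names, growth rate, and exact wiring of the single-predecessor concatenation are assumptions made for illustration, not the author's implementation.

```python
# Hypothetical sketch of the two LiteDenseNet modifications:
# (i) a Squeeze-and-Excitation layer inside each dense layer, and
# (ii) concatenation with only ONE preceding layer's output.
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-Excitation: learned per-channel re-weighting."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = x.mean(dim=(2, 3))                      # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # excite: channel gates in (0, 1)
        return x * w

class LiteDenseLayer(nn.Module):
    """Change (i): a DenseNet-BC bottleneck layer followed by an SE layer."""
    def __init__(self, in_channels, growth_rate=32):
        super().__init__()
        mid = 4 * growth_rate
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, growth_rate, kernel_size=3, padding=1, bias=False),
            SqueezeExcite(growth_rate),
        )

    def forward(self, x):
        return self.body(x)

class LiteDenseBlock(nn.Module):
    """Change (ii): each layer's new features are concatenated with the
    output of only one preceding layer, so the channel count stays flat
    instead of growing linearly with depth as in a regular Dense block."""
    def __init__(self, num_layers, in_channels, growth_rate=32):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for i in range(num_layers):
            self.layers.append(LiteDenseLayer(channels, growth_rate))
            # next input: previous layer's new features + current new features
            channels = (in_channels if i == 0 else growth_rate) + growth_rate
        self.out_channels = channels

    def forward(self, x):
        prev_new, out = x, x
        for layer in self.layers:
            new = layer(out)
            out = torch.cat([prev_new, new], dim=1)  # one predecessor, not all
            prev_new = new
        return out

block = LiteDenseBlock(num_layers=6, in_channels=64)
y = block(torch.randn(2, 64, 56, 56))  # -> (2, 64, 56, 56): width stays flat
```

Because each layer concatenates with only one predecessor, the block's channel width stays roughly constant at twice the growth rate rather than growing linearly with depth, which is consistent with the parameter, memory, and memory-operation savings reported in the abstract.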
URI: | http://hdl.handle.net/11375/31892 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Turchenko_Andrii_202506_PhD.pdf | | 15.11 MB | Adobe PDF
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.