Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/32494
Title: | Accelerating Object Detection and Tracking Pipelines for Efficient Edge Video Analytics |
Authors: | Xu, Renjie |
Advisor: | Zheng, Rong; Razavi, Saiedeh |
Department: | Computing and Software |
Keywords: | Edge Computing;Video Analytics;Object Detection;Object Tracking;Computer Vision;Deep Learning |
Publication Date: | 2025 |
Abstract: | Edge computing enables rapid video analytics by processing data closer to the source, thereby reducing end-to-end latency. This gives rise to the paradigm of edge video analytics (EVA). Object detection and object tracking are key building blocks of video analytics pipelines (VAPs), as their outputs directly impact the performance of downstream tasks. In real-world applications like traffic monitoring, timely and accurate responses are critical, as delayed or inaccurate results can compromise safety. However, achieving such an accuracy-efficiency balance at the edge is particularly challenging due to two main factors: the compute-intensive nature of modern Convolutional Neural Network (CNN)- or Vision Transformer (ViT)-based models, and the limited computational and communication resources on edge devices. This thesis aims to improve the efficiency of object detection and tracking pipelines without sacrificing accuracy, enabling efficient and reliable EVA. Conventional pipelines often adopt fixed configurations (e.g., frame resolution and backbone model) or process entire frames uniformly. Both choices overlook the dynamic and spatially diverse nature of video content and waste considerable resources. To address these limitations, we propose three novel approaches: FastTuner, a model-agnostic framework that dynamically selects the optimal frame resolution and backbone model at runtime to accelerate multi-object tracking (MOT) pipelines; BlockHybrid, which leverages a policy network to classify the blocks of each frame as “hard” or “easy”, and processes them with either a block-wise detector or a lightweight tracker accordingly; and SEED, an end-to-end framework that couples block selection with block execution, enabling unified and efficient selection and execution of informative blocks in ViT-based object detectors. Extensive evaluations across multiple datasets and deployment scenarios demonstrate the effectiveness and generality of the proposed methods. Together, these contributions pave the way for more adaptive and scalable video analytics in real-world edge environments. |
URI: | http://hdl.handle.net/11375/32494 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Xu_Renjie_202509_PhD.pdf | | 7.16 MB | Adobe PDF |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.