Stream-Based Intelligent Memory Architectures for High-Performance and Predictable Real-Time Systems

Abotaleb, Abdelrhman Mohamed Ibrahim Sayed

Files

Primary Phd_Thesis_Abotaleb_Final.pdf (29.25 MB)

Date

2026

Authors

Abotaleb, Abdelrhman Mohamed Ibrahim Sayed

Abstract

Emerging cyber-physical platforms such as autonomous vehicles and unmanned aerial systems increasingly integrate high-throughput workloads (e.g., perception and machine learning) with safety-critical real-time control on the same multicore system backed by multi-channel Dynamic Random Access Memory (DRAM). While processor cores continue to scale, the memory system remains a major bottleneck: conventional hardware prefetchers and memory controllers are largely oblivious to program structure, leading to poor bandwidth utilization, high energy consumption, and highly variable memory access latency that undermines real-time guarantees. This dissertation proposes a unified Hardware/Software (HW/SW) interface through which software describes its future memory access behavior to the hardware, using information known prior to execution. A stream is the underlying large, array-like data structure over which this access behavior, whether regular or irregular, occurs. Using compact stream descriptors, the software communicates future access sequences to a centralized hardware engine, which tags last-level cache misses and coordinates stream-aware optimizations across the memory hierarchy. Leveraging this interface, the thesis introduces three architectures. First, InterStellar and its multi-channel extension InterStellar 2.0 implement stream-aware DRAM controllers that perform intelligent page management and proactive DRAM-aware batching, substantially improving effective bandwidth and reducing row conflicts. Second, InterStellarRT adapts the same principles to real-time systems by forming analyzable real-time batches and applying a predictable scheduling policy, enabling tight worst-case memory-latency bounds for stream-based memory patterns. Third, COMPASS co-designs a stream-aware last-level cache prefetcher with a stream-aware memory controller, coordinating prefetch issuance with DRAM batching to reduce effective miss latency while sustaining high throughput.
Evaluated across a broad set of scientific and high-performance computing workloads, these three architectures deliver substantial performance and energy improvements over state-of-the-art baselines. InterStellarRT, in addition, provides significantly tighter and formally analyzable worst-case latency bounds compared to contemporary real-time memory controllers. Collectively, the contributions demonstrate that stream-based memory intelligence is an effective approach to mitigating the memory-system bottleneck in modern multicore platforms that integrate cache prefetching mechanisms and multi-channel DRAM subsystems.

The implementations of InterStellar 2.0, InterStellarRT, and COMPASS are available in the project repository: https://gitlab.com/fanosteam/fanosgem5. Each architecture is provided in a separate branch:

1) InterStellar 2.0: https://gitlab.com/fanosteam/fanosgem5/-/tree/InterStellar-2.0

2) InterStellarRT: https://gitlab.com/fanosteam/fanosgem5/-/tree/InterStellarRT

3) COMPASS: https://gitlab.com/fanosteam/fanosgem5/-/tree/COMPASS

URI

https://hdl.handle.net/11375/32807

Collections

Open Access Dissertations and Theses
