Browsing by Subject "Memory bottleneck"

Now showing 1 - 2 of 2

Open Access
Reducing MLFMA memory with out-of-core implementation and data-structure parallelization
(IEEE, 2013) Hidayetoğlu, Mert; Karaosmanoğlu, Barışcan; Gürel, Levent
We present two memory-reduction methods for the parallel multilevel fast multipole algorithm (MLFMA). The first method implements out-of-core techniques and the second method parallelizes the pre-processing data structures. Together, these methods decrease parallel MLFMA memory bottlenecks, and hence fast and accurate solutions can be achieved for largescale electromagnetics problems.
Open Access
Using data compression for increasing memory system utilization
(Institute of Electrical and Electronics Engineers, 2009-06) Ozturk, O.; Kandemir, M.; Irwin, M. J.
The memory system presents one of the critical challenges in embedded system design and optimization. This is mainly due to the ever-increasing code complexity of embedded applications and the exponential increase seen in the amount of data they manipulate. The memory bottleneck is even more important for multiprocessor-system-on-a-chip (MPSoC) architectures due to the high cost of off-chip memory accesses in terms of both energy and performance. As a result, reducing the memory-space occupancy of embedded applications is very important and will be even more important in the next decade. While it is true that the on-chip memory capacity of embedded systems is continuously increasing, the increases in the complexity of embedded applications and the sizes of the data sets they process are far greater. Motivated by this observation, this paper presents and evaluates a compiler-driven approach to data compression for reducing memory-space occupancy. Our goal is to study how automated compiler support can help in deciding the set of data elements to compress/ decompress and the points during execution at which these compressions/decompressions should be performed. We first study this problem in the context of single-core systems and then extend it to MPSoCs where we schedule compressions and decompressions intelligently such that they do not conflict with application execution as much as possible. Particularly, in MPSoCs, one needs to decide which processors should participate in the compression and decompression activities at any given point during the course of execution. We propose both static and dynamic algorithms for this purpose. In the static scheme, the processors are divided into two groups: those performing compression/ decompression and those executing the application, and this grouping is maintained throughout the execution of the application. In the dynamic scheme, on the other hand, the execution starts with some grouping but this grouping can change during the course of execution, depending on the dynamic variations in the data access pattern. Our experimental results show that, in a single-core system, the proposed approach reduces maximum memory occupancy by 47.9% and average memory occupancy by 48.3% when averaged over all the benchmarks. Our results also indicate that, in an MPSoC, the average energy saving is 12.7% when all eight benchmarks are considered. While compressions and decompressions and related bookkeeping activities take extra cycles and memory space and consume additional energy, we found that the improvements they bring from the memory space, execution cycles, and energy perspectives are much higher than these overheads. © 2009 IEEE.