Most contemporary architectures for high-performance low-power computing systems consist of
multicore processors, where tasks are distributed among multiple cores to improve processing speed and the
system runs at a lower frequency to reduce the total power consumption. However, multilevel caches in multicore
architectures multiply the timing unpredictability and require a significant amount of power to operate. Cache
locking techniques are used in single-core systems to improve predictability by locking useful blocks in the
cache. The success of cache locking primarily depends on the effective selection of the right blocks to be locked.
In prior work, we introduced an efficient block selection methodology and a Miss Table based cache locking
scheme, in which information about the blocks and cache misses is stored in the Miss Table to facilitate cache
locking. Cache locking in multicore systems is more challenging because of the complexity introduced by the architecture.
In this chapter, we investigate the impact of the placement of the Miss Table, i.e., at the level-1
cache (CL1) or at the level-2 cache (CL2), on the system’s predictability and performance/power ratio. Using
VisualSim and Heptane simulation tools, we simulate an 8-core architecture, where each core’s private CL1 is
split into instruction (I1) and data (D1) caches and the CL2 is unified and shared by the cores. Experimental
results using MPEG4 decoding and FFT algorithms show that Miss Table based cache locking at level-1 is more
beneficial than Miss Table based cache locking at level-2 for MPEG4; a maximum reduction of 38% in mean
delay per task and a maximum reduction of 32% in total power consumption are achieved by locking one-fourth
of the I1 cache size. For FFT, the impact of locking at level-1 and level-2 is almost the same.
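
As a rough illustration of the idea of choosing blocks to lock from recorded miss information, the following minimal sketch counts misses per block and selects the most frequently missed blocks up to a fixed fraction of the cache. It is not the block selection methodology of the prior work: the function name `select_blocks_to_lock`, the trace format, and the "most misses first" policy are assumptions made only for this example.

```python
from collections import Counter

def select_blocks_to_lock(miss_trace, cache_blocks, lock_fraction=0.25):
    """Return the blocks with the most recorded misses, up to a fraction of the cache.

    Hypothetical helper: the "most misses first" policy is an assumption,
    not the selection method described in the chapter.
    """
    # Simplified stand-in for a Miss Table: block address -> miss count.
    miss_table = Counter(miss_trace)
    # Number of cache blocks that may be locked (e.g. one-fourth of the cache).
    capacity = int(cache_blocks * lock_fraction)
    # most_common() orders blocks by descending miss count.
    return [block for block, _ in miss_table.most_common(capacity)]

# Made-up miss trace of block addresses for an 8-block cache.
trace = [0x10, 0x20, 0x10, 0x30, 0x10, 0x20, 0x40]
locked = select_blocks_to_lock(trace, cache_blocks=8)
```

With `lock_fraction=0.25` and an 8-block cache, at most two blocks are selected; a real scheme would also need to account for set associativity and for whether the table sits at CL1 or CL2.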