Tuesday, June 8, 2021 – 1:00pm to 2:00pm
Virtual Presentation – ET Remote Access – Zoom
ZIQI WANG, Ph.D. Student https://wangziqi2013.github.io/ziqiw/
On Building a Multiversioned Cache Hierarchy with Page Overlays
On modern multi-core architectures, the cache hierarchy serves both as fast storage for frequently accessed data and as a communication channel between processor cores. Recent advancements in software and hardware, however, have motivated many interesting use cases of the cache that today’s cache hierarchy does not handle well. This thesis proposal investigates versioning, one of the most commonly observed paradigms in everyday programming but one largely neglected in cache system designs. We observe that many common problems from a wide range of applications can be addressed with hardware support for managing logically related data (“versions”). Current cache hierarchies struggle with these problems and force software designs to adopt sub-optimal solutions, which leaves much room for improvement.
This proposal presents a systematic solution for several real-world problems that fall into the versioning category. We base our design on Page Overlays, a virtual memory framework that enables fine-grained address remapping. As part of our preliminary work towards this grand goal, we present OverlayTM, a Hardware Transactional Memory (HTM) design running a multi-versioned, serializable concurrency control protocol with a hardware commit queue. We also present NVOverlay, which leverages versions at different levels of the hierarchy to perform background redo logging onto NVM at millisecond-scale frequency. We next present MBC-2D, an inter-block cache compression architecture operating on the versioned “2D address space”, which achieves a higher compression ratio with simpler metadata management.
This proposal also seeks to extend our preliminary work in the following major directions. First, we are eager to explore malloc-less object allocation using fast in-cache duplication of versions, without the overhead of a software allocator. Second, we propose to extend the 2D inter-block compression domain from the last-level cache to the main memory, such that blocks are also organized in a 2D compressed fashion in the main memory for more logical storage capacity and less bus bandwidth consumption. Lastly, we plan to investigate page table compression in a virtualized environment. This is of high value to containers and microservice platforms, where the page tables, as well as the processes themselves, are short-lived.
Todd C. Mowry (Chair)
Mike Kozuch (Intel Labs)
Gennady Pekhimenko (University of Toronto)
Zoom Participation. See announcement.