As with all my handwritten notes, this has the usual disclaimer: these posts are just so I can use nice indexed search to find my notes, which are sufficient for me to recall talks and papers but probably not much use to anyone else. Paper here. Patents here and here
We present Adaptive Stream Detection, a simple technique for modulating the aggressiveness of a stream prefetcher to match a workload’s observed spatial locality. We use this concept to design a prefetcher that resides on an on-chip memory controller. The result is a prefetcher with small hardware costs that can exploit workloads with low amounts of spatial locality. Using highly accurate simulators for the IBM Power5+, we show that this prefetcher improves performance of the SPEC2006fp benchmarks by an average of 32.7% when compared against a Power5+ that performs no prefetching. On a set of 5 commercial benchmarks that have low spatial locality, this prefetcher improves performance by an average of 15.1%. When compared against a typical Power5+ that does perform processor-side prefetching, the average performance improvement of these benchmark suites is 10.2% and 8.4%. We also evaluate the power and energy impact of our technique. For the same benchmark suites, DRAM power consumption increases by less than 3%, while energy usage decreases by 9.8% and 8.2%, respectively. Moreover, the power consumption of the prefetcher itself is low; it is estimated to increase the power consumption of the Power5+ chip by 0.06%.