The last ten years have seen the rise of a new parallel computing paradigm with diverse hardware architectures and software interfaces. One common architecture, known as non-uniform memory access (NUMA), structures a parallel computer so that each core can access certain regions of memory faster than others. In this work, we model a specific NUMA machine and use that model to guide optimizations of the Conjugate Gradient method. The model led us to a segmented design that places the data each core needs in the memory that core can access fastest. Our segmented solution outperformed the control, with a maximum speed-up of 11.1x.
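To make the idea of a segmented layout concrete, the following is a minimal sketch, not the thesis implementation. It assumes the sparse matrix is stored in compressed sparse row (CSR) form and is split into contiguous row blocks, so that each block could in principle be allocated in the memory closest to the cores that process it. The names `CsrSegment`, `split_rows`, `segmented_spmv`, and `conjugate_gradient` are hypothetical, and the code runs serially; actual NUMA placement and thread pinning are outside what plain Python can show.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CsrSegment:
    """Rows [row_start, row_start + rows) of a CSR matrix, stored contiguously.

    Hypothetical container: on a NUMA machine each segment would be
    allocated in the memory local to the cores that work on it.
    """
    row_start: int
    row_ptr: List[int]    # local row pointers, length = rows + 1
    col_idx: List[int]
    values: List[float]


def split_rows(row_ptr, col_idx, values, num_segments):
    """Split a CSR matrix into contiguous row blocks of near-equal row count."""
    n = len(row_ptr) - 1
    segments = []
    for s in range(num_segments):
        lo = s * n // num_segments
        hi = (s + 1) * n // num_segments
        base = row_ptr[lo]
        segments.append(CsrSegment(
            row_start=lo,
            row_ptr=[p - base for p in row_ptr[lo:hi + 1]],
            col_idx=col_idx[row_ptr[lo]:row_ptr[hi]],
            values=values[row_ptr[lo]:row_ptr[hi]],
        ))
    return segments


def segmented_spmv(segments, x):
    """Compute y = A @ x one segment at a time (serial stand-in for per-node work)."""
    y = [0.0] * len(x)
    for seg in segments:
        for r in range(len(seg.row_ptr) - 1):
            acc = 0.0
            for k in range(seg.row_ptr[r], seg.row_ptr[r + 1]):
                acc += seg.values[k] * x[seg.col_idx[k]]
            y[seg.row_start + r] = acc
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(segments, b, tol=1e-10, max_iter=1000):
    """Textbook Conjugate Gradient for a symmetric positive definite matrix."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A @ x with x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = segmented_spmv(segments, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive definite system: [[4, 1], [1, 3]] x = [1, 2]
segments = split_rows([0, 2, 4], [0, 1, 0, 1], [4.0, 1.0, 1.0, 3.0], num_segments=2)
solution = conjugate_gradient(segments, [1.0, 2.0])
```

In this sketch the sparse matrix-vector product is the only step that touches the matrix, so it is the natural place to keep each core's rows in nearby memory; the vector updates and dot products could be partitioned along the same row boundaries.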
Hemstad, Jacob, "Modeling a non-uniform memory access architecture for optimizing conjugate gradient performance with sparse matrices" (2013). Honors Theses. 59.