Reducing Communication in Sparse Iterative and Direct Solvers

Research topic and goals

The focus of this collaboration is the development of topology-aware sparse solvers at extreme scale. The first phase took initial steps toward a communication model. Communication costs dominate most sparse solvers, leading to sharp strong scaling limitations. The topology of the computation plays an important role in these costs, yet current sparse solvers, such as algebraic multigrid, do not take it into account. Since algebraic methods only approximate a solution, there is an opportunity to reduce communication costs by limiting algebraic decisions to efficient paths in the topology. Moreover, aggregating communication at the node level can yield additional performance gains. The primary goal is to develop solvers in this direction. The partnership between Illinois and INRIA through the JointLab will directly enhance this effort.

The project involves Illinois graduate student Amanda Bienz, Illinois professors Bill Gropp and Luke Olson, and INRIA professor Laura Grigori. The work has highlighted several aspects of moving toward topology-based algorithmic decisions. Here, we investigated the communication overhead of an algebraic multigrid method at scale, where coarse grids push the strong scaling limit and exhibit irregular memory access patterns; the resulting high communication costs reduce efficiency. We developed an alpha-beta-type communication model that accounts for two aspects of the communication: message distance and message contention in the network. The result is a model that may help make point-wise decisions in the algorithm. For example, sparse entries could be strengthened or weakened depending on their communication burden.
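The flavor of such a model can be sketched as follows. This is a minimal illustration only, not the model from the published work: the function name, default constants, and the particular way distance and contention enter the cost are all assumptions made for the example.

```python
# Hypothetical alpha-beta-type cost model: per-message latency is scaled by
# the hop distance of the message, and effective bandwidth is reduced by a
# contention penalty on shared network links.

def message_cost(size_bytes, hops, contention, alpha=1e-6, beta=1e-9):
    """Estimate the cost (seconds) of one message.

    size_bytes : message size in bytes
    hops       : number of network links the message traverses
    contention : number of messages sharing the busiest link (>= 1)
    alpha      : per-message latency (seconds), illustrative value
    beta       : per-byte transfer cost (seconds), illustrative value
    """
    # Latency grows with distance; bandwidth is shared under contention.
    return alpha * hops + beta * size_bytes * contention

# Compare a nearby, uncontended message with an equally sized message that
# travels farther through a congested part of the network.
near = message_cost(8000, hops=1, contention=1)
far = message_cost(8000, hops=6, contention=4)
```

Under such a model, two nonzero entries requiring the same message size can carry very different communication burdens, which is what would drive the point-wise strengthening or weakening decisions mentioned above.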

Results for 2015/2016

Recent efforts have focused on overlapping communication with computation in the sparse matrix-vector multiplication (SpMV) operation. This work led to an asynchronous SpMV operation that was presented at Supercomputing (SC15) in November 2015 and at the Copper Mountain Conference on Iterative Methods in March 2016.
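The standard structure behind this overlap can be sketched as below, assuming a simple column-wise split of each process's rows into an on-process block and an off-process block. The communication is only simulated here; in an actual solver the placeholders would be non-blocking MPI calls.

```python
import numpy as np

# Sketch of overlapping communication with computation in a parallel SpMV.
# Each process's rows of A are split into A_local (columns for locally owned
# vector entries) and A_offproc (columns for entries owned by other
# processes). The local product is computed while off-process entries are
# "in flight".

def overlapped_spmv(A_local, x_local, A_offproc, recv_offproc):
    # 1. "Post" receives for off-process entries (stands in for MPI_Irecv).
    x_offproc_pending = recv_offproc

    # 2. Compute with locally owned data while messages are in flight.
    y = A_local @ x_local

    # 3. "Wait" for off-process entries (stands in for MPI_Waitall),
    #    then finish the product.
    x_offproc = x_offproc_pending
    y += A_offproc @ x_offproc
    return y

# Tiny example: two local rows, one off-process vector entry.
A_local = np.array([[2.0, 0.0], [0.0, 3.0]])
A_offproc = np.array([[1.0], [1.0]])
x_local = np.array([1.0, 1.0])
x_offproc = np.array([5.0])
y = overlapped_spmv(A_local, x_local, A_offproc, x_offproc)
```

The benefit depends on the local block being large enough to hide the message latency, which is exactly where the communication model above becomes useful.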

Results for 2016/2017

The work was extended to aggregate communication on the nodes and was presented at SC16 and at the Copper Mountain Conference on Multigrid Methods in 2017.
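The core idea of node-level aggregation can be illustrated with the sketch below: messages bound for different processes on the same node are combined, so each pair of nodes exchanges one larger message instead of many small ones. The rank-to-node mapping and the helper name are assumptions for the example.

```python
from collections import defaultdict

# Sketch of node-level message aggregation: rather than one message per
# destination rank, payloads for ranks on the same node are combined into a
# single node-level message. Assumes a block mapping of ranks to nodes.

def aggregate_by_node(messages, ppn):
    """messages: dict mapping destination rank -> list of values.
    ppn: processes per node. Returns dict mapping node id -> combined list."""
    by_node = defaultdict(list)
    for rank in sorted(messages):
        by_node[rank // ppn].extend(messages[rank])
    return dict(by_node)

# Four per-rank messages collapse to two node-level messages (ppn = 2).
msgs = {0: [1.0], 1: [2.0], 2: [3.0], 3: [4.0]}
agg = aggregate_by_node(msgs, ppn=2)
```

Fewer, larger messages shift cost from the latency (alpha) term to the bandwidth (beta) term, which is typically favorable on coarse multigrid levels dominated by many small messages.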

The next steps of the collaboration will look at extending the topology-aware approach to sparse algorithms at INRIA.

Visits and meetings

Completed: JointLab Meeting in Barcelona, June 2015.

Completed: JointLab Meeting in Bonn, December 2015.

Completed: Amanda Bienz visit to INRIA, Spring 2016 for 4 months.

Planned: JointLab Meeting in Champaign, July 2017.

Impact and publications

Future plans

Going forward, we have outlined a plan to work with a broader collection of sparse solvers by collaborating with the expertise at INRIA. We anticipate continued discussions at upcoming JointLab meetings.

References

    1. Bienz, Amanda, William D. Gropp, and Luke N. Olson. 2017. “TAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication.” SIAM Journal On Scientific Computing.
      @article{BienzEtAl2017,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        journal = {SIAM Journal on Scientific Computing},
        note = {submitted},
        title = {TAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication},
        year = {2017}
      }
      
    2. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2016. “Reducing Parallel Communication In Algebraic Multigrid through Sparsification.” SIAM Journal On Scientific Computing 38 (5): S332–S357. doi:10.1137/15M1026341.
      @article{BienzEtAl2016,
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        doi = {10.1137/15M1026341},
        journal = {SIAM Journal on Scientific Computing},
        number = {5},
        pages = {S332-S357},
        title = {Reducing Parallel Communication in Algebraic Multigrid through Sparsification},
        volume = {38},
        year = {2016}
      }