Reducing Communication in Sparse Iterative and Direct Solvers

Research topic and goals

The focus of this collaboration is on the development of topology-aware sparse solvers at extreme scales. The first phase made some initial steps in forming a communications model. Communication costs dominate most sparse solvers leading to sharp strong scaling limitations. The specific topology of the computation plays an important role in these costs and current sparse solvers do not take this into account — e.g. algebraic multigrid. Since algebraic methods only approximate a solution, there is an opportunity to reduce communication costs by limiting algebraic decisions to efficient paths in the topology. Moreover, aggregating node-level communication can lead to additional performance gains. The primary goal is to develop solvers in this direction. The partnership between Illinois and INRIA through the JointLab will directly enhance this effort.

The project involves Illinois graduate student Amanda Bienz, Illinois professors Bill Gropp and Luke Olson, and INRIA professor Laura Grigori. The work has highlighted several aspects in moving toward topology-based algorithm decisions. Here, we investigated the communication overhead of an algebraic multigrid method at scale, where coarse grids push the strong scaling limit and exhibit irregular memory access patterns. As a result, high communication lead to reduced efficiency. We continue to develop a communication model that attempts to account for two aspects in the communication: message distance and message contention in the network. The result is a model that may help make point-wise decisions in the algorithm. For example, sparse entries could be strengthened or weakened depending on their communication burden.

Results for 2015/2016

Recent efforts have focused on overlapping communication with computation in the sparse matrix-vector multiplication operation. This work led to an asynchronous SpMV operation that was presented at Supercomputing15 in November 2015 as well as the Copper Mountain Conference on Iterative Methods in March

Results for 2016/2017

The work was extended to aggregate communication at on the nodes and was presented at SC16 and at Copper Multigrid 2017.

The next steps of the collaboration will look at adding pushing the topology aware process to sparse algorithms at INRIA.

Results for 2017/2018

The work resulted in presentations at the 7th JLESC Workshop in July, 2017, along with presentation of multigrid and sparse-matrix multiplication results at SC17 and Copper Mountain Multigrid (2017) and Copper Mountain Multigrid (2018). In addition, the main code Raptor, was released as open source.

Grigori (INRIA) serves on Bienz’s thesis committee; Bienz is expected to finish in 2018.

The next steps include completing contention modeling.

Visits and meetings

Completed: JointLab Meeting in Barcelona, June 2015.

Completed: JointLab Meeting in Bonn, December 2015.

Completed: Amanda Bienz visit to INRIA, Spring 2016 for 4 months.

Completed: JointLab Meeting in Champaign, July 2017.

Planned: JointLab Meeting in Barcelona, April 2018.

Impact and publications

    Future plans

    Going forward, we have outlined a plan to work with a broader collection of sparse solvers by collaborating with the expertise at INRIA. We continue to anticipate meetings at the upcoming Joint Lab meetings.

    References

    1. Bienz, Amanda, William D. Gropp, and Luke N. Olson. 2018. “Node Aware Sparse Matrix-Vector Multiplication.” (Submitted). https://arxiv.org/abs/1612.08060.
      @article{BienzEtAl2018a,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        title = {Node Aware Sparse Matrix-Vector Multiplication},
        journal = {(submitted)},
        url = {https://arxiv.org/abs/1612.08060},
        year = {2018}
      }
      
    2. ———. 2018. “NAPSpGEMM: Node Aware Sparse Matrix-Matrix Multiplication.” (In Progress).
      @article{BienzEtAl2018b,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        title = {NAPSpGEMM: Node Aware Sparse Matrix-Matrix Multiplication},
        journal = {(In progress)},
        year = {2018}
      }
      
    3. Bienz, Amanda. 2018. “Parallel Algebraic Multigrid with Node-Aware Communication.” Student Paper Competition, Copper Mountain Conference.
      @article{Bienz2018,
        author = {Bienz, Amanda},
        title = {Parallel Algebraic Multigrid with Node-Aware Communication},
        journal = {Student Paper Competition, Copper Mountain
                Conference},
        year = {2018}
      }
      
    4. Bienz, Amanda, William D. Gropp, and Luke N. Olson. 2017. “NAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication.” SIAM Journal on Scientific Computing.
      @article{BienzEtAl2017,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        journal = {SIAM Journal on Scientific Computing},
        notes = {submitted},
        title = {NAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication},
        year = {2017}
      }
      
    5. Bienz, Amanda, and Luke N. Olson. 2017. “RAPtor: Parallel Algebraic Multigrid v0.1.” https://github.com/lukeolson/raptor.
      @misc{BienzOlson2017,
        author = {Bienz, Amanda and Olson, Luke N.},
        title = {{RAPtor}: parallel algebraic multigrid v0.1},
        year = {2017},
        url = {https://github.com/lukeolson/raptor},
        note = {Release 0.1}
      }
      
    6. Bienz, Amanda, William Gropp, and Luke N. Olson. 2017. “Reducing Parallel Communication Costs in Algebraic Multigrid.” http://meetings.siam.org/sess/dsp_talk.cfm?p=82795.
      @unpublished{BienzEtAl2017b,
        title = {Reducing Parallel Communication Costs in Algebraic Multigrid},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = feb,
        year = {2017},
        note = {SIAM Conference on Computational Science and Engineering},
        url = {http://meetings.siam.org/sess/dsp_talk.cfm?p=82795}
      }
      
    7. ———. 2017. “Reducing Communication Costs in Sparse Matrix-Vector Multiplication.” http://easychair.org/smart-program/Copper2017/2017-03-27.html#talk:36549.
      @unpublished{BienzEtAl2017c,
        title = {Reducing Communication Costs in Sparse Matrix-Vector Multiplication},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = mar,
        year = {2017},
        note = {$18^{\texttt{th}}$ Copper Mountain Conference on Multigrid Methods},
        url = {http://easychair.org/smart-program/Copper2017/2017-03-27.html#talk:36549}
      }
      
    8. ———. 2017. “Reducing Communication Costs in Parallel Algebraic Multigrid.” http://sc17.supercomputing.org/presentation/?id=drs106&sess=sess282.
      @unpublished{BienzEtAl2017d,
        title = {Reducing Communication Costs in Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2017},
        note = {Doctoral Showcase at Supercomputing 2016},
        url = {http://sc17.supercomputing.org/presentation/?id=drs106&sess=sess282}
      }
      
    9. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2016. “Reducing Parallel Communication in Algebraic Multigrid through Sparsification.” SIAM Journal on Scientific Computing 38 (5): S332–S357. doi:10.1137/15M1026341.
      @article{BienzEtAl2016,
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        doi = {10.1137/15M1026341},
        journal = {SIAM Journal on Scientific Computing},
        number = {5},
        pages = {S332-S357},
        title = {Reducing Parallel Communication in Algebraic Multigrid through Sparsification},
        volume = {38},
        year = {2016}
      }
      
    10. Bienz, Amanda, Jon Calhoun, William Gropp, and Luke N. Olson. 2016. “Hiding Communication Costs in SpMVs and Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2016/abstract/bienz_amanda___097808/.
      @unpublished{BienzEtAl2018,
        title = {Hiding Communication Costs in SpMVs and Algebraic Multigrid},
        author = {Bienz, Amanda and Calhoun, Jon and Gropp, William and Olson, Luke N.},
        month = mar,
        year = {2016},
        note = {$14^{\texttt{th}}$ Copper Mountain Conference on Iterative Methods},
        url = {http://grandmaster.colorado.edu/copper/2016/abstract/bienz_amanda___097808/}
      }
      
    11. Bienz, Amanda, and Luke N. Olson. 2016. “Topology-Aware Performance Modeling in Parallel SpMVs.” http://meetings.siam.org/sess/dsp_talk.cfm?p=75934.
      @unpublished{BienzOlson2016,
        title = {Topology-Aware Performance Modeling in Parallel SpMVs},
        author = {Bienz, Amanda and Olson, Luke N.},
        month = nov,
        year = {2016},
        note = {$17^{\texttt{th}}$ SIAM Conference on Parallel Processing for Scientific Computing},
        url = {http://meetings.siam.org/sess/dsp_talk.cfm?p=75934}
      }
      
    12. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2016. “Reducing Communication in Sparse Iterative and Direct Solvers.” https://jlesc.github.io/downloads/5th-workshop/talks.pdf.
      @unpublished{BienzEtAl2016b,
        title = {Reducing Communication in Sparse Iterative and Direct Solvers},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = jun,
        year = {2016},
        note = {$5^{\texttt{th}}$ JLESC Workshop},
        url = {https://jlesc.github.io/downloads/5th-workshop/talks.pdf}
      }
      
    13. Bienz, Amanda, William Gropp, and Luke N. Olson. 2016. “Reducing Communication Costs in the Parallel SpMV.” http://sc16.supercomputing.org/sc-archive/tech_poster/tech_poster_pages/post211.html.
      @unpublished{BienzEtAl2016c,
        title = {Reducing Communication Costs in the Parallel SpMV},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2016},
        note = {Research Poster at Supercomputing 2016},
        url = {http://sc16.supercomputing.org/sc-archive/tech_poster/tech_poster_pages/post211.html}
      }
      
    14. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2015. “Reducing Communication Costs in Parallel Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2015/abstract/bienz_amanda___094984/.
      @unpublished{BienzEtAl2015,
        title = {Reducing Communication Costs in Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        month = mar,
        year = {2015},
        note = {$17^{\texttt{th}}$ Copper Mountain Conference on Multigrid Methods},
        url = {http://grandmaster.colorado.edu/copper/2015/abstract/bienz_amanda___094984/}
      }
      
    15. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2015. “Topology-Aware Performance Modeling.” https://jlesc.github.io/events/3rd-jlesc-workshop/schedule/.
      @unpublished{BienzEtAl2015b,
        title = {Topology-Aware Performance Modeling},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = jun,
        year = {2015},
        note = {$3^{\texttt{rd}}$ JLESC Workshop},
        url = {https://jlesc.github.io/events/3rd-jlesc-workshop/schedule/}
      }
      
    16. Bienz, Amanda, Jon Calhoun, Luke Olson, Marc Snir, and William D. Gropp. 2015. “Analyzing the Performance of a SpMV for Extreme Scale Computers.” http://sc15.supercomputing.org/schedule/event_detail-evid=post327.html.
      @unpublished{BienzEtAl2015c,
        title = {Analyzing the Performance of a SpMV for Extreme Scale Computers},
        author = {Bienz, Amanda and Calhoun, Jon and Olson, Luke and Snir, Marc and Gropp, William D.},
        month = nov,
        year = {2015},
        note = {Research Poster at Supercomputing 2015},
        url = {http://sc15.supercomputing.org/schedule/event_detail-evid=post327.html}
      }
      
    17. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2015. “Topology-Aware Asynchronous Methods and the Sparse Matrix-Vector Multiply.” http://www.fz-juelich.de/ias/jsc/EN/Expertise/Workshops/Conferences/JLESC-4/Programme/Abstracts/bienz-s.html?nn=1893370.
      @unpublished{BienzEtAl2015d,
        title = {Topology-Aware Asynchronous Methods and the Sparse Matrix-Vector
                Multiply},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2015},
        note = {$4^{\texttt{th}}$ JLESC Workshop},
        url = {http://www.fz-juelich.de/ias/jsc/EN/Expertise/Workshops/Conferences/JLESC-4/Programme/Abstracts/bienz-s.html?nn=1893370}
      }
      
    18. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2014. “Scalability of Non-Galerkin Parallel Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2014/abstract/bienz_amandaj__087692/.
      @unpublished{BienzEtAl2014,
        title = {Scalability of non-Galerkin Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        month = apr,
        year = {2014},
        note = {$13^{\texttt{th}}$ Copper Mountain Conference on Iterative Methods},
        url = {http://grandmaster.colorado.edu/copper/2014/abstract/bienz_amandaj__087692/}
      }
      
    19. Bienz, Amanda, and Luke N. Olson. 2014. “Reducing Network Contention Associated with Parallel Algebraic Multigrid.” http://sc14.supercomputing.org/schedule/event_detail-evid=spost141.html.
      @unpublished{BienzOlson2014,
        title = {Reducing Network Contention Associated with Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Olson, Luke N.},
        month = nov,
        year = {2014},
        note = {ACM Student Research Competition at Supercomputing 2014},
        url = {http://sc14.supercomputing.org/schedule/event_detail-evid=spost141.html}
      }