Reducing Communication in Sparse Iterative and Direct Solvers

Research topic and goals

The focus of this collaboration is on reducing communication in sparse solvers at extreme scales. The first phase made some initial steps in forming a communications model. Communication costs dominate most sparse solvers leading to sharp strong scaling limitations. The specific topology of the computation plays an important role in these costs and current sparse solvers do not take this into account — e.g. algebraic multigrid. Aggregating communication to align with machine topology can lead to performance gains. The primary goal is to develop advanced solvers in this direction. The partnership between Illinois and INRIA through the JointLab will directly enhance this effort.

The project involves Illinois graduate student Shelby Lockhart, postdoc Amanda Bienz, Illinois professors Bill Gropp and Luke Olson, and INRIA professor Laura Grigori. The work has highlighted several aspects in moving toward topology-based algorithm decisions. The next steps in the project will take a closer look at the performance opporunitities in enlarged Krylov methods. Currently we have extended the performance modeling to the ECG algorithm and we are detailing the measured performance.

Results for 2015/2016

Recent efforts have focused on overlapping communication with computation in the sparse matrix-vector multiplication operation. This work led to an asynchronous SpMV operation that was presented at Supercomputing15 in November 2015 as well as the Copper Mountain Conference on Iterative Methods in March 2016.

Results for 2016/2017

The work was extended to aggregate communication at on the nodes and was presented at SC16 and at Copper Multigrid 2017.

The next steps of the collaboration will look at adding pushing the topology aware process to sparse algorithms at INRIA.

Results for 2017/2018

The work resulted in presentations at the 7th JLESC Workshop in July, 2017, along with presentation of multigrid and sparse-matrix multiplication results at SC17 and Copper Mountain Multigrid (2017) and Copper Mountain Multigrid (2018). In addition, the main code Raptor, was released as open source.

Grigori (INRIA) serves on Bienz’s thesis committee; Bienz is expected to finish in 2018.

The next steps include completing contention modeling.

Results for 2018/2019

Recent progress has included a paper in review, significant updates to the code base, and a beta version of a node aware communication package. Bienz completed a PhD in summer 2018, with Grigori (INRIA) serving on the final defense committee. Shelby Lockhart has joined the collaboration and is making some initial steps toward performance analysis of enlarged Krylov methods.

Results for 2019/2020

The recent work as included the extension of the node aware algorithm to the full algebraic multigrid setup and solve phases. In addition, Bienz’s work has been extended to collectives. Lockhart extended the performance analysis to the enlarged Krylov setting and developed a new communication strategy.

Visits and meetings

Completed: JointLab Meeting in Barcelona, June 2015.

Completed: JointLab Meeting in Bonn, December 2015.

Completed: Amanda Bienz visit to INRIA, Spring 2016 for 4 months.

Completed: JointLab Meeting in Champaign, July 2017.

Completed: JointLab Meeting in Barcelona, April 2018.

Completed: JointLab Meeting in Tennessee, April 2019.

Impact and publications

    Future plans

    We continue to anticipate meetings at the upcoming Joint Lab meetings. Future work includes continued work on high performance Kryolv methods.

    References

    1. Bienz, Amanda, William D. Gropp, and Luke N. Olson. 2018. “Node Aware Sparse Matrix-Vector Multiplication.” (Submitted). https://arxiv.org/abs/1612.08060.
      @article{BienzEtAl2018a,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        title = {Node Aware Sparse Matrix-Vector Multiplication},
        journal = {(submitted)},
        url = {https://arxiv.org/abs/1612.08060},
        year = {2018}
      }
      
    2. ———. 2018. “NAPSpGEMM: Node Aware Sparse Matrix-Matrix Multiplication.” (In Progress).
      @article{BienzEtAl2018b,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        title = {NAPSpGEMM: Node Aware Sparse Matrix-Matrix Multiplication},
        journal = {(In progress)},
        year = {2018}
      }
      
    3. Bienz, Amanda. 2018. “Parallel Algebraic Multigrid with Node-Aware Communication.” Student Paper Competition, Copper Mountain Conference.
      @article{Bienz2018,
        author = {Bienz, Amanda},
        title = {Parallel Algebraic Multigrid with Node-Aware Communication},
        journal = {Student Paper Competition, Copper Mountain
                Conference},
        year = {2018}
      }
      
    4. Bienz, Amanda, William D. Gropp, and Luke N. Olson. 2017. “NAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication.” SIAM Journal on Scientific Computing.
      @article{BienzEtAl2017,
        author = {Bienz, Amanda and Gropp, William D. and Olson, Luke N.},
        journal = {SIAM Journal on Scientific Computing},
        notes = {submitted},
        title = {NAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication},
        year = {2017}
      }
      
    5. Bienz, Amanda, and Luke N. Olson. 2017. “RAPtor: Parallel Algebraic Multigrid v0.1.” https://github.com/lukeolson/raptor.
      @misc{BienzOlson2017,
        author = {Bienz, Amanda and Olson, Luke N.},
        title = {{RAPtor}: parallel algebraic multigrid v0.1},
        year = {2017},
        url = {https://github.com/lukeolson/raptor},
        note = {Release 0.1}
      }
      
    6. Bienz, Amanda, William Gropp, and Luke N. Olson. 2017. “Reducing Parallel Communication Costs in Algebraic Multigrid.” http://meetings.siam.org/sess/dsp_talk.cfm?p=82795.
      @unpublished{BienzEtAl2017b,
        title = {Reducing Parallel Communication Costs in Algebraic Multigrid},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = feb,
        year = {2017},
        note = {SIAM Conference on Computational Science and Engineering},
        url = {http://meetings.siam.org/sess/dsp_talk.cfm?p=82795}
      }
      
    7. ———. 2017. “Reducing Communication Costs in Sparse Matrix-Vector Multiplication.” http://easychair.org/smart-program/Copper2017/2017-03-27.html#talk:36549.
      @unpublished{BienzEtAl2017c,
        title = {Reducing Communication Costs in Sparse Matrix-Vector Multiplication},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = mar,
        year = {2017},
        note = {$18^{\texttt{th}}$ Copper Mountain Conference on Multigrid Methods},
        url = {http://easychair.org/smart-program/Copper2017/2017-03-27.html#talk:36549}
      }
      
    8. ———. 2017. “Reducing Communication Costs in Parallel Algebraic Multigrid.” http://sc17.supercomputing.org/presentation/?id=drs106&sess=sess282.
      @unpublished{BienzEtAl2017d,
        title = {Reducing Communication Costs in Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2017},
        note = {Doctoral Showcase at Supercomputing 2016},
        url = {http://sc17.supercomputing.org/presentation/?id=drs106&sess=sess282}
      }
      
    9. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2016. “Reducing Parallel Communication in Algebraic Multigrid through Sparsification.” SIAM Journal on Scientific Computing 38 (5): S332–S357. doi:10.1137/15M1026341.
      @article{BienzEtAl2016,
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        doi = {10.1137/15M1026341},
        journal = {SIAM Journal on Scientific Computing},
        number = {5},
        pages = {S332-S357},
        title = {Reducing Parallel Communication in Algebraic Multigrid through Sparsification},
        volume = {38},
        year = {2016}
      }
      
    10. Bienz, Amanda, Jon Calhoun, William Gropp, and Luke N. Olson. 2016. “Hiding Communication Costs in SpMVs and Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2016/abstract/bienz_amanda___097808/.
      @unpublished{BienzEtAl2018,
        title = {Hiding Communication Costs in SpMVs and Algebraic Multigrid},
        author = {Bienz, Amanda and Calhoun, Jon and Gropp, William and Olson, Luke N.},
        month = mar,
        year = {2016},
        note = {$14^{\texttt{th}}$ Copper Mountain Conference on Iterative Methods},
        url = {http://grandmaster.colorado.edu/copper/2016/abstract/bienz_amanda___097808/}
      }
      
    11. Bienz, Amanda, and Luke N. Olson. 2016. “Topology-Aware Performance Modeling in Parallel SpMVs.” http://meetings.siam.org/sess/dsp_talk.cfm?p=75934.
      @unpublished{BienzOlson2016,
        title = {Topology-Aware Performance Modeling in Parallel SpMVs},
        author = {Bienz, Amanda and Olson, Luke N.},
        month = nov,
        year = {2016},
        note = {$17^{\texttt{th}}$ SIAM Conference on Parallel Processing for Scientific Computing},
        url = {http://meetings.siam.org/sess/dsp_talk.cfm?p=75934}
      }
      
    12. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2016. “Reducing Communication in Sparse Iterative and Direct Solvers.” https://jlesc.github.io/downloads/5th-workshop/talks.pdf.
      @unpublished{BienzEtAl2016b,
        title = {Reducing Communication in Sparse Iterative and Direct Solvers},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = jun,
        year = {2016},
        note = {$5^{\texttt{th}}$ JLESC Workshop},
        url = {https://jlesc.github.io/downloads/5th-workshop/talks.pdf}
      }
      
    13. Bienz, Amanda, William Gropp, and Luke N. Olson. 2016. “Reducing Communication Costs in the Parallel SpMV.” http://sc16.supercomputing.org/sc-archive/tech_poster/tech_poster_pages/post211.html.
      @unpublished{BienzEtAl2016c,
        title = {Reducing Communication Costs in the Parallel SpMV},
        author = {Bienz, Amanda and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2016},
        note = {Research Poster at Supercomputing 2016},
        url = {http://sc16.supercomputing.org/sc-archive/tech_poster/tech_poster_pages/post211.html}
      }
      
    14. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2015. “Reducing Communication Costs in Parallel Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2015/abstract/bienz_amanda___094984/.
      @unpublished{BienzEtAl2015,
        title = {Reducing Communication Costs in Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        month = mar,
        year = {2015},
        note = {$17^{\texttt{th}}$ Copper Mountain Conference on Multigrid Methods},
        url = {http://grandmaster.colorado.edu/copper/2015/abstract/bienz_amanda___094984/}
      }
      
    15. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2015. “Topology-Aware Performance Modeling.” https://jlesc.github.io/events/3rd-jlesc-workshop/schedule/.
      @unpublished{BienzEtAl2015b,
        title = {Topology-Aware Performance Modeling},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = jun,
        year = {2015},
        note = {$3^{\texttt{rd}}$ JLESC Workshop},
        url = {https://jlesc.github.io/events/3rd-jlesc-workshop/schedule/}
      }
      
    16. Bienz, Amanda, Jon Calhoun, Luke Olson, Marc Snir, and William D. Gropp. 2015. “Analyzing the Performance of a SpMV for Extreme Scale Computers.” http://sc15.supercomputing.org/schedule/event_detail-evid=post327.html.
      @unpublished{BienzEtAl2015c,
        title = {Analyzing the Performance of a SpMV for Extreme Scale Computers},
        author = {Bienz, Amanda and Calhoun, Jon and Olson, Luke and Snir, Marc and Gropp, William D.},
        month = nov,
        year = {2015},
        note = {Research Poster at Supercomputing 2015},
        url = {http://sc15.supercomputing.org/schedule/event_detail-evid=post327.html}
      }
      
    17. Bienz, Amanda, Laura Grigori, William Gropp, and Luke N. Olson. 2015. “Topology-Aware Asynchronous Methods and the Sparse Matrix-Vector Multiply.” http://www.fz-juelich.de/ias/jsc/EN/Expertise/Workshops/Conferences/JLESC-4/Programme/Abstracts/bienz-s.html?nn=1893370.
      @unpublished{BienzEtAl2015d,
        title = {Topology-Aware Asynchronous Methods and the Sparse Matrix-Vector
                Multiply},
        author = {Bienz, Amanda and Grigori, Laura and Gropp, William and Olson, Luke N.},
        month = nov,
        year = {2015},
        note = {$4^{\texttt{th}}$ JLESC Workshop},
        url = {http://www.fz-juelich.de/ias/jsc/EN/Expertise/Workshops/Conferences/JLESC-4/Programme/Abstracts/bienz-s.html?nn=1893370}
      }
      
    18. Bienz, Amanda, Robert D. Falgout, William Gropp, Luke N. Olson, and Jacob B. Schroder. 2014. “Scalability of Non-Galerkin Parallel Algebraic Multigrid.” http://grandmaster.colorado.edu/copper/2014/abstract/bienz_amandaj__087692/.
      @unpublished{BienzEtAl2014,
        title = {Scalability of non-Galerkin Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Falgout, Robert D. and Gropp, William and Olson, Luke N. and Schroder, Jacob B.},
        month = apr,
        year = {2014},
        note = {$13^{\texttt{th}}$ Copper Mountain Conference on Iterative Methods},
        url = {http://grandmaster.colorado.edu/copper/2014/abstract/bienz_amandaj__087692/}
      }
      
    19. Bienz, Amanda, and Luke N. Olson. 2014. “Reducing Network Contention Associated with Parallel Algebraic Multigrid.” http://sc14.supercomputing.org/schedule/event_detail-evid=spost141.html.
      @unpublished{BienzOlson2014,
        title = {Reducing Network Contention Associated with Parallel Algebraic Multigrid},
        author = {Bienz, Amanda and Olson, Luke N.},
        month = nov,
        year = {2014},
        note = {ACM Student Research Competition at Supercomputing 2014},
        url = {http://sc14.supercomputing.org/schedule/event_detail-evid=spost141.html}
      }