Iterative and direct parallel linear solvers in hybrid MPI/OpenMP high-performance computational engineering simulations

Research topic and goals

This research action aims to improve the parallel scalability and robustness of Alya, the hybrid MPI/OpenMP high-performance computational code developed at BSC, by using the parallel linear solvers designed at Inria. In that framework, both parallel sparse direct and hybrid iterative/direct linear solvers will be integrated into the Alya code to study their performance and identify possible bottlenecks. This action will also contribute to the definition of the global API for the solver stack currently developed at Inria, which will ease the integration and testing of the linear solvers in any large simulation code.

Results for 2016/2017

A database of test cases has been created on MareNostrum. This database includes a series of representative examples to test unsymmetric, symmetric, and symmetric positive definite (SPD) matrices. In addition, different mesh topologies have been considered to assess the effects of mesh anisotropy, computational domain elongation, etc. Benchmarking is currently being carried out to compare Alya's internal solvers with MaPHyS.

Results for 2017/2018

  1. Full integration of the Inria solvers with their current individual API in the Alya code.
  2. Scalability studies carried out for CFD and solid mechanics applications.
  3. New release of the PaStiX solver (version 6.0): we have presented two approaches that use a Block Low-Rank (BLR) compression technique to reduce the memory footprint and/or the time-to-solution of the sparse supernodal solver PaStiX (see (Pichon et al. 2017) and (Pichon et al. 2017)). Thanks to this compression technique, we were able to solve a system with 1 billion unknowns (a 3D Laplacian matrix on a 100 x 100 x 100,000 grid) on a single node with 3 TB of memory. The factorization of this system took less than 6 hours on 96 cores, and the precision achieved at the first solve was 10e-5. With 10 additional iterative refinement steps, we easily reached 10e-8 in double precision. The cost of one solve was limited to 280 seconds. We saved 9 TB of the 11 TB that the direct solver would otherwise require. The latest release of the software includes these implementations, and the parameters are documented at https://gitlab.inria.fr/solverstack/pastix.
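
The accuracy recovery mentioned in item 3 (an approximate first solve followed by iterative refinement steps) can be illustrated with a minimal Python sketch. It does not use PaStiX or its API. As an assumption made purely for illustration, the BLR-compressed factorization is replaced by an exact tridiagonal solve with a slightly perturbed 1D Laplacian, and the names `apply_laplacian`, `approx_solve`, `refine` and the perturbation `eps` are hypothetical.

```python
import math


def apply_laplacian(x):
    """Return A @ x for the 1D Laplacian A = tridiag(-1, 2, -1)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def approx_solve(rhs, eps):
    """Solve M z = rhs with M = tridiag(-1, 2 * (1 + eps), -1).

    M differs slightly from A, so this solve stands in for an approximate
    factorization (illustrative assumption, not an actual BLR compression).
    Uses the Thomas algorithm for tridiagonal systems.
    """
    n = len(rhs)
    diag = 2.0 * (1.0 + eps)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag + c[i - 1]  # diag - lower * c[i - 1], with lower = -1
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


def relative_residual(x, b):
    """Return ||b - A x||_2 / ||b||_2."""
    r = [bi - yi for bi, yi in zip(b, apply_laplacian(x))]
    return math.sqrt(sum(v * v for v in r)) / math.sqrt(sum(v * v for v in b))


def refine(b, steps, eps=1e-3):
    """Approximate first solve, then `steps` iterative refinement steps."""
    x = approx_solve(b, eps)
    history = [relative_residual(x, b)]
    for _ in range(steps):
        r = [bi - yi for bi, yi in zip(b, apply_laplacian(x))]  # residual with exact A
        dx = approx_solve(r, eps)                               # correction with approximate M
        x = [xi + di for xi, di in zip(x, dx)]
        history.append(relative_residual(x, b))
    return x, history


if __name__ == "__main__":
    _, history = refine([1.0] * 20, steps=10)
    for k, res in enumerate(history):
        print(f"step {k}: relative residual = {res:.3e}")
```

Each refinement step reuses the same approximate solve, so its cost is that of one extra solve plus one matrix-vector product. This is why a low-accuracy factorization followed by a few refinement steps can be an attractive trade-off when the factorization dominates memory and time.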

Visits and meetings

  • Guillaume Houzeaux (BSC) and Mariano Vázquez (BSC) met the Inria team in Bordeaux, 14-15 Oct. 2016.
  • Guillaume Houzeaux (BSC) met the Inria team in Bordeaux, 24-26 Feb. 2016.
  • The Inria team met the BSC team in Barcelona, Nov. 2016.
  • Guillaume Houzeaux (BSC) visited Inria, Nov. 2017.

Impact and publications

None yet.

Future plans

We intend to complete the full integration of the Inria solvers, with their current individual APIs, in the Alya code so that scalability studies can be performed on different applications representative of Alya simulations (incompressible/compressible fluids, structural mechanics). We expect some of these studies to reveal numerical or software features that merit further study.

References

    1. Pichon, G., M. Faverge, P. Ramet, and J. Roman. 2017. “Reordering Strategy for Blocking Optimization in Sparse Linear Solvers.” SIAM Journal on Matrix Analysis and Applications 38 (1). Society for Industrial and Applied Mathematics: 226–48. doi:10.1137/16M1062454.
      @article{pichon:hal-01485507,
        title = {{Reordering Strategy for Blocking Optimization in Sparse Linear Solvers}},
        author = {Pichon, G. and Faverge, M. and Ramet, P. and Roman, J.},
        url = {https://hal.inria.fr/hal-01485507},
        journal = {{SIAM Journal on Matrix Analysis and Applications}},
        publisher = {{Society for Industrial and Applied Mathematics}},
        volume = {38},
        number = {1},
        pages = {226--248},
        year = {2017},
        doi = {10.1137/16M1062454},
        keywords = {Sparse},
        pdf = {https://hal.inria.fr/hal-01485507/file/M106245.pdf},
        hal_id = {hal-01485507},
        hal_version = {v2}
      }
      
    2. Pichon, G., E. Darve, M. Faverge, P. Ramet, and J. Roman. 2017. “Sparse Supernodal Solver Using Block Low-Rank Compression.” In 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017). Orlando, United States. https://hal.inria.fr/hal-01502215.
      @inproceedings{pichon:hal-01502215,
        title = {{Sparse Supernodal Solver Using Block Low-Rank Compression}},
        author = {Pichon, G. and Darve, E. and Faverge, M. and Ramet, P. and Roman, J.},
        url = {https://hal.inria.fr/hal-01502215},
        booktitle = {{18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017)}},
        address = {Orlando, United States},
        year = {2017},
        month = jun,
        keywords = {Sparse},
        pdf = {https://hal.inria.fr/hal-01502215/file/blr-final.pdf},
        hal_id = {hal-01502215},
        hal_version = {v1}
      }
      
    3. Vázquez, M., G. Houzeaux, S. Koric, A. Artigues, J. Aguado-Sierra, R. Arís, D. Mira, et al. 2015. “Alya: Multiphysics Engineering Simulation Towards Exascale.” J. Comput. Sci.
      @article{VazquezEtAl2015,
        author = {V\'azquez, M. and Houzeaux, G. and Koric, S. and Artigues, A. and Aguado-Sierra, J. and Ar\'{\i}s, R. and Mira, D. and Calmet, H. and Cucchietti, F. and Owen, H. and Taha, A. and Burness, E.D. and Cela, J.M. and Valero, M.},
        journal = {J. Comput. Sci.},
        keywords = {Alya},
        title = {Alya: Multiphysics Engineering Simulation Towards Exascale},
        year = {2015}
      }
      
    4. Houzeaux, G, R Aubry, and M Vázquez. 2011. “Extension of Fractional Step Techniques for Incompressible Flows: The Preconditioned Orthomin(1) for the Pressure Schur Complement.” Comput. & Fluids 44: 297–313.
      @article{HouzeauxEtAl2011,
        author = {Houzeaux, G and Aubry, R and V\'azquez, M},
        journal = {Comput. \& Fluids},
        keywords = {Orthomin(1) iteration},
        pages = {297--313},
        title = {Extension of fractional step techniques for incompressible flows: The preconditioned Orthomin(1) for the pressure Schur complement},
        volume = {44},
        year = {2011}
      }
      
    5. Houzeaux, G, M Vázquez, R Aubry, and JM Cela. 2009. “A Massively Parallel Fractional Step Solver for Incompressible Flows.” J. Comp. Phys 228 (17): 6316–32.
      @article{HouzeauxEtAl2009,
        author = {Houzeaux, G and V\'azquez, M and Aubry, R and Cela, JM},
        journal = {J. Comp. Phys},
        number = {17},
        pages = {6316--6332},
        title = {A Massively Parallel Fractional Step Solver for Incompressible Flows},
        volume = {228},
        year = {2009}
      }