Enhancing Interoperability in Task-based Programming Models through Common Low-Level Interfaces

  • Head
  • Beltran Vicenc (BSC)
  • Members
  • Alvarez David (BSC)
  • Aumage Olivier (INRIA)
  • Herault Thomas (INRIA)
  • Kale Sanjay (UIUC)

Research topic and goals

Task-based programming models are a promising approach to exploiting complex distributed and heterogeneous systems. They have recently emerged as a viable alternative to traditional message-passing and fork-join approaches, and interest from the research community is growing. Despite these advantages, adoption in the scientific and industrial sectors remains limited. A major barrier is the poor interoperability among existing runtime systems, which makes it hard to build complex applications by combining components or libraries written in different task-based programming models. When several runtimes share a node without coordinating their use of cores, the result is often oversubscription, which significantly degrades application performance.

Maintainability is another significant challenge for task-based programming models, and projects encounter it at various stages. The relatively modest level of effort expected once a project reaches a steady state raises questions about how support for these models can be sustained. Could organizing the software around shared code bases for sub-components make maintenance easier? Such an approach could streamline support and improve the long-term viability of task-based systems.

Our project aims to address these challenges by identifying the low-level primitives necessary for building efficient and scalable task-based runtime systems. Based on these findings, we will develop unified low-level tasking interfaces to enhance interoperability among different runtime systems, optimizing performance and encouraging broader adoption of task-based programming models.

Results for 2024/2025

Since the project’s inception in September 2024, we have established a monthly video call to coordinate activities and ensure continuous progress.

Our work is currently organized around two main objectives:

  • Survey and Documentation of Runtime System Components: We are discussing and documenting key components of various runtime systems, including schedulers, dependency management, and communication libraries. This effort aims to identify commonalities and opportunities for cross-runtime interoperability. We plan to consolidate our findings into a survey paper that highlights similarities and differences among existing runtime systems and proposes reusable components that could benefit multiple implementations.
  • Porting StarPU and PaRSEC to nOS-V: We have begun adapting StarPU and PaRSEC to run on top of nOS-V. A working prototype of StarPU/nOS-V has already been developed. To advance this effort, we have secured a summer internship for 2025 in collaboration with Inria and BSC. The internship will focus on improving the integration and evaluating the performance of StarPU and PaRSEC on top of nOS-V.

Visits and meetings

  • We held a BoF session at the Kobe 2024 workshop, where we decided to start this project.
  • We are organizing another BoF session at the upcoming Argonne 2025 workshop.
  • We have planned a summer internship from June to August 2025, from BSC to Inria.

Impact and publications

  • Solve interoperability problems between task-based programming models
  • Increase software development productivity and reach of task-based programming models
  • Research and document what higher-level programming systems (Charm++, HPX, Legion, OpenMP, OmpSs, etc.) need from task-based execution systems
  • Explore the possibility of creating a standard that covers the different use cases currently served, either through a flexible design or by allowing extensions

References

  1. Álvarez, David, Kevin Sala, and Vicenç Beltran. 2024. “nOS-V: Co-Executing HPC Applications Using System-Wide Task Scheduling.” In IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024, San Francisco, CA, USA, May 27-31, 2024, 312–24. IEEE. https://doi.org/10.1109/IPDPS57955.2024.00035.
    @inproceedings{DBLP:conf/ipps/0006S024,
      author = {{\'{A}}lvarez, David and Sala, Kevin and Beltran, Vicen{\c{c}}},
      title = {{nOS-V}: Co-Executing {HPC} Applications Using System-Wide Task Scheduling},
      booktitle = {{IEEE} International Parallel and Distributed Processing Symposium,
                        {IPDPS} 2024, San Francisco, CA, USA, May 27-31, 2024},
      pages = {312--324},
      publisher = {{IEEE}},
      year = {2024},
      url = {https://doi.org/10.1109/IPDPS57955.2024.00035},
      doi = {10.1109/IPDPS57955.2024.00035},
      timestamp = {Wed, 17 Jul 2024 15:59:37 +0200},
      biburl = {https://dblp.org/rec/conf/ipps/0006S024.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }
    
  2. Bosilca, George, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Pierre Lemarinier, and Jack Dongarra. 2011. “DAGuE: A Generic Distributed DAG Engine for High Performance Computing.” 1151–58. Anchorage, Alaska, USA: IEEE.
    @inproceedings{icl:675,
      title = {DAGuE: A Generic Distributed DAG Engine for High Performance Computing},
      year = {2011},
      pages = {1151--1158},
      publisher = {IEEE},
      address = {Anchorage, Alaska, USA},
      keywords = {dague, parsec},
      author = {Bosilca, George and Bouteiller, Aurelien and Danalis, Anthony and Herault, Thomas and Lemarinier, Pierre and Dongarra, Jack}
    }
    
  3. Augonnet, Cédric, Samuel Thibault, Raymond Namyst, and Pierre-André Wacrenier. 2009. “StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures.” In Euro-Par 2009 Parallel Processing, edited by Henk Sips, Dick Epema, and Hai-Xiang Lin, 863–74. Berlin, Heidelberg: Springer Berlin Heidelberg.
    @inproceedings{10.1007/978-3-642-03869-3_80,
      author = {Augonnet, C{\'e}dric and Thibault, Samuel and Namyst, Raymond and Wacrenier, Pierre-Andr{\'e}},
      editor = {Sips, Henk and Epema, Dick and Lin, Hai-Xiang},
      title = {StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures},
      booktitle = {Euro-Par 2009 Parallel Processing},
      year = {2009},
      publisher = {Springer Berlin Heidelberg},
      address = {Berlin, Heidelberg},
      pages = {863--874},
      isbn = {978-3-642-03869-3}
    }
    
  4. Kale, L.V., M. Bhandarkar, N. Jagathesan, S. Krishnan, and J. Yelon. 1996. “Converse: an Interoperable Framework for Parallel Programming.” In Proceedings of International Conference on Parallel Processing, 212–17. https://doi.org/10.1109/IPPS.1996.508060.
    @inproceedings{508060,
      author = {Kale, L.V. and Bhandarkar, M. and Jagathesan, N. and Krishnan, S. and Yelon, J.},
      booktitle = {Proceedings of International Conference on Parallel Processing},
      title = {Converse: an interoperable framework for parallel programming},
      year = {1996},
      pages = {212--217},
      keywords = {Parallel programming;Object oriented programming;Parallel languages;Runtime},
      doi = {10.1109/IPPS.1996.508060}
    }