- Bethel, E Wes;
- Loring, Burlen;
- Ayachit, Utkarsh;
- Duque, Earl PN;
- Ferrier, Nicola;
- Insley, Joseph;
- Gu, Junmin;
- Kress, James;
- O’Leary, Patrick;
- Pugmire, Dave;
- Rizzi, Silvio;
- Thompson, David;
- Usher, Will;
- Weber, Gunther H;
- Whitlock, Brad;
- Wolf, Matthew;
- Wu, Kesheng
In high-performance parallel in situ processing, the term in transit processing refers to configurations where data must move from a producer to a consumer running on separate resources. In the context of parallel and distributed computing on an HPC platform, one of the central challenges is determining a mapping of data from producer ranks to consumer ranks. This problem is complicated by the heterogeneity that arises in producer-consumer pairs, such as when producer and consumer codes have different levels of concurrency, different scaling characteristics, or different data models. The resulting mapping and movement of data from M producer to N consumer ranks can have a significant impact on aggregate application performance, particularly when the data consumer requires only a subset of the overall data for its task. This chapter focuses on the design considerations that underlie SENSEI’s approach to this challenging problem. These considerations extend the core SENSEI architecture and include the need to accommodate flexibility in the choice of partitioning method, the ability for a data consumer to request and receive only the subset of data needed for its particular operation, and the ability to leverage any of several data transport tools. The idea of proximity portability, that is, being able to use different data transport methods as part of an in transit workflow, is illustrated with three different transport layers, where switching from one transport tool to another requires only a configuration file change. The chapter also includes a performance analysis summary showing the gains that optimized partitioners make possible in an in transit setting, measured by multiple metrics such as memory footprint, time to solution, and amount of data moved.
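To make the M-to-N mapping problem concrete, the following is a minimal sketch, not SENSEI’s actual partitioner API, of one simple partitioning strategy: assigning the blocks held by M producer ranks to N consumer ranks in contiguous, nearly equal groups. The function name and return shape are illustrative assumptions only.

```python
# Illustrative block partitioner (hypothetical, not SENSEI's API):
# assign M producer ranks to N consumer ranks in contiguous,
# nearly equal groups, so each consumer pulls data from a bounded,
# predictable set of producers.

def block_partition(num_producers, num_consumers):
    """Map each consumer rank to the list of producer ranks
    whose data it will receive."""
    mapping = {c: [] for c in range(num_consumers)}
    for p in range(num_producers):
        # Integer arithmetic divides producers 0..M-1 into N
        # contiguous groups whose sizes differ by at most one.
        c = p * num_consumers // num_producers
        mapping[c].append(p)
    return mapping

# Example: 8 producer ranks feeding 3 consumer ranks.
print(block_partition(8, 3))
# {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7]}
```

A real in transit partitioner would also account for data size per block, consumer-side subsetting requests, and network topology, which is why the chapter argues for making the partitioning method a pluggable choice rather than a fixed policy.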