Data Understanding


Contact name: Ken Moreland

Contact email:

VTK-m is a toolkit of scientific visualization algorithms for emerging processor architectures. One of the biggest recent changes in high-performance computing is the increasing use of accelerators, which replicate and group many small processing cores to deliver a very high computation rate at much lower power. Current and future CPUs also require much more explicit parallelism. VTK-m supports the fine-grained concurrency that data analysis and visualization algorithms need at extreme scale by providing abstract models for data and execution that apply to a variety of algorithms across many different processor architectures. In addition to serving as a repository of efficient implementations of visualization algorithms, VTK-m’s framework simplifies visualization development for multicore processors and automatically ports these algorithms across different processor types.
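The idea of separating an algorithm from the way it is executed can be illustrated with a small sketch. This is not VTK-m code and does not use the VTK-m API; the names `serial_map`, `threaded_map`, `BACKENDS`, `magnitude`, and `run` are hypothetical and chosen only for illustration. The per-element operation is written once, and the execution backend is selected separately.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_map(op, *arrays):
    """Hypothetical backend: apply op to each element tuple, one at a time."""
    return [op(*args) for args in zip(*arrays)]


def threaded_map(op, *arrays, workers=4):
    """Hypothetical backend: apply op to each element tuple using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the inputs.
        return list(pool.map(op, *arrays))


# Illustrative registry of execution backends (an assumption, not a real API).
BACKENDS = {"serial": serial_map, "threads": threaded_map}


def magnitude(x, y, z):
    """Per-element operation: length of one 3D vector."""
    return (x * x + y * y + z * z) ** 0.5


def run(op, backend, *arrays):
    """Run the same per-element operation on whichever backend is named."""
    return BACKENDS[backend](op, *arrays)


if __name__ == "__main__":
    xs, ys, zs = [3.0, 0.0], [4.0, 0.0], [0.0, 2.0]
    print(run(magnitude, "serial", xs, ys, zs))
    print(run(magnitude, "threads", xs, ys, zs))
```

The operation `magnitude` never mentions threads or devices, so switching backends does not require changing the algorithm itself. A real toolkit would additionally manage device memory and array layouts, which this sketch omits.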



Contact name: Tom Peterka

Contact email:

DIY is an open-source package of scalable building blocks for data movement tailored to the needs of large-scale parallel analysis workloads. Scalable, parallel analysis of data-intensive computational science relies on decomposing a problem into a large number of data-parallel subproblems, exchanging data efficiently among them, and transporting data between them and the memory/storage hierarchy. The abstraction enabling these capabilities is block parallelism: blocks and their message queues are mapped onto processing elements (MPI processes or threads) and are migrated between memory and storage by the DIY runtime. DIY supports distributed- and shared-memory parallel algorithms that can run both in- and out-of-core with the same code, with one or more threads per MPI process, and with one or more data blocks resident in memory. Computational scientists, data analysis researchers, and visualization tool builders can all benefit from these tools.
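Block parallelism as described above can be sketched in a few lines. The following is an illustration only, not the DIY API: the `Block` class and the functions `decompose`, `assign_round_robin`, `exchange_partial_sums`, and `reduce_at_root` are hypothetical names, and the whole example runs in a single process with no MPI or out-of-core storage.

```python
from collections import defaultdict


class Block:
    """One unit of the decomposition: local data plus an incoming message queue."""

    def __init__(self, gid, data):
        self.gid = gid          # global block id
        self.data = list(data)  # data owned by this block
        self.inbox = []         # messages received from other blocks


def decompose(values, nblocks):
    """Split values into nblocks contiguous blocks whose sizes differ by at most one."""
    n = len(values)
    blocks = []
    for gid in range(nblocks):
        lo = gid * n // nblocks
        hi = (gid + 1) * n // nblocks
        blocks.append(Block(gid, values[lo:hi]))
    return blocks


def assign_round_robin(nblocks, nprocs):
    """Map block ids onto processing elements; several blocks may share one element."""
    owner = defaultdict(list)
    for gid in range(nblocks):
        owner[gid % nprocs].append(gid)
    return dict(owner)


def exchange_partial_sums(blocks):
    """One communication round: every block enqueues its local sum to block 0."""
    for block in blocks:
        blocks[0].inbox.append((block.gid, sum(block.data)))


def reduce_at_root(blocks):
    """Block 0 drains its queue and combines the partial results."""
    total = sum(partial for _, partial in blocks[0].inbox)
    blocks[0].inbox.clear()
    return total


if __name__ == "__main__":
    blocks = decompose(list(range(10)), nblocks=4)
    print(assign_round_robin(nblocks=4, nprocs=2))
    exchange_partial_sums(blocks)
    print(reduce_at_root(blocks))
```

Because the algorithm is written against blocks rather than processes, the same code works whether four blocks are spread over two processing elements or all held by one. Migrating blocks between memory and storage, which the text attributes to the DIY runtime, is outside the scope of this sketch.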



Contact name: Wes Bethel

Contact email:

SENSEI is an open-source, generic in situ interface that lets parallel simulations, or other data producers, code-couple to parallel third-party endpoints. Endpoints may be applications or tools and methods, including user-written code in C++ or Python and a growing class of data-intensive capabilities accessible through Python, such as AI/ML. Once a data producer is instrumented with the SENSEI interface, changing to a different endpoint is as simple as changing an XML configuration file with a text editor. SENSEI fully supports configurations where the simulation and endpoint run at differing levels of concurrency, and it manages the potentially tricky process of partitioning and moving data from the M producer ranks to the N consumer ranks. The movement of data may be bidirectional, which means that a simulation not only sends data to an endpoint but may also have access to results computed by the endpoint. While originally designed and implemented for analysis, visualization, and other data-intensive in situ tasks, SENSEI's design and implementation support coupling of arbitrary code. SENSEI is part of the DOE HPC Data-Vis SDK.
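To make the M-to-N idea concrete, the sketch below shows one simple way to assign M producer ranks to N consumer ranks in contiguous groups. It is an assumption made for illustration and is not SENSEI's partitioner or API; the function names `m_to_n_block_map` and `route` are hypothetical.

```python
def m_to_n_block_map(m, n):
    """Assign each of m producer ranks to one of n consumer ranks in contiguous groups."""
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    # Integer scaling keeps neighbouring producers on the same consumer.
    return {producer: producer * n // m for producer in range(m)}


def route(producer_data, n):
    """Group per-producer payloads by the consumer rank that should receive them."""
    mapping = m_to_n_block_map(len(producer_data), n)
    received = {consumer: [] for consumer in range(n)}
    for producer, payload in enumerate(producer_data):
        received[mapping[producer]].append(payload)
    return received


if __name__ == "__main__":
    # Eight producer ranks feeding three consumer ranks.
    print(m_to_n_block_map(8, 3))
    print(route(["p0", "p1", "p2", "p3"], 2))
```

When M is smaller than N, this mapping leaves some consumer ranks without data; a production partitioner would typically split producer data further to balance the load, which this sketch does not attempt.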