Data Understanding
VTK-m
Contact name: Ken Moreland
Contact email: morelandkd@ornl.gov
VTK-m is a toolkit of scientific visualization algorithms for emerging processor architectures. One of the biggest recent changes in high-performance computing is the increasing use of accelerators. Accelerators contain many small processing cores that are replicated and grouped to deliver a very high computation rate at much lower power. Current and future CPUs likewise require much more explicit parallelism. VTK-m provides the fine-grained concurrency that data analysis and visualization algorithms need at extreme scale by supplying abstract models for data and execution that can be applied to a variety of algorithms across many different processor architectures. In addition to serving as a repository for efficient implementations of visualization algorithms, VTK-m’s framework simplifies visualization development for multicore processors and automatically ports these algorithms across different processor types.
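As a rough illustration of that programming model, the sketch below builds a small uniform grid and runs a contour filter on it; the same filter code runs on CPUs or accelerators because VTK-m selects an available device backend at runtime. Header paths and the filter's namespace vary across VTK-m releases, and the field name "scalar", the grid size, and the isovalue are made up here, so treat the details as assumptions rather than a definitive example.

    // Minimal VTK-m usage sketch (header paths and the Contour filter's
    // namespace differ between releases; names used here are assumptions).
    #include <vtkm/cont/DataSet.h>
    #include <vtkm/cont/DataSetBuilderUniform.h>
    #include <vtkm/filter/contour/Contour.h>

    #include <vector>

    int main()
    {
      // Build a small uniform grid and attach a point-centered scalar field.
      const vtkm::Id3 dims(16, 16, 16);
      vtkm::cont::DataSet input = vtkm::cont::DataSetBuilderUniform::Create(dims);

      std::vector<vtkm::Float32> values(16 * 16 * 16);
      for (std::size_t i = 0; i < values.size(); ++i)
        values[i] = static_cast<vtkm::Float32>(i % 32) / 31.0f;  // placeholder data
      input.AddPointField("scalar", values);

      // The filter code is architecture-agnostic: VTK-m dispatches the same
      // algorithm to whichever device backend (serial, TBB, CUDA, ...) is available.
      vtkm::filter::contour::Contour contour;
      contour.SetActiveField("scalar");
      contour.SetIsoValue(0.5);
      vtkm::cont::DataSet isosurface = contour.Execute(input);

      (void)isosurface;  // hand off to rendering or I/O in a real program
      return 0;
    }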
DIY
Contact name: Tom Peterka
Contact email: tpeterka@mcs.anl.gov
DIY is an open-source package of scalable building blocks for data movement tailored to the needs of large-scale parallel analysis workloads. Scalable, parallel analysis of data-intensive computational science relies on decomposition into a large number of data-parallel subproblems, efficient data exchange among them, and data transport between them and the memory/storage hierarchy. The abstraction enabling these capabilities is block parallelism: blocks and their message queues are mapped onto processing elements (MPI processes or threads) and are migrated between memory and storage by the DIY runtime. DIY supports distributed- and shared-memory parallel algorithms that can run both in- and out-of-core with the same code, with one or more threads per MPI process, and with one or more data blocks resident in memory. Computational scientists, data analysis researchers, and visualization tool builders can all benefit from these tools.
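The sketch below illustrates the block-parallel pattern in the style of DIY's published examples: a user-defined block type, more blocks than MPI ranks, and a callback run over every block resident on a rank. The Block type, the compute callback, and the block count are illustrative assumptions, and exact signatures should be checked against the installed DIY version.

    // Block-parallel sketch following DIY's example code; names are assumptions.
    #include <diy/mpi.hpp>
    #include <diy/master.hpp>
    #include <diy/assigner.hpp>
    #include <diy/link.hpp>

    #include <cstdio>
    #include <vector>

    struct Block                        // the unit of data-parallel decomposition
    {
      static void* create()             { return new Block; }
      static void destroy(void* b)      { delete static_cast<Block*>(b); }
      double local_sum = 0.0;
    };

    // Work applied to each block resident in memory; the DIY runtime can move
    // blocks between memory and storage when running out-of-core.
    void compute(Block* b, const diy::Master::ProxyWithLink& cp)
    {
      b->local_sum += 1.0;
      std::printf("processed block gid %d\n", cp.gid());
    }

    int main(int argc, char** argv)
    {
      diy::mpi::environment  env(argc, argv);     // wraps MPI_Init/MPI_Finalize
      diy::mpi::communicator world;

      int nblocks = 4 * world.size();             // more blocks than MPI ranks
      diy::Master master(world, 1, -1, &Block::create, &Block::destroy);
      diy::ContiguousAssigner assigner(world.size(), nblocks);

      std::vector<int> gids;                      // global IDs of this rank's blocks
      assigner.local_gids(world.rank(), gids);
      for (int gid : gids)
        master.add(gid, new Block, new diy::Link);

      master.foreach(&compute);                   // run the callback over local blocks
      return 0;
    }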
SENSEI
Contact name: Wes Bethel
Contact email: EWBethel@lbl.gov
SENSEI is an open-source, generic in situ interface that enables parallel simulations, or other data producers, to code-couple to parallel third-party endpoints. These endpoints may be applications, tools, or user-written code in C++ or Python, including a growing class of data-intensive capabilities accessible through Python, such as AI/ML. Once a data producer is instrumented with the SENSEI interface, changing to a different endpoint is as simple as editing an XML configuration file with a text editor. SENSEI fully supports configurations where the simulation and endpoint run at different levels of concurrency, and it manages the potentially tricky process of partitioning and moving data from the M producer ranks to the N consumer ranks. Data movement may be bidirectional: a simulation not only sends data to an endpoint but may also have access to results computed by the endpoint. Although originally designed and implemented for analysis, visualization, and other data-intensive in situ tasks, SENSEI supports coupling of arbitrary code. SENSEI is part of the DOE HPC Data-Vis SDK.
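To illustrate the configuration-driven endpoint selection, a hypothetical SENSEI configuration file might look like the sketch below. The analysis types and attribute names shown, as well as the array name "pressure" and the script "analyze.py", are assumptions; the exact XML schema depends on the SENSEI version and the endpoints it was built with.

    <sensei>
      <!-- run an in situ histogram of a (hypothetical) "pressure" array -->
      <analysis type="histogram" array="pressure" association="cell"
                bins="32" enabled="1" />
      <!-- switching or adding endpoints is an edit to this file, e.g. a
           Python-based analysis; attribute names here are assumptions -->
      <analysis type="python" script_file="analyze.py" enabled="0" />
    </sensei>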
ParaView
Contact name: Berk Geveci
Contact email: berk.geveci@kitware.com
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. ParaView was developed to analyze extremely large datasets on distributed-memory computing resources. It runs on supercomputers to analyze petascale datasets as well as on laptops for smaller data. It has become an integral tool at many national laboratories, universities, and companies, and it has won several awards related to high-performance computing.