DIY is a block-parallel library for implementing scalable algorithms that can execute both in-core and out-of-core.
GraphBLAS is an open effort, including an API, to define standard building blocks for graph algorithms in the language of linear algebra.
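To make "graph algorithms in the language of linear algebra" concrete, the following is a minimal pure-Python sketch. It does not use the GraphBLAS API. It expresses breadth-first search as repeated vector-matrix products over a Boolean (OR, AND) semiring. The function name `bfs_levels` and the dense adjacency-matrix representation are assumptions made for illustration only.

```python
def bfs_levels(adj, source):
    """Return the BFS level of each vertex (-1 if unreachable).

    adj[i][j] is True when there is an edge i -> j (dense, illustrative only).
    """
    n = len(adj)
    levels = [-1] * n
    frontier = [i == source for i in range(n)]
    level = 0
    while any(frontier):
        # Record the level of every vertex in the current frontier.
        for v in range(n):
            if frontier[v]:
                levels[v] = level
        # Vector-matrix product over the (OR, AND) semiring:
        # next[j] = OR_i (frontier[i] AND adj[i][j]), masked to unvisited vertices.
        frontier = [
            levels[j] == -1 and any(frontier[i] and adj[i][j] for i in range(n))
            for j in range(n)
        ]
        level += 1
    return levels


# Hypothetical 5-vertex graph with edges 0->1, 1->2, 0->3; vertex 4 is isolated.
edges = {(0, 1), (1, 2), (0, 3)}
adj = [[(i, j) in edges for j in range(5)] for i in range(5)]
example_levels = bfs_levels(adj, 0)
```

A real GraphBLAS implementation would use sparse matrices and let the library choose the semiring and mask; the dense lists here only show the algebraic structure.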
ParaView is an open-source, multi-platform application for interactive scientific visualization. Catalyst, its in situ library, coordinates simulation, analysis, and visualization tasks.
Tess is a parallel Delaunay and Voronoi tessellation library. It includes support for density estimation.
VisIt is an open-source, interactive, scalable visualization, animation, and analysis tool. libsim enables its use in situ with simulations.
VTK-m is a toolkit of scientific visualization algorithms for emerging processor architectures. It supports the fine-grained concurrency for data analysis and visualization algorithms required to drive extreme scale computing.
CrossVis is an open-source multivariate visual analytics system that provides statistical analytics, multi-scale summarizations, and correlation displays to guide users to interesting features in complex scientific data.
EDDA is distribution-based data analysis and visualization software for in situ analytics. Based on Gaussian mixture models, probability distributions, and information theory, EDDA provides both C++ and Python APIs that can help scientists preserve salient information from their simulation output while delivering high-quality visualization and achieving significant data reduction.
Roofline is a visually intuitive performance model and set of tools developed to understand how computation, data movement, and locality constrain performance on modern multicore, manycore, and GPU-accelerated systems.
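The core bound of the Roofline model can be written in a few lines: attainable performance is the smaller of the machine's peak compute rate and the product of memory bandwidth and arithmetic intensity (FLOPs per byte moved). The sketch below is not part of the Roofline tools; the function names and the machine numbers are hypothetical and chosen only to show the arithmetic.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, arithmetic_intensity):
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity).

    arithmetic_intensity is in FLOPs per byte, bandwidth in GB/s.
    """
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
low_ai_bound = attainable_gflops(1000.0, 100.0, 2.0)    # bandwidth-bound kernel
high_ai_bound = attainable_gflops(1000.0, 100.0, 50.0)  # compute-bound kernel
ridge = ridge_point(1000.0, 100.0)
```

Kernels whose arithmetic intensity falls below the ridge point are limited by data movement; those above it are limited by compute.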
TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, Python, and others.
SOSflow (Scalable Observation System for Scientific Workflows) provides a flexible, scalable, and programmable framework for observation, introspection, feedback, and control of HPC applications.
APEX (Autonomic Performance Environment for eXascale) is a profiling and tracing library for asynchronous user-level threading systems without conventional call stacks or callgraphs. APEX can capture performance data and adapt runtime behavior for applications written in HPX, OpenMP, OpenACC, Kokkos, CUDA, and more.
Papyrus is a programming system that provides scalable, aggregate, persistent memory in an extreme-scale system for typical HPC usage scenarios. Papyrus provides a portable and scalable programming interface to access and manage parallel data structures on distributed NVM storage.
DRAGON enables all classes of GPGPU applications to transparently compute on terabyte datasets residing in NVM while ensuring the integrity of data buffers as necessary for NVM. DRAGON leverages the page-faulting mechanism on recent NVIDIA GPUs by extending the capabilities of CUDA Unified Memory (UM). Further, DRAGON improves overall performance by dynamically optimizing accesses to NVM.
OpenARC is the first open-source OpenACC/OpenMP compiler supporting Altera FPGAs, in addition to NVIDIA/AMD GPUs and Intel Xeon Phis. OpenARC has various additional directives and environment variables for internal tracing and architecture-specific optimizations. Combined with its built-in tuning tools, OpenARC allows users to control the overall OpenACC-to-accelerator translation and optimization in a fine-grained but still abstract manner, offering very high tunability.
Clacc is an OpenACC-to-OpenMP4 translation framework, which builds on clang’s existing OpenMP compiler/runtime support and allows OpenACC programs to be compiled by the production-quality clang/LLVM programming system. OpenACC support in clang/LLVM will facilitate the programming of GPUs and other accelerators in DOE applications, and it will provide a popular compiler platform on which to perform research and development for related optimizations and tools (e.g., static analyzers, debuggers, editor extensions).
Scientific Data Management
ADIOS provides a simple, flexible way for scientists to describe the data in their code that may need to be written, read, or processed outside of the running simulation. An XML file external to the code describes the various elements, their types, and how they should be processed in a given run, so the routines in the host code (either Fortran or C) can transparently change how they process the data.
DataSpaces is a middleware library and runtime providing asynchronous coupling of codes using RDMA for memory-memory data transfer.
Darshan is a toolkit for characterizing the I/O behavior of applications, used in production at many DOE compute facilities.
FastBit is an open-source data processing library in the spirit of the NoSQL movement. It offers a set of search functions supported by compressed bitmap indexes. It treats user data in a column-oriented manner and can accelerate users' data selection tasks without imposing undue requirements.
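The idea behind a bitmap index can be sketched in plain Python. This is an illustration of the general technique, not FastBit's API, and it omits the compression that FastBit applies to its bitmaps. Each distinct value in a column maps to an integer whose bit r is set when row r holds that value, so a multi-column selection becomes a bitwise AND or OR. The helper names `build_bitmap_index` and `rows_of` and the sample columns are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit r is set when row r holds it."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """Decode a bitmap into an ascending list of row numbers."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical column-oriented data: two columns of four rows each.
color = ["red", "blue", "red", "green"]
size = ["S", "S", "L", "L"]
color_idx = build_bitmap_index(color)
size_idx = build_bitmap_index(size)

# color == "red" AND size == "L" is a single bitwise AND of two bitmaps.
red_and_large = rows_of(color_idx["red"] & size_idx["L"])
# color == "red" OR color == "green" is a bitwise OR.
red_or_green = rows_of(color_idx["red"] | color_idx["green"])
```

Because each predicate is answered from one bitmap per value, a query never scans the rows of columns it does not mention, which is what makes the column-oriented layout a natural fit.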
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types and is designed for flexible, efficient I/O and for high-volume, complex data. HDF5 is portable and extensible, allowing applications to evolve in their use of HDF5. The HDF5 technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.
Mochi is an open ecosystem enabling the development of a variety of distributed services supporting the data-related needs of DOE scientists.
PnetCDF is a high performance, parallel I/O library for storing and accessing data in the NetCDF format.
ROMIO is a portable implementation of the I/O portion of the MPI standard, included in most vendor MPI implementations.
Artificial Intelligence / Machine Learning
DeepHyper is a scalable, open-source software package for automated machine/deep learning. It comprises two components: Neural Architecture Search (NAS), for fully automated search for high-performing deep neural network architectures, and Hyperparameter Search (HPS), for optimizing hyperparameters for a given reference model.
AutoMOMML is an end-to-end, machine-learning-based framework for building predictive models for objectives such as performance and power. The framework adopts statistical approaches to reduce modeling complexity, and it automatically identifies and configures the most suitable learning algorithm to model the required objectives based on hardware and application signatures.