Data Understanding

  • DIY is a block-parallel library for implementing scalable algorithms that can execute both in-core and out-of-core.
  • GraphBLAS is an open effort, including an API, to define standard building blocks for graph algorithms in the language of linear algebra.
  • ParaView is an open-source, multi-platform application for interactive scientific visualization. Catalyst, its in situ library, coordinates simulation, analysis, and visualization tasks.
  • Tess is a parallel Delaunay and Voronoi tessellation library. It includes support for density estimation.
  • VisIt is an open-source, interactive, scalable visualization, animation, and analysis tool. Libsim enables its use in situ with simulations.
  • VTK-m is a toolkit of scientific visualization algorithms for emerging processor architectures. It supports the fine-grained concurrency that data analysis and visualization algorithms need to drive extreme-scale computing.
  • EDEN is an open-source multivariate visual analytics system that provides statistical analytics, multi-scale summarizations, and correlation displays to guide users to interesting features in complex scientific data.
  • AutoMOMML is an end-to-end, machine-learning-based framework for building predictive models of objectives such as performance and power. The framework adopts statistical approaches to reduce modeling complexity, and it automatically identifies and configures the most suitable learning algorithm for the required objectives based on hardware and application signatures.
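
To illustrate what the GraphBLAS entry above means by expressing graph algorithms in the language of linear algebra, here is a minimal sketch of breadth-first search written as repeated matrix-vector products. It uses plain NumPy rather than the GraphBLAS API; the function name `bfs_levels` and the example graph are assumptions made only for this illustration.

```python
import numpy as np


def bfs_levels(adjacency, source):
    """Return the BFS level of every vertex, or -1 if it is unreachable.

    adjacency[i, j] is True when there is an edge i -> j.
    """
    n = adjacency.shape[0]
    levels = np.full(n, -1, dtype=int)
    frontier = np.zeros(n, dtype=bool)
    frontier[source] = True
    level = 0
    while frontier.any():
        levels[frontier] = level
        # One traversal step is a matrix-vector product: entry j counts the
        # frontier vertices that have an edge into j.
        reached = (adjacency.T.astype(int) @ frontier.astype(int)) > 0
        # Mask out vertices that already have a level assigned.
        frontier = reached & (levels == -1)
        level += 1
    return levels


# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; vertex 4 is isolated.
A = np.zeros((5, 5), dtype=bool)
for src, dst in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    A[src, dst] = True

levels = bfs_levels(A, 0)
print(levels.tolist())
```

A GraphBLAS implementation would typically perform the same step with a sparse matrix, a Boolean semiring, and a mask, which is what makes the formulation portable across hardware.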

Platform Readiness

  • Roofline is a visually intuitive performance model and set of tools developed to understand how computation, data movement, and locality constrain performance on modern multicore, manycore, and GPU-accelerated systems.
  • TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, Python, and others. SOSflow (Scalable Observation System for Scientific Workflows) provides a flexible, scalable, and programmable framework for observation, introspection, feedback, and control of HPC applications.
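
The basic bound behind the Roofline model listed above can be sketched in a few lines: attainable performance is the smaller of peak compute throughput and the product of arithmetic intensity and memory bandwidth. The function names and the machine numbers below are hypothetical and chosen only for illustration; they do not describe any real system or the Roofline tools' own interface.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_gflops:          peak compute throughput (GFLOP/s)
    bandwidth_gbs:        peak memory bandwidth (GB/s)
    """
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK, BW = 1000.0, 100.0

for name, ai in [("low-intensity kernel", 0.25), ("high-intensity kernel", 50.0)]:
    bound = attainable_gflops(ai, PEAK, BW)
    regime = "memory-bound" if ai < ridge_point(PEAK, BW) else "compute-bound"
    print(f"{name}: AI={ai} FLOP/byte, bound={bound} GFLOP/s ({regime})")
```

Kernels to the left of the ridge point are limited by data movement, so improving locality raises their bound; kernels to the right are limited by compute throughput.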

Scientific Data Management

  • DataSpaces
  • Darshan is a toolkit for characterizing the I/O behavior of applications, used in production at many DOE compute facilities.
  • FastBit
  • HDF5
  • PnetCDF is a high-performance parallel I/O library for storing and accessing data in the NetCDF format.
  • ROMIO is a portable implementation of the I/O portion of the MPI standard, included in most vendor MPI implementations.