RAPIDS

SciDAC Institute for Computer Science and Data

Shen, Han-Wei

About RAPIDS

The Department of Energy (DOE) has led the world in enabling large-scale high-performance computing as an indispensable tool for scientific discovery in a broad range of disciplines. Our newest generation of systems is not simply larger than past systems; it brings distinct and novel challenges. These challenges arise from ever-increasing core counts on nodes; the deepening of the memory hierarchy and inclusion of fast, nonvolatile storage within the HPC system; the widening ratio of FLOPS to I/O bandwidth; and the increasing ubiquity of computation accelerators. The data produced by simulations on these high-end systems has also increased by orders of magnitude in size and complexity. This continuing explosion of data mandates constant attention to, and improvement of, the tools and techniques used to manage and analyze it. Moreover, the breadth of science performed on DOE advanced computing resources is growing, and new motifs of investigation are emerging, including those involving analysis of experimental and observational data.

The objective of RAPIDS is to assist Office of Science (SC) application teams in overcoming computer science and data challenges in the use of DOE supercomputing resources to achieve science breakthroughs. To accomplish this objective, the Institute will solve computer science and data technical challenges for SciDAC and SC science teams, work directly with SC scientists and DOE facilities to adopt and support our technologies, and coordinate with other DOE computer science and applied mathematics activities to maximize impact on SC science.

RAPIDS will be organized around four primary thrust areas. The cornerstone of the project is the application engagement and community outreach thrust area. A rigorous plan for engagement through direct interactions, as well as outreach to connect with the broader community, will allow us to have the greatest and most immediate impact. Software technologies for computation, information, and data science are central to the success of SciDAC and SC science on leadership computing platforms. Our three technology thrust areas (data understanding, platform readiness, and scientific data management) focus on key challenges in these areas, bringing mature technologies to bear and enhancing them to adapt to new science requirements and new platforms. Technology thrust area participants will engage directly with application teams to deploy these technologies and to glean insights on how to further improve them.