Object Storage Systems and Their Effects on Future Workflows

10/21/2020 - 1:00pm to 2:00pm

Recording from October 21, 2020


Bill Anderson, Bob Dattore, Riley Conroy
CISL/National Center for Atmospheric Research



Filesystems have been the predominant interface for storing and accessing data for decades and have been used in CISL's HPC environment from the early systems to GLADE. With an interface defined as part of a family of standards called POSIX (Portable Operating System Interface), filesystems are hierarchical containers of directories and files, with limited metadata associated with each file. While filesystems have served CISL and many other communities well, they have some limitations.

Due to consistency and locking requirements, filesystems are limited in how far they can scale. Also, POSIX generally allows only minimal metadata to be associated with a file, unless the metadata is embedded in the file data itself. Finally, driven by private and public clouds like AWS, Azure, and Google, object storage systems are becoming more common for some use cases.

Object storage systems represent a different way of storing data that mitigates the limitations of POSIX filesystems. However, object storage systems have their own limitations, so there are good use cases for both types of systems. In this seminar, we'll provide a detailed description of object storage systems and discuss good use cases for both types of storage systems. Finally, we'll share details and results from an early use of CISL's object storage system to serve data to the community.



Bill Anderson is a system engineer in the High Performance Computing Division in CISL. He has over 20 years of experience in high performance computing and has an interest in harnessing novel storage and compute technologies to meet the needs of NCAR's scientific community. He holds a bachelor's degree in physics and a master's degree in computer science.

Bob Dattore is a Software Engineer in CISL's Information Systems Division. He has worked in the Data Engineering & Curation Section for 29 years. In his early days, he worked on data quality and preparing observational datasets for prominent global reanalysis projects. More recently, he has focused on dataset and file metadata, and leveraging that metadata to improve data curation and drive data access services.

Riley Conroy is a Software Engineer in the Information Systems Division within CISL and has been testing the capabilities of CISL's object storage system in a production environment. Riley has worked with the Data Engineering and Curation Section for two years and previously worked at NOAA on creating satellite data products and statistical post-processing of model output.

October 21, 2020
1-2pm MT

