Innovations in Open Science (IOS) Planning Workshop: Community Expectations for a Geoscience Data Commons
Community expectations for a Geoscience Research Data Commons to support open science-based discovery
1:30 pm – 11:00 am MDT
This workshop will focus on developing community requirements to modernize community-accessible data science infrastructure, to better connect our geoscience datasets with geoscience-focused analytics environments, and to support researcher needs in meeting data sharing expectations. The purpose of a “geoscience data commons” resource would be to allow a broad and diverse user community to share, integrate, analyze, and visualize geoscience research data to drive scientific discovery.
Scope of Workshop
This two-and-a-half-day workshop will bring together a diverse group of 60 individuals and subject-matter experts from multiple stakeholder groups to identify and outline requirements for a Geoscience Research Data Commons environment. Stakeholder groups include:
- Geoscience researchers and graduate students (weighted towards the atmospheric, hydrologic, geospace, and oceanic sciences) from universities, including Minority Serving Institutions (MSIs) and Historically Black Colleges and Universities (HBCUs).
- Technology experts, including those with an interest in Machine Learning/Artificial Intelligence (ML/AI) capabilities.
- Open source community representatives, such as PANGEO, and leading data repositories.
Workshop participants will be led through breakout sessions to develop requirements and recommendations for key topics that are central to the creation, operation, and sustainability of a Geoscience Research Data Commons. Potential topics for discussion include:
- Technical capabilities – Domain-specific needs for repository and analytics systems and services supported by dedicated hardware, software, AI solutions, and staff.
- Domain-specific needs for data curation and technical consulting services.
- Governance framework and stakeholder interactions – Policies and procedures for development, operations, long-term sustainability, and how to interact with targeted stakeholders. For example: who would be eligible to deposit data in such a resource, and how does that process work?
The workshop will produce a report, drawing on the breakout sessions, that outlines requirements for a Geoscience Research Data Commons. The report will include recommendations in the following three areas:
- Gaps that a Research Data Commons environment could fill to support community needs for data curation, data analytics, and software-sharing capabilities.
- End-user expectations for a Research Data Commons environment.
- Guidance for governance and stakeholder interactions.
The final report will be published as an NSF NCAR technical report and made available via the workshop project website.
- Doug Schuster, NSF NCAR CISL/ISD, Data Management/Open Science
- John Clyne, NSF NCAR CISL/TDD, Data Analytics/Open Science
- Matt Mayernik, NSF NCAR Library, Open Science
- Taysia Peterson, NSF NCAR CISL
- Teagan King (AS-II, NSF NCAR CGD, NSF NCAR Earth System Data Science Initiative, climate model outputs)
- Teagan King is an Associate Scientist II at NCAR. As a Climate & Global Dynamics Representative on the Earth System Data Science Leadership Committee, she leads efforts to enable open science across NCAR labs. She also has expertise in publishing datasets to the Climate Data Gateway and in supporting the data user community, and she is heavily involved in the transition of CESM and CGD datasets to the Research Data Archive.
- Shanice Bailey (PhD Candidate, Columbia University, Department of Earth and Environmental Sciences, Ocean Transport Group)
- Shanice Bailey is a PhD candidate at Columbia University. Her dissertation work applies water mass transformation theory to study the physical oceanography of different water masses using ocean and climate model data. She has also been involved in numerous teaching, mentoring, and outreach projects, such as the HGS-NSBP mentoring program, the PyClub program, and co-instructing data science workshops that rely on Software Carpentry curricula.
- Douglas Rao (Research Scientist, North Carolina Institute for Climate Studies, ML/AI background)
- Douglas Rao is a research scientist with the North Carolina Institute for Climate Studies and is affiliated with the NOAA National Centers for Environmental Information. His research focuses on leveraging innovative technologies to enhance the value of climate data for impact studies on ecosystems and environmental health. He currently serves as the Vice President for Earth Science Information Partners and leads a cluster developing community standards for AI-ready open data.
- Jacquie Witte (Project Scientist II, NSF NCAR EOL, Field Campaign Observations)
- Jacquie Witte is a scientist and data manager for EOL’s In-Situ Sensing Facility. She represents the EOL Field Data Archive, which contains atmospheric, meteorological, and other geophysical datasets from operational sources and from the scientific research programs and projects for which NCAR/EOL provides data management support.
- Michael Bell (Professor, Colorado State University Department of Atmospheric Science, CIF facility provider, Mesoscale model outputs)
- Professor Michael Bell has expertise in tropical weather and climate, field observations, and remote sensing. He is the principal investigator for the CSU SEA-POL Sea-Going and Land-Deployable Polarimetric radar, which is an NSF Community Facility. He is also one of the leads of the Lidar Radar Open Software Environment (LROSE) project in collaboration with NCAR EOL.
- Scott Collis (Dept Head and Director of the Argonne Testbed for Multiscale Observational Science, Argonne National Laboratory, Radar meteorology, Open Source science, instrumentation, Edge computing, Science communications)
- Scott is the head of the Geospatial Computing, Innovations, and Sensing (GCIS) department in the Environmental Science Division at Argonne National Laboratory and a Senior Fellow at the Northwestern Argonne Institute of Science and Engineering (NAISE). Scott is the Translator for precipitation radars and a workforce development coordinator for the Atmospheric Radiation Measurement (ARM) Facility, a multi-laboratory U.S. Department of Energy (DOE) scientific user facility and a key contributor to national and international climate research efforts. He is also the Director of the Argonne Testbed for Multiscale Observational Science (ATMOS), a field site on the Argonne campus where new technologies, methods, and approaches are developed for understanding our planet from bedrock to stratosphere. Scott is the Science Lead for the Python-ARM Radar Toolkit, an architecture for interacting with radar data in the Python programming language.
- Angel Alos (Manager, Copernicus Climate and Atmospheric Data Store, ECMWF, CI background)
- Dr. Angel Lopez Alos leads the team in charge of the operational implementation of the Copernicus Climate and Atmosphere Data Store (CADS) infrastructure at ECMWF. Before joining ECMWF, he was a member of the Data Specifications Team for the EU INSPIRE Directive at the EU Joint Research Centre (JRC). He also currently co-chairs the OGC Climate Resilience DWG.
- Francis Tuluri (Professor, Jackson State University, Data Science/Engineering and MSI perspective)
- Francis Tuluri has long-standing research and teaching experience in a wide range of science and technology areas, such as air quality analytics and prediction using machine learning modeling; the impact of air quality and environmental factors on health; GIS and air quality analytics; cyber-based physical systems; the Internet of Things (IoT) and machine learning; and AI and autonomous robotics. He has published about 50 papers in peer-reviewed national and international journals. He has also served as a faculty mentor and investigator in summer outreach programs, developing mini projects for undergraduate and K-12 students.