2018 Climate Informatics Workshop

Sep. 19 to Sep. 21, 2018

10:30 am – 3:45 pm MDT

NCAR Mesa Lab

To view the Proceedings from the 8th International Workshop on Climate Informatics, please click HERE.

About Climate Informatics

We have greatly increased the volume and diversity of climate data from satellites, environmental sensors and climate models in order to improve our understanding of the climate system. However, this very increase in volume and diversity can make traditional analysis tools impractical and necessitate knowledge discovery from data. Machine learning has made significant impacts in fields ranging from web search to bioinformatics, and the impact of machine learning on climate science could be as profound. However, because the goal of machine learning in climate science is to improve our understanding of the climate system, it is necessary to employ techniques that go beyond simply taking advantage of co-occurrence and instead enable increased understanding.

The Climate Informatics workshop series seeks to build collaborative relationships between researchers from statistics, machine learning and data mining and researchers in climate science.  Because climate models and observed datasets are increasing in complexity and volume, and because the nature of our changing climate is an urgent area of discovery, there are many opportunities for such partnerships.

Climate informatics broadly refers to any research combining climate science with approaches from statistics, machine learning and data mining. The Climate Informatics workshop series, now in its eighth year, seeks to bring together researchers from all of these areas. We aim to stimulate the discussion of new ideas, foster new collaborations, grow the climate informatics community, and thus accelerate discovery across disciplinary boundaries. The format of the workshop seeks to overcome cross-disciplinary language barriers and to emphasize communication between participants by featuring tutorials, invited talks, panel discussions, posters and break-out sessions. We invite all researchers interested in learning about critical issues and opportunities in the field of climate informatics to join us, whether established in the field or just starting out.

The conference logo image is courtesy of Michael Tippett. Colors show deviations of sea-surface temperatures from their climatological values in the equatorial Pacific from January 1997 to April 2000 with time going counter-clockwise.

Confirmed invited speakers:

Gustau Camps-Valls, Universitat de Valencia 

Unsupervised Deep Feature Learning with Sparse Codes and Gaussianization

In this talk I'll review our latest work on unsupervised feature extraction with hierarchical (deep) representations. Two different pathways will be taken: a physiologically /meaningful/ representation with deep convolutional neural networks where both population and lifetime sparsity are imposed, and a /meaningless/ (projection-pursuit-like) cascade representation where the goal is to transform the data into a multivariate Gaussian. I will discuss their advantages and shortcomings, and illustrate their performance on synthetic and real problems of image segmentation, data classification, synthesis, and information estimation.
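As a rough illustration of the Gaussianization idea (a sketch under stated assumptions, not the speaker's implementation), one layer of such a cascade can be written as a rank-based Gaussianization of each coordinate followed by a rotation, in the spirit of rotation-based iterative Gaussianization. The function names, the 2-D toy data and the fixed rotation angle are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def marginal_gaussianize(values):
    """Map a 1-D sample to standard-normal scores through its empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        out[idx] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


def rotate(xs, ys, angle):
    """Rotate 2-D points by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * x - s * y for x, y in zip(xs, ys)],
            [s * x + c * y for x, y in zip(xs, ys)])


def gaussianization_layer(xs, ys, angle):
    """One cascade layer: Gaussianize each marginal, then rotate."""
    return rotate(marginal_gaussianize(xs), marginal_gaussianize(ys), angle)


# Toy data (hypothetical): a skewed, dependent 2-D sample.
rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(500)]
ys = [x * x + rng.gauss(0.0, 0.5) for x in xs]

# Stack a few layers; the angle is fixed here only to keep the sketch short.
for layer in range(5):
    xs, ys = gaussianization_layer(xs, ys, angle=math.pi / 4)
```

Each layer is invertible in principle, since the marginal step is monotone and the rotation is orthogonal, which is what makes this kind of cascade usable for synthesis and information estimation.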

Julien Emile-Geay, University of Southern California

Paleoclimate informatics: enabling knowledge discovery about past climates

Climate exhibits scaling behavior, which means that fluctuations increase in amplitude with the timescale. Some of the most scientifically and societally relevant fluctuations occurred before the short instrumental record beginning ca. 1850, making their study impossible with such observations. Paleoclimate (pre-instrumental) observations are thus a critical window into this behavior, but pose a unique set of challenges to the analyst: they are indirect, typically sparse and noisy, and come from incredibly diverse archives, impeding a one-size-fits-all approach.

Until recently, integrating all these data sources into a coherent picture was impossible. In this talk, I will review recent progress in paleoclimate informatics, including paleo data semantics, data assimilation and graph theory — all the product of joint work with Y. Gil, L. Bradley, N. McKay, D. Guillot and the Last Millennium Reanalysis collective. I will argue that these advances now enable thoughtfully applied machine learning algorithms to play a useful role in knowledge discovery, with the potential to advance the study of past climates.  

Lucas Joppa, Microsoft, AI for Earth

AI for Earth

The speed and scale at which climate systems are changing, and the enormity of the human impact of those changes, require a commensurate response in how society monitors, models, and manages climate systems. A key component of that response will emerge from the fundamentals of AI – transforming how we collect data, convert those data into actionable information, and communicate that information across the world. By training increasingly sophisticated algorithms with this unprecedented collection of data on dedicated computational infrastructure, we can combine human and computer intelligence in a way that will allow us to make increasingly informed and optimal choices about today – and tomorrow.

Eric Maloney, Colorado State University

Critical challenges in the simulation of tropical clouds and climate

Realistic simulations of current climate and projections of future climate change require that the treatment of clouds and moist convection in models be realistic. However, many climate models exhibit substantial biases in various aspects of the climate system that depend on realistic cloud parameterizations, which affects confidence in their ability to simulate future climate changes. Indeed, cloud feedbacks are one of the greatest sources of uncertainty in projections of future climate. This talk will highlight some tropical biases that are related to treatment of cloud parameterizations including in the simulation of the Madden-Julian oscillation, low clouds in colder tropical regions, and monsoon systems. The rectification of these biases onto aspects of climate such as the global mean energy budget will be discussed, especially as they affect future climate projections. Community efforts to diagnose these biases by entraining observations into process-oriented diagnostic frameworks will be highlighted, including a recent effort by the NOAA Model Diagnostics Task Force. New modeling approaches to parameterization of clouds in climate models will be discussed that provide great promise to mitigate tropical biases, and the talk will close by discussing some initial attempts to use machine learning to parameterize cloud processes in models.

Christopher Wikle, University of Missouri

Using parsimonious “deep” models for efficient implementation of multiscale spatio-temporal statistical models applied to long-lead forecasting

Spatio-temporal data are ubiquitous in engineering and the sciences, and their study is important for understanding and predicting a wide variety of processes. One of the chief difficulties in modeling spatial processes that change with time is the complexity of the dependence structures that must describe how such a process varies, and the presence of high-dimensional complex datasets and large prediction domains. It is particularly challenging to specify parameterizations for nonlinear dynamical spatio-temporal models that are simultaneously useful scientifically and efficient computationally. One potential parsimonious solution to this problem is a method from the dynamical systems and engineering literature referred to as an echo state network (ESN). ESN models use so-called reservoir computing to efficiently compute recurrent neural network (RNN) forecasts. Moreover, so-called “deep” models have recently been shown to be successful at predicting high-dimensional complex nonlinear processes, particularly those with multiple spatial and temporal scales of variability (such as we often find in spatio-temporal geophysical data). Here we introduce a deep ensemble ESN (D-EESN) model in a hierarchical Bayesian framework that naturally accommodates non-Gaussian data types and multiple levels of uncertainties. The methodology is first applied to a data set simulated from a novel non-Gaussian multiscale Lorenz-96 dynamical system simulation model and then to a long-lead United States (U.S.) soil moisture forecasting application.
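To make the reservoir-computing idea concrete, here is a minimal sketch of a plain, single-layer echo state network in pure Python. It is not the deep ensemble, hierarchical Bayesian D-EESN model described in the abstract: the recurrent weights are drawn at random and left fixed, and only a linear readout is fitted by ridge regression. The sine-wave task, the reservoir size, the scaling and all function names are assumptions chosen for illustration.

```python
import math
import random


def make_reservoir(n_units, scale, rng):
    """Fixed random weights; each recurrent row has absolute sum `scale` < 1,
    which bounds the spectral radius and keeps the dynamics stable."""
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    w = []
    for _ in range(n_units):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
        total = sum(abs(v) for v in row)
        w.append([scale * v / total for v in row])
    return w_in, w


def run_reservoir(inputs, w_in, w):
    """Drive the untrained reservoir with a scalar series; collect its states."""
    n = len(w_in)
    state, states = [0.0] * n, []
    for u in inputs:
        state = [math.tanh(w_in[i] * u + sum(w[i][j] * state[j] for j in range(n)))
                 for i in range(n)]
        states.append(state)
    return states


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_readout(states, targets, ridge):
    """Ridge regression for the linear readout, the only trained part."""
    n = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(n)]
    return solve(gram, rhs)


def predict(states, w_out):
    """Apply the trained readout to reservoir states."""
    return [sum(wo * si for wo, si in zip(w_out, s)) for s in states]


# Toy task (hypothetical): one-step-ahead forecast of a sine wave.
rng = random.Random(0)
series = [math.sin(0.2 * t) for t in range(401)]
inputs, targets = series[:-1], series[1:]

w_in, w = make_reservoir(n_units=20, scale=0.9, rng=rng)
states = run_reservoir(inputs, w_in, w)

washout = 50  # drop the initial transient before fitting
w_out = fit_readout(states[washout:], targets[washout:], ridge=1e-4)
forecast = predict(states[washout:], w_out)
mse = sum((f - y) ** 2 for f, y in zip(forecast, targets[washout:])) / len(forecast)
print(f"training MSE of the one-step forecast: {mse:.2e}")
```

Because training reduces to a single linear solve, many reservoirs can be drawn and fitted cheaply. That low cost is what makes ensembles of ESNs practical.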

Qi (Rose) Yu, Northeastern University

Deep Learning for Large-Scale Spatiotemporal Data

In many real-world applications, such as climate science, transportation and physics, machine learning is applied to large-scale spatiotemporal data. Such data are often nonlinear and high-dimensional, and exhibit complex spatial and temporal correlations. Deep learning provides a powerful framework for feature extraction, but existing deep learning models are still insufficient to handle the challenges posed by spatiotemporal data.

In this talk, I will show how to design deep learning models to learn from large-scale spatiotemporal data. In particular, I will present our recent results on 1) High-Order Tensor RNNs for modelling nonlinear dynamics, and 2) Diffusion Convolutional RNNs for modelling spatiotemporal patterns, applied to real-world climate and traffic data. I will also discuss the opportunities and challenges of applying deep learning to large-scale spatiotemporal data.

Sponsored by


National Science Foundation

With additional sponsorship provided by:


Jupiter


The CI2018 Hackathon will be held on September 19th, prior to the start of the workshop. We encourage all workshop participants to attend the Hackathon. The format will be similar to the CI2017 Hackathon.

The Hackathon will be run using the RAMP tool, developed in Python by the Paris-Saclay Center for Data Science.

The goal of the CI2018 Hackathon is to predict hurricane evolution (24h forecast) using data collected from all past hurricanes (since 1979). The challenge is to design the best algorithm for predicting the 24h-forecast intensity of a large number of storms. The (real) database comprises more than 3000 extra-tropical and tropical storm tracks, and provides the intensity and some local physical information at each time step. We also provide 700-hPa and 1000-hPa feature maps of the neighborhood of each storm (from the ERA-Interim reanalysis database), which can be viewed as images centered on the current storm location.
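For orientation, a simple reference point for this kind of task is a persistence forecast, which assumes that a storm keeps its current intensity over the next 24 hours. The sketch below is not part of the starting kit and does not use the RAMP API. The toy tracks, the 6-hourly time step, the units and the function names are assumptions made for illustration only.

```python
def make_pairs(track, lead_steps):
    """Pair each time step's intensity with the intensity `lead_steps` later."""
    return [(track[t], track[t + lead_steps])
            for t in range(len(track) - lead_steps)]


def persistence_forecast(current_intensity):
    """Baseline: assume the storm keeps its current intensity."""
    return current_intensity


def mean_absolute_error(pairs, forecast):
    """Average absolute difference between forecast and observed intensity."""
    errors = [abs(forecast(now) - later) for now, later in pairs]
    return sum(errors) / len(errors)


# Hypothetical storms: maximum wind speed in knots at 6-hourly time steps.
storms = {
    "storm_a": [30, 35, 40, 50, 65, 80, 90, 85, 70, 50],
    "storm_b": [25, 25, 30, 35, 35, 40, 45, 40, 30],
}
LEAD_STEPS = 4  # 24 h ahead, assuming 6-hourly time steps

pairs = [p for track in storms.values() for p in make_pairs(track, LEAD_STEPS)]
mae = mean_absolute_error(pairs, persistence_forecast)
print(f"{len(pairs)} forecast pairs, persistence MAE = {mae:.1f} kt")
```

Any learned model can be passed to `mean_absolute_error` in place of `persistence_forecast`, which gives a quick check of whether it beats the baseline.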

Please see the Agenda for logistics, and contact Sophie Giffard-Roisin with any questions you have. The starting kit and the GitHub repository will be available soon for download!

Organizing Committee


Dan Cooley, Colorado State University 

Eniko Szekely, Swiss Data Science Center, EPFL/ETH

PC co-chairs:

Chen Chen, University of Chicago

Jakob Runge, German Aerospace Center

Communications Co-Chairs:

Soukayna Mouatadid, University of Toronto

Matt Staib, Massachusetts Institute of Technology 

Travel and Budget Chair:

Andrew Finley, Michigan State University

Hackathon Co-Chairs:

Sophie Giffard-Roisin, University of Colorado Boulder

David John Gagne II, National Center for Atmospheric Research

NCAR science contact:

Dorit Hammerling, National Center for Atmospheric Research

NCAR staff support:

Cecilia Banner, National Center for Atmospheric Research

Elizabeth Faircloth, National Center for Atmospheric Research

Steering committee:

Claire Monteleoni, University of Colorado Boulder

Imme Ebert-Uphoff, Colorado State University 

Doug Nychka, Colorado School of Mines

Program Committee members:

Conrad Albrecht, IBM

Niklas Boers, Potsdam Institute for Climate Impact Research

Won Chang, University of Cincinnati

Chen Chen, University of Chicago

Joachim Denzler, University Jena

Wei Ding, University of Massachusetts Boston

Andre Richard Erler, University of Toronto

Seth Flaxman, Imperial College London

David John Gagne II, National Center for Atmospheric Research

Bedartha Goswami, Potsdam Institute for Climate Impact Research

Jaroslav Hlinka, Institute of Computer Science, Academy of Sciences of the Czech Republic

Karthik Kashinath, Lawrence Berkeley National Laboratory

Marlene Kretschmer, Potsdam Institute for Climate Impact Research

Mikael Kuusela, University of North Carolina at Chapel Hill

Peter Jan Van Leeuwen, University of Reading and NCEO

Jascha Lehmann, Potsdam Institute for Climate Impact Research

Bo Li, University of Illinois at Urbana-Champaign

Stefan Liess, University of Minnesota

Norbert Marwan, Potsdam Institute for Climate Impact Research

Patrick McDermott, University of Missouri

Peer Nowack, Imperial College

Scott Osprey, University of Oxford

Jakob Runge, German Aerospace Center

Savini Samarasinghe, Colorado State University

Joanna Slawinska, University of Wisconsin-Milwaukee

Brian Smoliak, WindLogics

Eniko Szekely, Swiss Data Science Center, EPFL/ETH

Pierre Tandeo, IMT-Atlantique

Michele Volpi, Swiss Data Science Center

Jakob Zscheischler, ETH Zurich

Accepted Papers


Calibration of probabilistic ensemble forecasts for Indian summer monsoon rainfall: a non-gaussian approach

Nachiketa Acharya


Towards Generative Deep Learning Emulators for Hydroclimate Simulations: Application to Snowpack Modeling

Adrian Albert, Alan Rhoades, Daniel Feldman, Andrew Jones, Sangram Ganguly and Mr Prabhat


Modeling Dynamical Systems with PDE inspired Neural State Space Models

Ibrahim Ayed, Emmanuel de Bézenac, Arthur Pajot and Patrick Gallinari


Modelling socio-economic impacts of weather and climate with climada

Gabriela Aznar Siguan and David N. Bresch


Deep Convolutional Networks for Feature Selection in Statistical Downscaling

Jorge Baño-Medina and Jose Manuel Gutiérrez


An AI Approach To Determining Time of Emergence of Climate Change

Elizabeth A. Barnes, Chuck Anderson and Imme Ebert-Uphoff


Emulating Earth System Model Temperatures

Lea Beusch, Lukas Gudmundsson and Sonia Seneviratne


A Statistical Model to Predict the Extratropical Transition of Tropical Cyclones

Melanie Bieli, Adam H. Sobel and Suzana J. Camargo


Identifying clustered weather patterns using a deep convolutional neural network: a test case

Ashesh Chattopadhyay and Pedram Hassanzadeh


Studying Extremal Dependence in Climate Using Complex Networks

Imme Ebert-Uphoff, Whitney Huang, Adway Mitra, Dan Cooley, Snigdhansu Chatterjee, Chen Chen and Zhonglei Wang


Fused Deep Learning for Hurricane Track Forecast from Reanalysis Data

Sophie Giffard-Roisin, Mo Yang, Guillaume Charpiat, Balázs Kégl and Claire Monteleoni


On a Technique for Evaluating the Quality of Earth System Models

Kaibo Gong, Snigdhansu Chatterjee and Amy Braverman


Detection of Tundra Lake Patterns Observed in Historical Maps and Satellite Imagery

Ming Gong, Ivan Sudakov and Thao Nguyen


Climate research reproducibility with the climate4R R-based framework

Jose Manuel Gutierrez, Joaquín Bedia, Maialen Iturbide, Sixto Herrera, Rodrigo Manzanas, Jorge Baño-Medina, María Dolores Frías, Daniel San-Martín, Jesús Fernández and Antonio Cofiño


Adversarial Networks for Satellite Simulation and Dataset Translation

David Hall, Jebb Stewart and Craig Tierney


Ensemble Consistency Testing Using Causal Connectivity

Dorit Hammerling, Imme Ebert-Uphoff and Allison Baker


Evaluating proxy influence in data assimilation based climate field reconstructions using data depth

Trevor Harris, Bo Li, Nathan Steiger, Jason Smerdon and Derek Tucker


Land climate prediction using sea surface temperatures

Sijie He, Xinyan Li, Vidyashankar Sivakumar and Arindam Banerjee


Classification of climate for human comfort using unsupervised clustering in Colombia

Roland Hudson and Rodrigo Velasco


Methods to filter weather data using ecological memory functions

Malcolm Itter, Jarno Vanhatalo and Andrew Finley


Physics Guided Recurrent Neural Networks for modeling dynamical systems: application to monitoring water temperature and quality in lakes

Xiaowei Jia, Anuj Karpatne, Jared Willard, Michael Steinbach, Jordan Read, Paul Hanson, Hilary Dugan and Vipin Kumar


Learning Climate Patterns on the Sphere: Convolutional Neural Network on Unstructured Mesh

Chiyu Jiang, Karthik Kashinath and Mr Prabhat


Machine learning @ numerical modeling: decadal climate predictions improved by ensemble dispersion filter

Christopher Kadow, Sebastian Illing, Igor Kröner, Uwe Ulbrich and Ulrich Cubasch


Deep-dust: Predicting concentrations of fine dust in Seoul using LSTM

Sookyung Kim, Jungmin M. Lee, Jiwoo Lee and Jihoon Seo


Deep-Hurricane-Tracker: Tracking Extreme Climate Events

Sookyung Kim, Sangwoong Yoon, Joonseok Lee, Samira Kahou, Hyojin Kim, Karthik Kashinath and Mr. Prabhat


Satellite chlorophyll-a images inpainting

Mélissa Kouassi, Ariel Rimoux, Clément Soriot, Julien Brajard, Anastase Charantonis and Sylvie Thiria


Statistical predictions of climate indices using response-guided causal precursor detection

Marlene Kretschmer, Giorgia DiCapua, Jakob Runge and Dim Coumou


Local Sampling Machine Learning Technique for Refinement of Water Classification

Charles Labuzzetta, Zhengyuan Zhu and Yuyu Zhou


High-dimensional Data Visualization for Climate Model Intercomparison: Application of the Circular Plot

Jiwoo Lee, Peter Gleckler, Kenneth Sperber, Charles Doutriaux and Dean Williams


Seasonal Predictions of Sea Surface Temperature in the Tropical Atlantic using a Deep Neural Network Model Combined with Sparse Canonical Correlation Analysis

Carlos Lima


Toward a topological pattern detection in fluid and climate simulation data

Grzegorz Muszynski, Karthik Kashinath, Vitaliy Kurlin, Michael Wehner and Prabhat Prabhat


Machine learning parameterizations for ozone in climate sensitivity simulations

Peer Nowack, Peter Braesicke, Joanna Haigh, Luke Abraham, John Pyle and Apostolos Voulgarakis


A Toolbox for Causal Discovery in Climate Science

Joseph Ramsey, Kun Zhang, Madelyn Glymour, Ruben Sanchez Romero, Biwei Huang, Imme Ebert-Uphoff, Elizabeth Barnes and Clark Glymour


Topological characterization of shallow cumulus cloud fields using persistent homology

Jose Licon Salaiz, Henri Riihimäki and Thirza W. van Laar


Causal discovery in the presence of latent variables for climate science

Savini Samarasinghe, Elizabeth Barnes and Imme Ebert-Uphoff


Causality analysis of ecological time series: a time-frequency approach

Maha Shadaydeh, Yanira Guanche, Miguel Mahecha, Markus Reichstein and Joachim Denzler


Localized time series modeling of Greenland ice sheet elevation changes

Prashant Shekhar, Beata Csatho, Toni Schenk and Abani Patra


Spatio-temporal climate data causality analytics – an analysis of ENSO’s global impacts

Hua Song, Jianwu Wang, Jing Tian, Jingfeng Huang and Zhibo Zhang


SupernoVAE: VAE based Kernel-PCA for analysis of spatio-temporal earth data

Xavier-Andoni Tibau, Christian Requena-Mesa, Christian Reimers, Joachim Denzler, Veronika Eyring, Markus Reichstein and Jakob Runge


A Deep Learning Perspective of the Madden-Julian Oscillation

Benjamin A. Toms, Karthik Kashinath, Prabhat, Mayur Mudigonda, Chiyu Jiang and Da Yang


Climate Field Completion via Markov Random Fields

Resherle Verna, Adam Vaccaro, Julien Emile-Geay and Dominique Guillot


Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences

Jianwu Wang, Matthias Gobbert, Zhibo Zhang and Aryya Gangopadhyay


Physics-Informed Generative Learning to Emulate PDE-Governed Systems by Preserving High-Order Statistics

Jinlong Wu, Karthik Kashinath, Adrian Albert, Mr Prabhat and Heng Xiao


Detecting the effects of irrigation on regional temperature extremes

Yeliz Yilmaz


Linking information theory to west Sahel precipitation dynamics

Qing Zhu, Bessie Y Liu, Lei Zhao and Hongxu Ma

Mailing List

Please follow this link to add yourself to the CI 2018 Mailing List: http://climateinformatics.us15.list-manage.com/subscribe?u=4c133efdef1698107f37032b1&id=13f52538a8