2015 SEA conference focuses on Python

By Brian Bevirt
05/07/2015 - 12:00am

The UCAR Software Engineering Assembly (SEA) is a community of computational scientists and software engineers working on scientific computing. It promotes their professional development and advocates effective software engineering practices. Its yearly technical conference fosters internal and external collaborations by exploring software engineering methods and tools that focus on scientific disciplines.

For its 2015 conference, held on 13-17 April, the SEA dedicated all nine programming tracks to applications of the Python programming language for scientific computing. Previous SEA conferences had included Python tutorials and presentations, but Python’s increasing utility and popularity in software engineering for science convinced the SEA Seminar Committee, chaired by Davide DelVento (NCAR-CISL), to develop the entire conference around it. This decision was rewarded in several ways. First, the call for speakers produced more offers for presentations than would fit into the week-long conference. Next, participant interest was unprecedented: registrations filled the meeting space more than a week before opening day. Another measure of the strong interest in scientific Python came when a speaker from Texas A&M University could only be scheduled for a Friday afternoon, usually an unpopular time with participants, yet Ramalingam Saravanan’s presentation “Data analysis and visualization using the Python notebook interface” filled the meeting room.

Plenary session, SEA conference
Julianne Blomer (UCAR Integrated Information Services and SEA Best-Practices Committee Chair) introduces the conference’s welcome speaker Jim Hurrell (NCAR Director, standing by far wall) to the full auditorium at NCAR’s Foothills Lab campus. The five-day conference attracted 156 participants from universities, colleges, the software industry, national computing centers, government labs, and international meteorology centers. Photo by Brian Bevirt, CISL
Davide DelVento, Program chair
Davide DelVento managed the program for the 2015 SEA conference. Photo by Marijke Unger, CISL

Python was first released in 1991 by Guido van Rossum, who developed it as a general-purpose, high-level programming language. Its streamlined, easy-to-read syntax allows programmers to express concepts in very few lines of code and therefore shortens project development times. An open-source language that allows anyone to contribute to its development, Python is also very adaptable, supporting projects of any size through multiple programming paradigms (the various ways programmers structure their codes), including object-oriented, imperative, procedural, and functional programming.
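
To make the idea of multiple paradigms concrete, here is a minimal sketch that computes the same average three ways: procedurally, functionally, and with a small class. The function names, the class name, and the sample temperatures are hypothetical and chosen only for illustration.

```python
from functools import reduce

# Hypothetical sample readings in degrees Celsius (illustrative only)
temperatures_c = [12.5, 14.0, 9.5, 11.0]


def mean_procedural(values):
    """Imperative/procedural style: an explicit loop and a running total."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def mean_functional(values):
    """Functional style: fold the values into a sum with reduce."""
    return reduce(lambda acc, value: acc + value, values, 0.0) / len(values)


class Series:
    """Object-oriented style: the data and its behavior live together."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        return sum(self.values) / len(self.values)


print(mean_procedural(temperatures_c))
print(mean_functional(temperatures_c))
print(Series(temperatures_c).mean())
```

All three approaches give the same result for the same input; which one reads best depends on the size and shape of the project.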

Many scientific computing tools and libraries are now commonly shared through Python, and it has quickly become a primary language for scientific data analysis programs. Davide DelVento added, “Its popularity continues increasing because it is easy to use, learn, and teach, and there are so many useful, freely available, well-developed tools and libraries that support science applications and data analysis. Being an open source language is a key reason why Python is so widely used in scientific research: open sharing of information is fundamental to the way science works. Python allows researchers to share their codes across platforms, allowing others not only to reproduce their results, but also to improve or evolve the codes and advance the science.”

The Python language is streamlined, powerful, and easy enough to learn that it allows researchers to build or modify the specialized tools they need for their studies. Fernando Perez (University of California at Berkeley) introduced another capability to conference participants: Python supports scientists’ ability to publish research results through executable books, papers, blogs, and wikis. This new capability allows reviewers anywhere to quickly and easily access research data and source code inside a browser, so they can review one another’s work and reproduce the results.

Fernando Perez, keynote speaker
Fernando Perez delivered the keynote presentation for the 2015 SEA conference, “IPython and Project Jupyter: A language-independent architecture for open computing and data science.” Photo by Brian Bevirt, CISL

Perez works with a large team of collaborators to develop two resources that support executable publications: IPython and Jupyter. As the keynote speaker for the 2015 SEA conference, he presented “IPython and Project Jupyter: A language-independent architecture for open computing and data science,” and he outlined how these open source tools are now offering a transformative path toward scientific publishing and science education. Perez received the 2012 Award for the Advancement of Free Software for creating IPython and for his contributions to the open source scientific Python ecosystem.

IPython is an open format for sharing, publishing, and archiving research and data. Its interactive computing environment allows a programmer or researcher to run experiments, produce results in real time, and display data in many ways. IPython’s executable publication component is called a “notebook” because it is the computing equivalent of a scientist’s lab notebook: a “computational notebook” environment in which researchers anywhere on the web can publish novel computer code and run it immediately in their notebook environment. The IPython home page provides links to instructions for getting started, a gallery of interesting examples, and information about how Project Jupyter grew out of IPython, along with links to help you start using these rich resources. Additional useful information appears on the Talks by Fernando Perez web page.

The SEA conference offered 37 more presentations and tutorials about how Python is being used to advance the scientific research enterprise. The five-day program provided descriptions, instruction, and examples of how Python programs and tools are producing technical advancements in climate modeling, weather prediction, data analysis and visualization, data mining, data transfer, database management, scientific field projects in remote locations, supercomputer performance monitoring and evaluation, parallel code debugging and optimization, software carpentry, and data carpentry. The program summary for the 2015 SEA conference provides slides and videos from these presentations, along with information about all presenters.

Sean Fisk
Sean Fisk worked on a Python project as a SIParCS intern at NCAR in 2013, and he returned in 2015 to present the Master’s degree work he is currently doing with Python. Photo by Marijke Unger, CISL

One of the speakers was Sean Fisk (Grand Valley State University), who first came to NCAR in 2013 for CISL’s SIParCS program. His SIParCS project required him to write an RPython-based solver for partial differential equations, and he is continuing to work with Python for his Master’s degree in Computer Science. His SEA tutorial explained his process of converting a simple Python script into a package that integrates documentation, unit testing, and code quality assurance.
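
As a rough illustration of the kind of conversion described here (this is not Fisk’s actual code), the sketch below shows a one-off calculation moved into a documented function with a unit test beside it. The function name, the test name, and the choice of a temperature conversion are all hypothetical.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit.

    A docstring like this one is what documentation tools pick up once
    the script becomes an importable package.
    """
    return temp_c * 9 / 5 + 32


def test_celsius_to_fahrenheit():
    """Unit test written in the style that test runners such as pytest collect."""
    assert celsius_to_fahrenheit(0.0) == 32.0
    assert celsius_to_fahrenheit(100.0) == 212.0
    assert celsius_to_fahrenheit(-40.0) == -40.0


if __name__ == "__main__":
    # Running the file directly still works like the original script would.
    test_celsius_to_fahrenheit()
    print(celsius_to_fahrenheit(25.0))
```

Keeping the logic in a function rather than at the top level of the script is what makes it possible to import, document, and test it separately.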

A three-day tutorial on Software Carpentry was presented by Jonah Duckles (University of Oklahoma) and Chris Hamm (University of Kansas). Software Carpentry helps scientists and engineers get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. Their hands-on workshop covered basic concepts and tools, including program design, version control, data management, and task automation. Duckles and Hamm’s tutorial began with using the Unix shell to automate tasks, then taught participants to build programs with Python using Git for version control, and concluded by helping them manage their data with SQL.
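
The final step, managing data with SQL, can be sketched from Python using the standard library’s sqlite3 module. This is a minimal example under stated assumptions, not material from the tutorial: the table name, column names, and sample readings are hypothetical.

```python
import sqlite3


def average_by_site(rows):
    """Load (site, temperature) rows into an in-memory database and
    return the average temperature per site, ordered by site name."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table layout, used only for this illustration
        conn.execute("CREATE TABLE readings (site TEXT, temperature_c REAL)")
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT site, AVG(temperature_c) "
            "FROM readings GROUP BY site ORDER BY site"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data
sample_rows = [("Boulder", 12.5), ("Boulder", 14.0), ("Norman", 20.0)]
print(average_by_site(sample_rows))
```

The query does the grouping and averaging inside the database, so the Python code only has to load the rows and read back the summary.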

Chris Hamm, Software Carpentry tutorial
Chris Hamm explains ways to use the Unix shell Bash for task automation during the Software Carpentry tutorial on 15 April. Photo by Marijke Unger, CISL

A two-day tutorial on Data Carpentry was presented by Leah Wasser (NEON) and Mariela Perignon (University of Colorado at Boulder). Data Carpentry, a partner of Software Carpentry, teaches basic concepts, skills, and tools for working more effectively with data. In many research domains, the rapid generation of large amounts of data is fundamentally changing how the research is done. This deluge of data presents great opportunities, but also many challenges for managing, analyzing, and sharing it. Data Carpentry aims to teach the skills that will enable researchers to be more productive by focusing on effective manipulation, visualization, and analysis techniques using the Unix shell and the Python or R languages, along with database management using Excel and SQL. The goal of Wasser and Perignon’s tutorial was for researchers to retrieve, view, manipulate, analyze, and store their own and others’ data in open and reproducible ways.

Leah Wasser, Data Carpentry tutorial
Leah Wasser uses IPython notebook technology to teach her Data Carpentry tutorial on 15 April. Photo by Marijke Unger, CISL

As conference chair, DelVento thanks all the speakers, tutorial instructors, participants, and sponsors who made this event such a success. He also gives special thanks to Jennifer Williamson (NCAR-CISL) and Scott Briggs (NCAR-ASP) for providing administrative support, logistical planning, registration, and website support. In addition, they organized travel and managed all of NCAR’s on-site arrangements, which included the facility, meeting rooms, catering, and room setups.

The SEA Organizing Committee also thanks the National Science Foundation (NSF), whose support made it possible to reduce the registration fee to a very modest amount. Finally, CISL’s RSVP program and NCAR’s Advanced Study Program provided travel and other support for 16 conference speakers and 12 diversity students. The SEA Executive Committee is already looking forward to planning next year’s conference.