Physical oceanography is a data-intensive science, with increasingly large amounts of data generated from observations and numerical models. With high-resolution, detailed datasets comes the expectation of improved understanding of our oceans and climate. However, it is a challenge for the ocean and climate community to benefit from this newfound richness of data. Traditional software tools and approaches do not scale well enough to deal with the volume of data produced. Individual researchers, both established and early in their careers, may lack the technical skills to work with such immense datasets. My research is about discovering new methods that enable these researchers to work with big data in physical oceanography.
Across many scientific disciplines, there has been a surge of interest in data science, big data, and data-driven discovery as a novel paradigm in science, building on the more established methods of observation and experiment, theory, and computational simulation. This new approach has generated frameworks that are used extensively in fields such as bioinformatics, high-energy particle physics, and astronomy. I am exploring how such a data-intensive approach could benefit physical oceanography.
Numerical ocean models produce very large volumes of output. Global ocean circulation models at tenth-degree resolution are currently in use, and there are plans to develop models at thirtieth-degree resolution over the next few years. Such simulations are typically run for several decades of model time, generating a significant amount of data. There is an ongoing demand from the ocean modelling community for new methods to analyze such output efficiently. By leveraging computational frameworks from other scientific disciplines, I am developing new software frameworks that meet that demand.
A historical challenge in oceanography has been the limited amount of observational data. The ocean is effectively opaque to electromagnetic radiation, and in-situ measurements usually require very expensive ship-based observations. Recently, there has been a significant increase in data richness as new initiatives such as cabled ocean observatories and autonomous sensor platforms are deployed. Programs such as Ocean Networks Canada and the Ocean Observatories Initiative in the U.S. are beginning to produce unprecedented amounts of oceanographic data at very high resolutions. I am pursuing new techniques and technologies to perform data-driven discovery with large, varied oceanographic datasets.
My research program seeks to enable physical oceanographers to efficiently gain value from the large datasets they obtain from both numerical models and observations. I propose to demonstrate the efficacy of these new data-intensive methods, with the goal of minimizing the time required to train graduate students and reducing the effort required by other researchers to make novel discoveries from oceanographic datasets. Importantly, the development of these new methods and software tools is not done in isolation. In practice, my research is necessarily collaborative, which maximizes its overall impact.