The Theory, Science and “Art” of Data Assimilation for Numerical Weather Prediction
Data assimilation is a powerful technique that is widely used for Earth system applications (atmosphere, ocean, sea ice, land, waves) to combine observations with a numerical model to produce a forecast that is better than either the observations or the model alone. It is a multi-disciplinary science, combining elements of Earth system science, remote sensing and instrumentation, applied mathematics, computer science, and electrical engineering.
Data assimilation for NWP is a sequential process that mathematically combines current observations with a model forecast (or background), along with their respective error estimates, to obtain the best estimate of the current model state (e.g. the atmosphere). These analyses (initial conditions) have numerous applications, but are most often used to initialize the NWP model, including the short-range forecast for the next data assimilation cycle. Many of the advancements in the skill of weather forecasting over the past two decades can be attributed to improvements in data assimilation, to the global observing system (especially satellites), and to the effective assimilation of remotely sensed observations.
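The way a background and an observation are weighted by their error estimates can be illustrated with the simplest possible case: one model variable and one direct observation of it. The sketch below is only an illustration under those assumptions (scalar state, unbiased and uncorrelated errors); the function name and the numbers are hypothetical and are not taken from any operational system.

```python
def scalar_analysis(background, obs, var_b, var_o):
    """Combine one background value with one observation of the same quantity.

    var_b and var_o are the assumed background and observation error variances.
    """
    # Weight given to the observation: close to 1 when the background is
    # uncertain, close to 0 when the observation is uncertain.
    gain = var_b / (var_b + var_o)
    # The analysis is the background corrected by the weighted innovation.
    analysis = background + gain * (obs - background)
    # The analysis error variance is smaller than either input variance.
    var_a = (1.0 - gain) * var_b
    return analysis, var_a, gain


# Illustrative values: a 280 K background and a 282 K observation
# with equal error variances of 1 K^2.
analysis, var_a, gain = scalar_analysis(280.0, 282.0, var_b=1.0, var_o=1.0)
print(analysis, var_a, gain)
```

With equal error variances the analysis falls halfway between the background and the observation. Increasing var_o pulls the analysis back toward the background.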
In this presentation, I highlight the challenges of developing and implementing operational data assimilation systems for NWP. I first summarize the underlying theory and science of modern data assimilation systems. I next describe the “art” of data assimilation, much of which involves estimating the observation and background error covariances (as truth is not known), selecting a “good mix” of observations, and balancing system complexity against computational timing constraints. The background error covariances are especially important, as they govern the spread of the observation information to the surrounding model grid points and variables, and partially control the relative weights given to the observations and the background. These background error covariances are estimated through various techniques, including static parameterizations that include multivariate balances found in nature, and ensemble estimates that include the situation- or flow-dependent “errors of the day”. Deciding on the optimal or “hybrid” combination of these background error statistics requires extensive tuning, and ultimately a final judgment call.
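The role of the background error covariance can be sketched with a toy state of two grid points and a single observation at the first point. The code below assumes a hybrid covariance formed as a simple weighted sum of a static and an ensemble-derived matrix; the matrices, the 0.5/0.5 weights, and the function names are hypothetical choices for illustration, not tuned or operational values.

```python
def hybrid_covariance(b_static, b_ensemble, w_static, w_ensemble):
    """Weighted sum of a static and an ensemble background error covariance."""
    return [
        [w_static * s + w_ensemble * e for s, e in zip(row_s, row_e)]
        for row_s, row_e in zip(b_static, b_ensemble)
    ]


def single_obs_increment(b, obs_index, innovation, var_o):
    """Analysis increment at every grid point from one direct observation.

    The column of b at obs_index spreads the innovation to the other points.
    """
    denom = b[obs_index][obs_index] + var_o
    return [row[obs_index] * innovation / denom for row in b]


# Hypothetical 2x2 covariances: the static one correlates the two points,
# the ensemble one (for this particular "day") does not.
b_static = [[1.0, 0.5], [0.5, 1.0]]
b_ensemble = [[1.0, 0.0], [0.0, 1.0]]
b_hybrid = hybrid_covariance(b_static, b_ensemble, 0.5, 0.5)

# Observation at grid point 0, innovation (obs minus background) of 2.0,
# observation error variance of 1.0.
inc_static = single_obs_increment(b_static, 0, 2.0, 1.0)
inc_hybrid = single_obs_increment(b_hybrid, 0, 2.0, 1.0)
print(inc_static, inc_hybrid)
```

In this toy case the increment at the observed point is the same for both covariances, while the increment at the unobserved point is halved by the hybrid. The choice of weights therefore changes how far the observation information spreads, which is the quantity that has to be tuned.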
Finally, I outline the challenges for the next decade. In particular, data assimilation for high-resolution global cloud-resolving or convection-permitting models requires the development of computationally efficient methods that include non-Gaussian error distributions, allow for more nonlinearity, and incorporate additional probabilistic information from ensembles.