Open Source Altimetry

class: center, middle
<br/><br/><br/><br/>
.title[A new open-source gridded altimeter product]
<br/><br/><br/>
.author[Jonathan M. Lilly]
.institution[The Planetary Science Institute, Tucson, Arizona]
<br/><br/>
.date[October 31, 2022]<br/>
<br/>
.note[Created with [{Liminal}](https://github.com/jonathanlilly/liminal) using [{Remark.js}](http://remarkjs.com/) + [{Markdown}](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) +  [{KaTeX}](https://katex.org)]

---
class: center
# 
<br/><br/><br/><br/><br/><br/><br/><br/>

If you are reading this as a pdf, kindly [{click here}](http://www.jmlilly.net/talks/lilly22-ostst.html) for the html version in order to view the embedded animations.

---
class: left
# Overview

Gridded sea level anomalies and associated velocities are among the most important oceanographic data products.

The standard (essentially, only) product is the CMEMS/AVISO product, based on Optimal Interpolation, which is not open source.

An extensive review of mapping methods suggests that a different method, local polynomial fitting (a.k.a. loess), may be preferable.

In order to isolate the mapping problem from multi-satellite merging, here we employ only the 30-year Jason-class record.

A sweep through the parameter space of local polynomial fitting with an observing system simulation experiment leads to a best fit.

The best fit parameters, when applied to the Jason-class data, yields maps that rival CMEMS despite using a fraction of the data.

This suggests that the use improved mapping methods could greatly improve altimetric gridded products.

---
class: center
##CMEMS vs. New Jason-Only Product

SLA for 2007.  CMEMS (top) vs. new product (bottom).

---
class: left
##Comparison of the Two Products

The two animations in the previous slide appear quite comparable.

Yet whereas the upper panel incorporates Jason-class, ERS-class, Geosat-class, and other missions such as Saral and Cryosat, the lower panel uses Jason-class only.

In other words, a refinement in the mapping algorithm has led to a gridded product comparable to that of CMEMS despite using a fraction of the data.

This suggests that the current Optimal Interpolation algorithm is suboptimal and is substantially oversmoothing the data.

It also suggests that incorporating all available altimeter data into a refined mapping method would lead to greatly improved  products.

This however is a large task that would require a dedicated effort.

(Note: close inspection shows occasional “stripes” in the lower panel due to long-wavelength error; this should be seen as an upstream data processing issue rather than a mapping issue.)

---
class: left
##Local Polynomial Fitting

The idea of local polynomial fitting is to minimize, within the vicinity of some mapping point $(x\_o,y\_o)$, the weighted squared error $\Delta\_p^2(x\_o,y\_o)$ between the observations and a $p$th-order polynomial.

The $N$ observations are arranged into $N$-vectors, with $\tilde{\mathbf{x}}$ and $\tilde{\mathbf{y}}$ being the $x$- and $y$-locations, and  $\tilde{\mathbf{z}}$ being the observed values.

Then the weighted squared error in the vicinity of point $(x\_o,y\_o)$ is

<div>\[\Delta_p^2(x_o,y_o)\equiv\left\|\mathbf{W}\left(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o\right)\left[\tilde{\mathbf{z}}-\mathbf{X}_p\left(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o\right)\widehat{\bm{\beta}}_p\left(x_o,y_o\right)\right]\right\|^2\]</div>

where $\mathbf{W}$ is a matrix of weights, $\mathbf{X}\_p$ is a matrix of coordinate deviations containing powers up to $p$, and $\widehat{\bm{\beta}}\_p$ is the estimated field and its first $p$ derivatives.

---
class: left
##The Estimate Vector $\widehat{\bm{\beta}}_p$

The vector $\bm{\beta}_p$ contains the mean field $z$ and its first $p$ derivatives.

<div>\[\bm{\beta}_0(x_o,y_o)\equiv \begin{bmatrix} z(x_o,y_o) \end{bmatrix}\]</div>

<div>\[\bm{\beta}_1(x_o,y_o)\equiv \begin{bmatrix} z(x_o,y_o)\\[3pt] \frac{\partial z}{\partial x}  (x_o,y_o)
\\[3pt] \frac{\partial z}{\partial y} (x_o,y_o) \end{bmatrix}\]</div>

<div>\[\bm{\beta}_2(x_o,y_o)\equiv \begin{bmatrix} z(x_o,y_o)\\[3pt] \frac{\partial z}{\partial x}  (x_o,y_o)
\\[3pt] \frac{\partial z}{\partial y} (x_o,y_o)\\[3pt]
\frac{\partial^2z}{\partial x^2}  (x_o,y_o)\\[3pt]
\frac{\partial^2z}{\partial x\partial y}  (x_o,y_o)\\[3pt] \frac{\partial^2z}{\partial y^2} (x_o,y_o) \end{bmatrix}\]</div>

which is of length $Q$, with $Q=1$, 3, and 6 for $p=0$, 1, and 2. The estimate vector $\widehat{\bm{\beta}}_p$ is then our error-minimizing estimate of $\bm{\beta}_p$.

---
class: left
##The Deviation Matrix $\mathbf{X}_p$

The matrix $\mathbf{X}_p$ contains the coordinate deviations necessary for a Taylor expansion up to $p$th order.

<div>\[\mathbf{X}_0(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o) \equiv
\left.\begin{bmatrix} 1 \\  \vdots \\ 1  \end{bmatrix}\right\} N \,\,\mathrm{terms}  \]</div>
<div>\[\mathbf{X}_{1}(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o) \equiv
  \begin{bmatrix}
  1 & \tilde{x}_{1}- x_o  & \tilde{y}_{1} -y_o\\
  1 & \tilde{x}_{2}- x_o  & \tilde{y}_{2} -y_o\\
  \vdots & \vdots & \vdots \\
  1 & \tilde{x}_N- x_o  & \tilde{y}_N -y_o
  \end{bmatrix} \]</div>

<div>\[\mathbf{X}_{2}(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o) \equiv
  \begin{bmatrix}
  1 & \tilde{x}_{1}' & \tilde{y}_{1}'& \frac{1}{2}\tilde{x}_{1}'{^2}  & \tilde{x}_{1}'\tilde{y}_{1}'  &\frac{1}{2}\tilde{y}_{1}'{^2}   \\[3pt]
  1 & \tilde{x}_{2}' & \tilde{y}_{2}'& \frac{1}{2}\tilde{x}_{2}'{^2}    & \tilde{x}_{2}'   \tilde{y}_{2}'  &\frac{1}{2} \tilde{y}_{2}'{^2}  \\
   \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
  1 & \tilde{x}_N'  & \tilde{y}_N'& \frac{1}{2} \tilde{x}_N'{^2}  & \tilde{x}_N'   \tilde{y}_N'  &\frac{1}{2}\tilde{y}_N'{^2}
  \end{bmatrix}  \]</div>

In the last expression, we define $\tilde x\_n'\equiv \tilde x\_n - x\_o $ and $\tilde y\_n'\equiv \tilde y\_n - y\_o $.

---
class: left
##The Weighting Matrix $\mathbf{W}$

Finally, the weighting matrix  $\mathbf{W}$ is defined as

<div>\[ \mathbf{W}\left(\tilde{\mathbf{x}},\tilde{\mathbf{y}},x_o,y_o\right)
\equiv \begin{bmatrix}
 K_h\left(\tilde{r}_{1}\right) & 0& \cdots & 0\\
       0        & K_h\left(\tilde{r}_{2}\right) & \cdots &0\\
 \vdots & \vdots & \ddots & \vdots\\
    0    &   0     &   \cdots     & K_h\left(\tilde{r}_N\right)
        \end{bmatrix}\]</div>

where $\tilde r\_n\equiv \sqrt{\left(\tilde{x}\_n-x\_o\right)^2+\left(\tilde{y}\_n-y\_o\right)^2}$ is the distance between the  mapping point $(x\_o,y\_o)$ and the $n$th observation point $(x\_n,y\_n)$.

Here $K_h(r)\equiv K(r/h)$ where $K(r)$ is a symmetric, decaying function called the kernel.  Generally $K(r)$  is set to vanish outside of a prescribed radius, chosen here to be $r=1$.

The rescaled kernel $K_h(r) $ then vanishes outside of radius $r=h$, where $h$ is known as the *bandwidth*.

---
class: left
##The Error-Minimizing Solution

The problem of finding the local polynomial that minimizes the weighted error

can be shown to have the solution

<div>\[\widehat{\bm{\beta}}_{p}\left(x_o,y_o\right)=\left(\mathbf{X}_p^T
\mathbf{W}\mathbf{X}_p\right)^{-1} \mathbf{X}_p^T  \mathbf{W}  \tilde{\mathbf{z}}\]</div>

the heart of which involves inverting the $Q\times Q$ matrix $\mathbf{X}_p^T
\mathbf{W}\mathbf{X}_p$.

In the case that $p=0$, a linear fit, it can be shown that this reduces to a simple kernel smoothing.

The main free parameters of local polynomial fitting are thus the fit order $p$, the bandwidth $h$, and the choice of kernel $K(r)$.

Anisotropic smoothing, involving a matrix-valued bandwidth parameter, can also be employed, but will not be considered here.

---
class: left
##Choice of Kernel

A variety of different choices of weighting kernel $K(r)$ have been employed in the literature.

Here we propose the general form

<div>\[K(r;\alpha,\beta)\equiv \kappa_{\alpha,\beta} (1-r^\alpha)^\beta \]</div>

where $ \kappa_{\alpha,\beta} $ is a normalizing constant.  This broad family subsumes the most commonly used kernels as special cases.

The use of a general form allows us to readily determine the optimal choice of parameters in an observing system simulation experiment through numerical optimization.

This kernel is usefully reparameterized in terms of the *half-power radius* $R$ and *shape parameter* $\alpha$, with $\beta$ then given by

<div>\[\beta(R,\alpha)\equiv \frac{\ln(1/2)}{\ln\left(1-R^\alpha\right)}  \]</div>

where we note that $K(R;\alpha,\beta)/K(0;\alpha,\beta)=1/2$.

---
class: center
##Local Polynomial Fitting on the Sphere

Local polynomial fitting on the sphere can be accomplished by projecting data into a plane tangent to  the mapping point $(x_o,y_o)$.

Then the previously presented equations all hold, with $\tilde{\mathbf{x}}(x_o,y_o)$ and $\tilde{\mathbf{y}}(x_o,y_o)$ now giving the location within the tangent plane.

---
class: center
##Design Adaptivity and Fixed Population

Fits with $p=1$ or larger have a crucial property in that bias does not arise due to spatial variations in the distribution of data points.

This property, called *design adaptivity* by Fan (1992), corrects a severe defect in the zeroth-order fit. Compare  (a) the true field, (b) the zeroth-order or $p=0$ fit, and (c) the first-order or $p=1$ fit.

The problem of gaps arising from low data density with a fixed bandwidth, as in (c), is solved by letting the bandwidth locally expand to encompass a specified number of observation points, as in panel (d).  This is termed the *fixed population* algorithm herein.

---
class: center
##Bias Estimates

The existence of extensive theory surrounding local polynomial fitting allows errors to be estimated with reasonable accuracy.

Here, the true mapping bias (a) is compared with two different estimates that do not rely on the true (generally unknown) field.

---
class: left
##Comparison with Optimal Interpolation

In the oceanographic community, Optimal Interpolation (OI) is by far the most common mapping method.

Local polynomial fitting (LPF) and OI differ in both theory and practice. Factors in favor of LPF are numerous, including:

* LFP does not require prior knowledge of the covariance structure of the field being mapped.
* Optimal LPF parameters can be chosen or estimated empirically using theoretical results.
* LPF has several degrees of adjustability, whereas OI, strictly speaking, has zero.
* LPF is much faster because the matrix being inverted is vastly smaller—$Q\times Q$, i.e. $6 \times 6$ for a $p=2$ fit, versus $N\times N$ for OI.
* Although OI is theoretically optimal in some sense, key caveats are that (i) real-world performance may depart from optimality on paper and (ii) the sense of optimality is crucial.

For example, OI applied to an eddying ocean is optimal if one wishes to smooth out the eddies to estimate the mean, yet paradoxically is commonly applied to estimate eddy properties.

---
class: left
##Creation of a New Gridded Product

We now turn to the creation of an new open-source gridded product using local polynomial fitting.

For expediency, in order to isolate the mapping aspects of the problem and avoid the great challenge of merging different classes of altimeter data, here we focus entirely on the Jason-class data.

The Jason-class missions have sampled the same tracks continuously since 1992.

To determine the optimal mapping parameters, an observing system simulation experiment will be performed using a synthetic alongtrack dataset constructed from a global numerical model.

---
class: center
##Alongtrack Jason-Class Data

(a) SLA snapshot (b) standard deviation (c) % missing cycles

---
class: left
##About the Data

For alongtrack data, we use the [{Integrated Multi-Mission Ocean Altimeter Data for Climate Research Version 5.0}](https://podaac.jpl.nasa.gov/dataset/MERGED_TP_J1_OSTM_OST_ALL_V50) of Beckley et al.

This dataset has been re-organized such that descending and ascending tracks are separated, as shown on the previous slide, and subjected to some additional despiking.

An advantage of this re-organization is that one can simply plot the alongtrack data, with track number along the x-axis and alongtrack distance along the y-axis, and still make sense of it.

We will also form an empirical estimate of the small-scale noise, to be used later in an observing system simulation experiment.

This noise level estimate is formed by a high-wavenumber alongtrack bandpass using a wavelet transform.

A very high wavenumber band shows atmospheric structure, and can consequently be taken as a noise estimate, while an adjacent lower band shows ocean structure, as shown shortly.

---
class: center
##Despiking Illustration

Selected tracks from the Beckley product before (a) and after (b) median-based despiking.

---
class: center
## Change in Noise Level Between Missions

The median value of the absolute second central difference represents an estimate of the noise level, as it identifies the very smallest-scale variability in the absence of other structure.

Divisions between the TOPEX and Poseidon instruments are clear, as is the transition from TOPEX/Poseidon to the Jason satellites.

Nevertheless the time variabililty of the noise level will be ignored in a first approximation.

---
class: center
##An Empirical Noise Estimate

Alongtrack variability in (a) a very high band or (b) a somewhat lower band.  We can use (a) as an empirical noise estimate.

---
class: left
##Creating a Synthetic Alongtrack Dataset

To determine the best mapping parameters, we perform an observing system simulation experiment.

A synthetic alongtrack dataset is created by looking up sea surface height within an ocean model at the exact sample times and locations occurring in the data.  Spatially-varying noise is then added to match the empirically estimated noise in the real data.

For a model, we use a year-long, 1/8th degree simulation with the GOLD model performed by H. Simmons (Simmons and Alford, 2012) and generously made available for this study.

This simulation has the advantage of its surface fields having been saved at the full model resolution every hour.

This means that alongtrack spatial and temporal structure are both well resolved, leading to a high-quality simulated product.

Because tides are removed from the real alongtrack altimeter data in upstream processing, we use a version of the GOLD simulation that does not include tidal forcing.

---
class: center
##A Snapshot from the GOLD Model

A snapshot of sea surface height anomaly from the GOLD model, with Jason-class altimeter tracks overlaid and with the longest continuous ocean-only track shown as a heavy line.

---
class: left
##Determining the Best Mapping Parameters

Next we sweep through parameter space in order find the best set of parameters.

Parameters are allowed to vary as a function of latitude.

Error is quantified by the averaged squared deviation between the resulting map and the colocated true model value.

If any gaps emerge in the mapped product between +/- 66 degrees for a certain parameter choice, the error is set to infinity.

After this sweep, a simple parametric form is chosen if the error-minimizing choice is observed to vary with latitude, and its parameters are estimate through numerical optimization.

Otherwise, the best constant value across latitudes is determined.

This is done within the contact of fixed population fits, as these naturally adapt to data gaps.

---
class: center
##Optimal Latitude-Dependent Parameters

An exhaustive  sweep through (a) half-power radius $R$, (b) population $N$, and (c) shape parameter $\alpha$.  Importantly, the former  exhibits clear latitude dependence, but the other variables do  not.

The heavy gray curves are the best fits as a function of latitude.

---
class: left
##A Beta Version of a New Gridded Product

Using these best-fit curves, maps were created from the real-world alongtrack data for every altimeter cycle during the single year 2007, which is the same year for which the model was run.

This leads to the animation shown in the bottom panel earlier.

The favorable comparison with CMEMS, despite using only a fraction of the data, suggests considerable improvements in the gridded products could be realized through refining our mapping algorithms.

In other words, applying the local polynomial fitting method to all available altimeter data should result in greatly improved maps.

---
name: conclusions
class: left
## Next Up

The work presented in this talk is formulated in a series of drafts, which are expected to be submitted for publication in early 2023.

The new open-source gridded product for the entire 30-year Jason record will be made publicly available at that time.

The mapping code upon which this is based is already available as a part of [{jLab}](http://www.jmlilly.net/software.html), the author's Matlab toolbox.

This software is being substantially revised and refactored in conjunction with the forthcoming publications.

---
class: middle, center
#  Thank you!

For all submitted papers as well as my software toolbox, kindly visit my [{website}](http://www.jmlilly.net).