class: center, middle
.title[Uncertain With Respect to What?
The Importance of Conceptual Modelling]
.author[] .coauthor[Jonathan M. Lilly$^1$,
Jeffrey J. Early$^2$, Adam Sykulski$^3$, Shane Elipot$^4$,
Cimarron J. Wortham$^2$, Sofia Olhede$^5$] .institution[$^1$Planetary Science Institute, $^2$NorthWest Research Associates, $^3$Imperial College London, $^4$University of Miami, $^5$École Polytechnique Fédérale de Lausanne]
.date[February 21 & 22, 2024]
.note[Created with [{Liminal}](https://github.com/jonathanlilly/liminal) using [{Remark.js}](http://remarkjs.com/) + [{Markdown}](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) + [{KaTeX}](https://katex.org)] --- class: left ##What Do We Mean by Uncertainty? Given an observed field $z$, we wish to quantify its uncertainty. This means we have, at least implicitly, already introduced the decomposition
$$z = \boxed{z_\star}+ z_\epsilon$$
called an *unobserved components model*. Here $z\_\star$ is the “truth” and $z\_\epsilon$ is a deviation from that truth that we could interpret as an error. Note that $z\_\star$ and $z\_\epsilon$ come in a pair. You can't talk about one without the other. Yet, importantly, neither can be directly observed. The simplest case is where $z$ is a measurement, $z\_\epsilon$ is measurement error, and $z\_\star$ is the true value of the field. More generally, quantifying $z\_\epsilon$ requires introducing a suitable conceptual or statistical *model* for $z\_\star$. This is often the most important, and challenging, part of uncertainty quantification: specifying the model for the “truth”! --- class: left ##Some Examples Splitting the data into a model and an error is a ubiquitous step in data analysis—one which is often not specified explicitly.
$$z = z_\star+ z_\epsilon$$
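To make the decomposition concrete, here is a minimal synthetic sketch in Python. Everything in it is invented for illustration (the sinusoidal “truth,” the noise level, and the running-mean model); it is not the analysis code behind any of the examples that follow.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0, 10, 500)

z_star = np.sin(t)                       # the unobserved "truth" (invented)
z_eps = 0.3 * rng.standard_normal(500)   # the unobserved deviation
z = z_star + z_eps                       # the only quantity we actually observe

# A (deliberately imperfect) model for the truth: a running-mean smoother.
window = 25
z_hat = np.convolve(z, np.ones(window) / window, mode="same")

residual = z - z_hat         # computable: the unmodeled variability
true_error = z_hat - z_star  # not computable in practice: misfit from the truth
print(np.std(residual), np.std(true_error))
```

The residual we can compute and the error relative to the truth are different quantities, which is exactly the distinction drawn next.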
We'll look at three examples in this talk. 1. A kinematic model of an eddy in a mooring time series. 2. A stochastic model for Lagrangian trajectories. 3. A local polynomial fit to alongtrack altimetry. This means the error with respect to the actual truth—*which we don't know*—includes two error sources: 1. One source is due to the unmodeled variability $z\_\epsilon$, e.g., noise 1. The second is *model misfit* (or bias) and arises from the specification of $z\_\star$ Note: the additive model is not the only choice. We could also write $z=z\_\star +\epsilon(z\_\star)$ where the error depends on the truth. --- class: left ##The Basic Idea The power of conceptual modeling is that it lets you *predict* what observations should look like if generated by the proposed process. This in turn lets you use the observations to *estimate* the model parameters—quantifying the variability in a way that could not be done in the absence of the model. The result is a simple way to understand what is going on. The same results are not achieved by either an OSSE or machine learning! Conceptual models fall into two broad categories. 1. Kinematic models are valuable, easy to implement, and often overlooked. 2. Stochastic models are extremely powerful but take considerable effort to identify, refine, and implement. **The key to quantitative analysis of a dataset is often the specification of a suitable conceptual model.** --- class: center, middle # Example 1: A Kinematic Model of an Eddy
$$z = z_\star+ z_\epsilon$$
$z(t)=u(t)+iv(t)$ is a mooring velocity time series $z\_\star(t)$ is the velocity expected by a simple model of an eddy $z\_\epsilon(t)$ is the unexplained portion, or residual --- class: center ## Mystery Features in a Mooring Record
Sudden rotation of the currents concurrent with a temperature anomaly... what do these mean? Let $z\_\star(t)=u\_\star(t)+i v\_\star(t)$ result from a simple model of what eddy currents could look like at a mooring. --- class: center ## Kinematic Model of Interacting Eddies
Currents due entirely to mutually advecting Rankine vortices. This looks just like the real mooring data! On the basis of this model, we can estimate the parameters of the advected eddies—from measurements at a single point. .cite[
{Lilly and Rhines (2002)}
,
{Lilly et al. (2003)}
] --- class: left ## Kinematic Modelling—Lessons Learned Advantages 1. Relatively easy and often quite informative 1. Particularly well suited for eddies and organized structures 1. Great for illustrating what data could look like under a certain hypothesis Disadvantages 1. Not a statistical inference method, so gray areas must be handled subjectively 1. Might require an additional step to connect to dynamics **Do not overlook the power of a simple kinematic model!** --- class: center, middle # Example 2: Quantifying Lagrangian Fluctuations
$$z = z_\star+ z_\epsilon$$
$z(t)=u(t)+iv(t)$ is a Lagrangian velocity $z\_\star(t)$ is the “mean flow” $z\_\epsilon(t)$ is a fluctuation about the mean, i.e., a stochastic process --- class: left ## Quantifying Lagrangian Fluctuations How can we quantify the fluctuations of the ocean currents about a suitably defined mean? Defining the velocity fluctuation relative to the mean as
\[z(t)\equiv u(t) + i v(t)-\left\langle u(t) + i v(t)\right\rangle\]
the velocity autocovariance is defined as
\[R_{zz} (\tau)\equiv\left\langle z(t+\tau)\,z^*(t)\right\rangle \]
and the diffusivity is given by
\[\kappa \equiv \lim_{t\longrightarrow\infty} \frac{1}{4} \frac{d}{d t} \left\langle x^2(t) + y^2(t)\right\rangle =\frac{1}{4} \int_{-\infty}^\infty R_{zz}(\tau)\, d \tau.\]
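As a numerical sanity check on these definitions, the sketch below simulates a complex-valued first-order autoregressive velocity, whose autocovariance $R_{zz}(\tau)=\nu^2 e^{-\lambda|\tau|}$ is known analytically, and verifies that integrating the estimated autocovariance recovers the expected diffusivity $\kappa=\nu^2/(2\lambda)$. All parameter values are invented; this is an illustration, not the analysis code from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n = 0.1, 200_000
lam, nu2 = 0.5, 1.0                  # decay rate and velocity variance (invented)
a = np.exp(-lam * dt)

# Complex AR(1) velocity with R_zz(tau) = nu2 * exp(-lam * |tau|)
w = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
z = np.empty(n, dtype=complex)
z[0] = np.sqrt(nu2) * w[0]
for k in range(1, n):
    z[k] = a * z[k - 1] + np.sqrt(nu2 * (1 - a**2)) * w[k]

# Estimate R_zz at nonnegative lags, then integrate over all lags,
# using the symmetry R_zz(-tau) = R_zz*(tau).
lags = np.arange(200)
R = np.array([np.mean(z[m:] * np.conj(z[: n - m])) for m in lags])
kappa_est = 0.25 * np.real(2 * np.sum(R) - R[0]) * dt

print(kappa_est, nu2 / (2 * lam))    # estimate vs. analytic value
```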
But this reduces the description of the fluctuations about the mean to a single parameter, the diffusivity—one that is very sensitive to the specification of the averaging operator. --- class: left ## Quantifying Lagrangian Fluctuations Knowing $\kappa$ doesn't tell you much about the fluctuations. Many different random processes could have the same value of $\kappa$. A more interesting question is: what type of stochastic process can approximate the fluctuations about the mean? That is, we seek a *conceptual* or *stochastic* model for the fluctuating part of the flow. We propose to model it as a random process with spectrum
\[S_{zz}(\omega) = \frac{A^2}{(\omega^2+\lambda^2)^\alpha}=\int_{-\infty}^\infty e^{-i \omega \tau} R_{zz}(\tau) \, d\tau\]
where $A$ sets the energy level, $\alpha$ sets the high-frequency slope, and $\lambda$ determines a low-frequency transition to a constant value. .cite[
{Lilly, Sykulski, Early, and Olhede (2017). Fractional Brownian motion, the Matérn process, and stochastic modeling of turbulent dispersion.}
] --- class: center ## Capturing Diffusivity ... Using this model, we can accurately capture the diffusivity observed in simulations of quasigeostrophic turbulence...
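As a sketch of the spectral model just introduced (with invented parameter values, not ones fit to any dataset), note that the diffusivity follows directly from the zero-frequency level of the spectrum, since $\kappa=\frac{1}{4}\int_{-\infty}^\infty R_{zz}(\tau)\,d\tau = \frac{1}{4}S_{zz}(0)$:

```python
import numpy as np

def matern_spectrum(omega, A, lam, alpha):
    """Matern velocity spectrum: A^2 / (omega^2 + lam^2)^alpha."""
    return A**2 / (omega**2 + lam**2) ** alpha

A, lam, alpha = 1.0, 0.1, 1.5        # invented example parameters

omega = np.logspace(-3, 1, 400)
S = matern_spectrum(omega, A, lam, alpha)

# High-frequency behavior: the log-log slope approaches -2 * alpha = -3.
slope = np.polyfit(np.log(omega[-100:]), np.log(S[-100:]), 1)[0]

# Diffusivity from the zero-frequency level: kappa = S_zz(0) / 4.
kappa = matern_spectrum(0.0, A, lam, alpha) / 4
print(slope, kappa)
```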
Video: left = numerical model, right = stochastic model. --- class: center ## ... as Well as Trajectory Roughness
Top left: numerical simulations. Top right: Matérn process. Bottom left: white noise. Bottom right: red noise (Brownian). --- class: center ##A Conceptual Model for the Spectrum
\[ S(\omega) = \overset{\mathrm{background}}{\overbrace{\frac{A_b^2}{\left[\omega^2+\lambda_b^2\right]^\alpha}}} +\overset{\mathrm{inertial\,\, oscillations}}{\overbrace{\frac{A_o^2}{\left(\omega-f_o\right)^2+\lambda_o^2}}}+ \overset{\mathrm{semidiurnal\,\,tide}}{\overbrace{\frac{A_s^2}{\left(\omega-f_s\right)^2+\lambda_s^2}}} \]
This is just several Matérn spectra added together. --- class: left ## But... What Does “Mean” Mean? In defining the diffusivity we've made use of $\left\langle \cdot \right\rangle$, an averaging operator. In theory, this means the ensemble average, the average over all the possibilities from an abstract probability space. In reality, we don't have access to this ensemble, so we can't form this average. How can we best approximate it? Is it (i) the temporal average along a trajectory? But a trajectory can move from one region to another with different statistics. Is it (ii) the average at a fixed geographic location? But the ocean can change state at a given location, e.g., Gulf Stream meanders. Is it (iii) the climatological average? Still the ocean can change state ... So how do we form this average? What do we mean by “mean”? This is a hard problem! --- class: left ## Lagrangian Modeling—Lessons Learned 1. A stochastic model provides a good match to fluctuations of the ocean currents about the mean. 1. This model lets us quantify those fluctuations using three parameters (energy, diffusivity, and roughness). 1. A powerful model for analyzing Lagrangian trajectories! 1. The choice of averaging operator $\left\langle \cdot \right\rangle$ matters, and determining an appropriate choice is a challenging problem. .cite[Sykulski, Olhede, Lilly, and Danioux (2016)] .cite[Lilly, Sykulski, Early, and Olhede (2017)] .cite[Sykulski, Olhede, Lilly, and Early (2017)] .cite[Sykulski, Olhede, Guillaumin, and Early (2019)] --- class: center, middle # Example 3: Mapping Alongtrack Altimetry
$$z = z_\star+ z_\epsilon$$
$z(x,y)$ is the true sea surface height $z\_\star(x,y)$ is an estimate generated from scattered observations $z\_\epsilon(x,y)$ is the error --- class: left ## Mapping Alongtrack Altimetry Given $N$ observations of the sea surface height $z\_n$ at longitudes $\theta\_n$ and latitudes $\phi\_n$, we wish to construct a continuous map $z\_\star(\theta,\phi)$. Then $z\_\epsilon(\theta,\phi)$ is the deviation of this map from the truth $z(\theta,\phi)$. This problem has two parts: 1. What is the model for the mapped field $z\_\star(\theta,\phi)$? 2. How can you determine the mapping parameters to minimize the error $z\_\epsilon(\theta,\phi)$ given that you don't actually know the truth? --- class: left ## Importance of the Choice of Model Mapping a gradient from non-uniformly distributed data samples using a local polynomial fit. Fit order matters!
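The bullets below summarize the behavior; first, here is a minimal one-dimensional sketch of the effect (an invented linear “truth” sampled nonuniformly, with Gaussian kernel weights; a toy stand-in, not the actual mapping code):

```python
import numpy as np

rng = np.random.default_rng(7)

# Nonuniform sampling of a pure gradient: samples cluster near x = 0.
x = np.sort(rng.uniform(0, 1, 40) ** 2)
z = 2.0 * x                            # true field: a constant gradient of 2

def local_fit(x0, order, bandwidth=0.3):
    """Kernel-weighted local polynomial fit, evaluated at x0."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)     # Gaussian kernel
    coeffs = np.polyfit(x - x0, z, deg=order, w=np.sqrt(w))
    return coeffs[-1]                                  # fitted value at x0

x0 = 0.5
print(local_fit(x0, order=0))   # constant fit: dragged toward the dense samples
print(local_fit(x0, order=1))   # linear fit: recovers the true value 1.0
```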
- Zeroth-order (constant) fits, which are equivalent to a convolution, have terrible bias. - First and higher-order fits are *design-adaptive* (Fan, 1992): there is no bias due to the nonuniform data distribution. - Combining fixed-population (variable bandwidth) kernels with higher-order fits is even more adaptive. .cite[
See also the importance of the choice of covariance function in Optimal Interpolation, in {Wortham and Early (2022). OSM22 poster presentation}.
] --- class: left ##The Importance of OSSEs In an Observing System Simulation Experiment (OSSE), a suitable numerical model is used in place of real-world data. The observations $z$ are taken from the numerical model, sampled as if it were the real world (e.g., with altimeter tracks + noise). But in the unobserved components model
$$z = z_\star+ z_\epsilon$$
the truth $z\_\star$ can also be found from the numerical model. This means the uncertainty or error $z\_\epsilon$ is actually exactly known as the residual $z-z\_\star$. This makes the OSSE an invaluable tool in assessing a proposed conceptual model or analysis method. **If you come up with a new model or method, make sure it does what you expect by testing it in an OSSE!** --- class: center ## Parameter Optimization With an OSSE
Perform the mapping on a numerical model sampled like the alongtrack Jason-class satellite altimetry. Model fields take the place of the truth $z(\theta,\phi)$. Sweep through parameter space to minimize $z\_\epsilon(\theta,\phi)$. Model output thanks to H. Simmons. --- class: center ## Best-Fit Local Polynomial Fit vs. AVISO
Video: left = AVISO, right = forthcoming open-source Jason-only product. Note that the local polynomial fit on the right is very simple! This indicates the potential improvement in data products through refined conceptual & statistical models. .cite[For another example of product improvement through model refinement, see
{Elipot et al. (2016). A global surface drifter data set at hourly resolution}.
] --- class: left ##Conceptual Modeling—Lessons Learned 1. Coming up with a suitable conceptual model is often a critical step in data analysis. -- 1. It can enable you to address new problems, or to address old problems with a new degree of rigor. -- 1. It's really hard. As in, years of work and multiple papers. -- 1. The community would benefit from valuing and prioritizing this step much more than it currently does. -- 1. This involves thinking in more than purely dynamical terms. It involves thinking in terms of the information content of data. -- 1. We need more people working in this space, more training and collaboration, more resources, and more patience. -- 1. The desire for a simple way to think about things is not satisfied by machine learning! --- class: center, middle # Thank you! This talk is available at [http://www.jmlilly.net/talks.html](http://www.jmlilly.net/talks.html) .footnote[ P.S. Like the way this presentation looks? Check out [{Liminal}](https://github.com/jonathanlilly/liminal).] --- class: center, middle # Example 4: Eddy Currents in Lagrangian Trajectories
$$z = z_\star+ z_\epsilon$$
$z(t)=u(t)+iv(t)$ is a Lagrangian velocity $z\_\star(t)$ is the flow associated with a coherent eddy $z\_\epsilon(t)$ is everything else
$$z(t)=x(t) + \mathrm{i} y(t) =z_\epsilon(t) + \boxed{z_\star(t)}.$$
Stochastic background portion $z\_\epsilon(t)$ plus oscillatory portion $z\_\star(t)$. The oscillatory portion $z\_\star(t)$ is modeled as a *modulated ellipse*:
$$z_\star(t)=\mathrm{e}^{\mathrm{i}\theta(t)}\left[a(t)\cos \phi(t)+\mathrm{i}b(t)\sin \phi(t)\right].$$
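As a sketch, this modulated ellipse is simple to synthesize; every parameter path below is invented for illustration:

```python
import numpy as np

t = np.linspace(0, 100, 2000)

# Slowly modulated ellipse parameters (all invented for illustration)
a = 1.0 + 0.3 * np.sin(2 * np.pi * t / 100)   # semi-major axis a(t)
b = 0.5 * a                                   # semi-minor axis b(t)
theta = 0.02 * t                              # slow precession angle theta(t)
phi = 2 * np.pi * t / 10                      # rapidly increasing phase phi(t)

# z_star(t) = e^{i theta(t)} [ a(t) cos(phi(t)) + i b(t) sin(phi(t)) ]
z_star = np.exp(1j * theta) * (a * np.cos(phi) + 1j * b * np.sin(phi))
x, y = z_star.real, z_star.imag               # a looping, precessing signal
```

The analysis described next inverts this construction, recovering $a(t)$, $b(t)$, $\theta(t)$, and $\phi(t)$ from an observed oscillatory time series.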
--- class: left ##Elements of Lagrangian Vortex Extraction This analysis method consists of four steps: 1. *Extracting* oscillatory velocity features using a technique called *wavelet ridge analysis*. .cite[Lilly, Scott, and Olhede (2011)] 2. *Thresholding* with a suitable error quantity to remove false positives. .cite[Lilly and Olhede (2012a)] 3. *Linking* ellipse properties to spatially-integrated properties using an extended version of Stokes' theorem. .cite[Lilly (2018)] 4. *Testing* statistical confidence through comparison with a null hypothesis. .cite[Lilly and Pérez-Brunius (2021b)] The power of this conceptual model comes from a body of theory allowing the parameters to be inferred from the oscillatory time series itself. --- class: center ## Case Study of a Loop Current Eddy
Evidence of three different propagation velocities: 3.5 cm/s, 5 cm/s, stalled --- class: center ## Strong Cyclones Colored by $Ro\_\star$