In this lab, you'll be working with current meter data, that is, observations of the horizontal ocean currents made from a fixed point. The main analysis tool we will learn are variance ellipses, a fundamental tool for analyzing velocity datasets or any kind of bivariate (or complex-valued) data. You'll see their relationship to the two-dimensional histogram, and will learn some associated analysis and visualization ideas, in particular the value of finding a suitable coordinate rotaiton.
For this assignment, we are going to use a mooring from the Labrador Current on the west side of the Labrador Sea known as the ‘m1244’ mooring. Please download my version of it here and put it on your Matlab search path if you have not done so already. (This is included in the full distribution of the course materials.)
This notebook requires my jlab toolbox to be installed. You will also need to have set up Jupyter Lab to work with Matlab, following these instructions.
Now let's load and examine the data.
set(groot,'defaultfigurepaperposition',[1 1 7 5]) %set default figure size
clear
load m1244
m1244
m1244 = struct with fields: description: '1244 mooring 1994-1995' link: 'https://www.nodc.noaa.gov/woce/woce_v3/wocedata_1/cmdac/netcdf/acm29/acm29.htm' creator: 'Created by the L_M1244 script by J.M. Lilly' timestamp: '18-Oct-2018 05:40:00' lat: 55.478599548339844 lon: -53.654998779296875 depths: [201 1001 1501 2751] num: [7371x1 double] p: [7371x4 double] t: [7371x4 double] cv: [7371x4 double]
The data is organized as a structure with different fields. The field variables are as follows:
depths --- Depths of 4 different currents meters, in meters
num -- date in Matlab's "datenum" format
p --- Pressure in decibar
t --- Temperature in Centigrade
cv --- Complex velocity u+iv in cm/s, u = eastward, v = northward
Ordinarily, you would access these with the notation "m1244.cv", and so forth. However, jLab has a function called "use" that will map all the fields in a structure into variables in memory.
use m1244
whos
Name Size Bytes Class Attributes creator 1x43 86 char cv 7371x4 471744 double complex depths 1x4 32 double description 1x22 44 char lat 1x1 8 double link 1x78 156 char lon 1x1 8 double m1244 1x1 1004678 struct num 7371x1 58968 double p 7371x4 235872 double t 7371x4 235872 double timestamp 1x20 40 char
As you can see all of the fields now exist as variables. This makes it convenient to have different datasets with the same variable names, and swap out which one we are working with at the moment, thus keeping our variable names short and consistent.
Before we proceed let's find out the sampling interval and duration of the time series.
(num(2)-num(1))*24 %Sampling interval in days converted to hours
(num(end)-num(1))/365 %Duration in days converted to years
ans = 2.000000000931323 ans = 1.682648401826378
So we have a roughly 1.7 year record of the currents sampled every two hours.
Let's take a look at the currents the deepest depth. Note that I often like to use year.fraction as a simple time axis for time periods of a year or more.
%Note that I'll be adjusting the 'paperposition' property to have the figures display nicely.
uvplot(yearfrac(num),cv(:,4)),axis tight
%uvplot is a jlab function that plots the real and imaginary part of a complex variable
legend('Eastward Velocity','Northward Velocity')
ylabel('Velocity (cm/s)')
title(['Currents at ' int2str(depths(4)) ' m'])
Here a mean value is readily apparent in both the eastward and northward components. In addition, we see a lot of small-scale variability, possibly with a greater amplitude in the eastward component. Hints of multiple timescales of variability are present, with a fine noise-like variability superposed on somewhat longer timescales.
It is a little difficult to make sense of this plot, however, because it is not rotated into the most useful coordinate system. Let's figure out a sensible rotation angle.
To do this, we will look at the progressive vector diagram. For a current measurement at a point, the progressive vector diagram is defined as a plot of its time integral. In other words, the progressive vector diagram shows the displacement that would occur if a particle were advected with the given currents.
dt=num(2)-num(1); %sampling interval in days
dt=dt*3600*24; %sampling interval in seconds
cx=cumsum(cv)*dt; %complex displacement x+iy in centimeters
cx=cx/100/1000; %complex displacement x+iy in kilometers
plot(cx) %Matlab plots the real and imaginary parts against each other
xlabel('Displacement East (km)')
ylabel('Displacement North (km)')
xtick([-10:2:10]*1000),ytick([-10:2:10]*1000),axis tight
%Create a legend for the figure lines
clear str
for i=1:4
str{i}=['Depth ' int2str(depths(i)) ' m'];%use a cell array for the labels
end
legend(str)
title('Progressive Vector Diagram for m1244')
Here, the currents appear to be directed due southeast. However, there is a problem with this plot: the x and y axes, which are the same physical quantity of displacement, correspond to different physical lengths! If we re-scaled the figure window, the direction of the mean flow would appear to change. Thus our perspective on the comparison between the eastward and northward currents is distorted.
This problem is fixed by setting the data aspect ratio. Not setting the data aspect ratio correctly is actually a common problem, even, regrettably, in many published papers.
We'll now remake the above figure with the correct aspect ratio.
set(gcf,'paperposition',[1 1 7 4]) %set suitable figure size
plot(cx) %Matlab plots the real and imaginary parts against each other
xlabel('Displacement East (km)')
ylabel('Displacement North (km)')
xtick([-10:2:10]*1000),ytick([-10:2:10]*1000),axis tight
axis equal %This sets it the data aspect ratio to 1:1 <---------------------
%Create a legend for the figure lines
clear str
for i=1:4
str{i}=['Depth ' int2str(depths(i)) ' m']; %use a cell array for the labels
end
legend(str)
%In jlab the progressive vector diagram with the correct aspect ratio can also
%be plotted with the function “provec”.
Now we see a strong mean flow at all depths directed to the east-southeast. This matches what we saw above with the line plot, where the positive mean of the eastward currents is roughly twice as large as the negative mean of the southward currents.
It's natural to rotate our current meter data so that the east-southeastward direction of the mean flow corresponds to the first velocity component, and the direction normal to that to the second velocity component. In other words, we will rotate our $(u,v)$ data to become $)\tilde u,\tilde v)$ with $\tilde u$ corresponding to the ‘downstream’ direction and $\tilde v$ corresponds to the ‘cross-stream’ direction. We will take a little time to do this so we understand how rotations work.
Firstly we find the direction of the mean flow at the deepest current meter, where the flow is strongest.
angle(mean(cv(:,4)))*360/2/pi %angle in radians converted to degrees
ans = -29.602806827016341
So the angle is about 30 degrees clockwise from due east. That looks about right!
Rotation of a complex-valued number $z=u+iv$ are straightforward. To rotate $z=|z|e^{i\varphi}$ through some angle $\phi$, we simply multiply by $e^{i\phi}$. This leads to a new version of $z$ , denoted $\tilde z =\tilde u+ i\tilde v=|z|e^{i(\varphi+\phi)}$.
In this case, we want to choose the rotation angle $\phi$ as the negative of the angle of the mean flow. Let's look at the mean real and imaginary components of the velocity before and after this rotation.
mean(cv(:,4))
phi=-angle(mean(cv(:,4)));
mean(cv(:,4)*exp(1i*phi))
%note in Matlab, it's best to use 1i and not i for sqrt(-1), since we often use i for a counter
ans = 20.785806533575666 -11.809328448895153i ans = 23.906275152459880 - 0.000000000000001i
This shows that after the rotation, the mean of the cross-stream velocity (the imaginary part) is zero, as expected. Now let's re-plot the progressive vector diagram.
set(gcf,'paperposition',[1 1 7 2]) %set suitable figure size
plot(cx*exp(1i*phi)) %Matlab plots the real and imaginary parts against each other
xlabel('Displacement East (km)')
ylabel('Displacement North (km)')
xtick([-10:2:10]*1000),ytick([-10:2:10]*1000),axis tight
axis equal %This sets the data aspect ratio to 1:1
At first glance, this plot appears rather boring. But actually, boring can be a good sign because it means we've found a way to look at our dataset in such a way that it simplifies!
Let's return to the line plot we made earlier, but now make it for the rotated velocity data.
rcv=m1244.cv*exp(1i*phi); %Define a rotated version of the velocity
uvplot(yearfrac(num),rcv(:,4)),axis tight,
legend('Downstream Velocity','Cross-stream Velocity')
ylabel('Velocity (cm/s)')
title(['Rotated Currents at ' int2str(depths(4)) ' m'])
The downstream and cross-stream components are seen to be distinctly different. The cross-stream flow oscillates about a mean of zero, which is by construction, and also presents higher-frequency variability than does the downstream flow. Or, perhaps more accuately, it appears that the cross-stream flow is lacking an intermediate-timescale component that is present in the downstream flow.
Let's take a closer look at the timescales present in this dataset with a simple lowpass filter. We'll use a 24-point filter, which corresponds to a 2-day running mean since the sampling interval is 2 hours.
uvplot(yearfrac(num),vfilt(rcv(:,4),24)),axis tight
%vfilt is a jlab routine for filtering that uses, by default, an N-point Hanning filter
legend('Downstream Velocity','Cross-stream Velocity')
ylabel('Velocity (cm/s)')
title(['Rotated Currents at ' int2str(depths(4)) ' m'])
Here we can see clearly that the downstream flow presents an intermediate-timescale variability that is not present in the cross-stream flow.
Our rotation, designed just based on the mean, has thus also revealed distinctions in variability. The variability of the flow is now seen as being anisotropic, that is, lacking the property of being the same in all directions. This was not visible before the rotation because the downsteam and cross-stream components were mixed.
In general, finding ways to "rotate" a dataset, perhaps in an abstract way, in such a way that variability presents anisotropic structure is a quite simple yet powerful approach we can use to unlock its information content.
Here, the different timescales between the two components is not what one would expect if the variability were entirely due to the advection of eddies past the mooring; the timescales in that case would be the same. So this plot informs the physical hypotheses we would frame regarding the nature of the variability.
Next we will summarize the statistics of the velocity through looking at its properties on the $u,v$ plane. First, we'll make a simple line plot of $\tilde u$ versus $\tilde v$---a type of plot known as a hodograph---for the rotated versions of the currents we created earlier. We'll work with the lowpassed version of the time series for the moment.
plot(vfilt(rcv(:,4),24))
axis(49*[-1 1 -1 1]),xtick([-40:10:40]),ytick([-40:10:40])
vlines(0,'k:'),hlines(0,'k:')
%Next, plot the mean as a heavy line ending in a circle
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'ko','markerfacecolor','k')
axis equal %Don't forget to set the aspect ratio!
axis tight
xlabel('Downstream Velocity'),ylabel('Cross-stream Velocity')
title(['Hodograph at ' int2str(depths(4)) ' m'])
The mean flow is plotted for reference. Note that the time integral of this plot is the progressive vector diagram.
At first glance, this plot is not very informative. It basically tells us that there is varability about the mean, with greater variability along the downstream axis then the cross-stream axis, which we already knew. But we can use statsitics to nicely summarize what we are seeing.
To do this we will look at two-dimensional distributions of the velocity. Firstly, we plot the two-dimensional histogram at the deepest depth.
%using two2hist, a jlab routine for finding a two-dimensional histogram
[mat,xmid,ymid]=...
twodhist(real(rcv(:,4)),imag(rcv(:,4)),[-50:50],[-50:50]); %the last two arguments set the x- and y- bin sizes
mat(mat==0)=nan; %swap zero values in the histogram for NaNs, so they don't appear with a color
jpcolor(xmid,ymid,mat);hold on
colormap lansey, axis equal, axis(49*[-1 1 -1 1])
vlines(0,'k:'),hlines(0,'k:'),h=colorbar;
h.Label.String='Observations per bin';
plot([0+0*1i mean(rcv(:,4))],'color','w','linewidth',3) %plotting a white edge around the line for visibility
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'wo','markerfacecolor','k')
xtick([-40:10:40]),ytick([-40:10:40])
title('Two-Dimensional Histogram of Rotated Velocity')
This plots shows the number of observations within each bin, which we have specified to be $1\times 1$ cm/s bins ranging from -50 to 50 cm/s in both axes. The white bins are never observed.
The histogram of the velocity has a roughly elliptical shape, elongated along the x-axis. We now have a picture of a "cloud" of possibilities, with, generally speaking, bins closer to the location of the mean flow occurring more frequently.
Note this plot contains more information than was visible in the line plot, because while the line plot also showed us the shape of the distribution, it did not provide us with information as to the relative frequency of occurrence of velocity values.
Two-dimensional statistics are an exceedingly powerly analysis tool. Before proceeding, let's take a look at how we can choose our settings to get the most out of them.
First, you'll want to pay attention to your choice of bin size.
%same plot as above but with x- and y-bins of [-50:5:50]
[mat,xmid,ymid]=...
twodhist(real(rcv(:,4)),imag(rcv(:,4)),[-50:5:50],[-50:5:50]); %the last two arguments set the x- and y- bin sizes
mat(mat==0)=nan; %swap zero values in the histogram for NaNs, so they don't appear with a color
jpcolor(xmid,ymid,mat);hold on
colormap lansey, axis equal, axis(49*[-1 1 -1 1])
vlines(0,'k:'),hlines(0,'k:'),h=colorbar;
h.Label.String='Observations per bin';
plot([0+0*1i mean(rcv(:,4))],'color','w','linewidth',3) %plotting a white edge around the line for visibility
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'wo','markerfacecolor','k')
xtick([-40:10:40]),ytick([-40:10:40])
Here, the bins have been chosen to be too coarse, so we can't see much structure, and the plot has a blocky look to it.
%same plot as above but with x- and y-bins of [-50:.2:50]
[mat,xmid,ymid]=...
twodhist(real(rcv(:,4)),imag(rcv(:,4)),[-50:.2:50],[-50:.2:50]); %the last two arguments set the x- and y- bin sizes
mat(mat==0)=nan; %swap zero values in the histogram for NaNs, so they don't appear with a color
jpcolor(xmid,ymid,mat);hold on
colormap lansey, axis equal, axis(49*[-1 1 -1 1])
vlines(0,'k:'),hlines(0,'k:'),h=colorbar;
h.Label.String='Observations per bin';
plot([0+0*1i mean(rcv(:,4))],'color','w','linewidth',3) %plotting a white edge around the line for visibility
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'wo','markerfacecolor','k')
xtick([-40:10:40]),ytick([-40:10:40])
Here, the bins have been chosen to be too fine, so the "cloud" of points has been reduced to a mist, and again, we can't see much. Thus, you'll want to play around with your bin sizes to choose one that seems to reveal the most structure. The 1 cm/s bins we used earlier seem ideal for this dataset.
You'll also want to pay attention to your choice of color scale.
%same plot as above but with a different color axis
[mat,xmid,ymid]=...
twodhist(real(rcv(:,4)),imag(rcv(:,4)),[-50:50],[-50:50]); %the last two arguments set the x- and y- bin sizes
mat(mat==0)=nan; %swap zero values in the histogram for NaNs, so they don't appear with a color
jpcolor(xmid,ymid,mat);hold on
colormap lansey, axis equal, axis(49*[-1 1 -1 1])
vlines(0,'k:'),vlines(0,'k:'),hlines(0,'k:'),h=colorbar;
h.Label.String='Observations per bin';
plot([0+0*1i mean(rcv(:,4))],'color','w','linewidth',3) %plotting a white edge around the line for visibility
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'wo','markerfacecolor','k')
xtick([-40:10:40]),ytick([-40:10:40])
caxis([0 20])
If you see a splotch of all one color in your dataset, one likely explanation is that the color axes limits have been set inappropriately, wiping out some of the information. Thus you'll want to check the color axes and see if an improvement could be made.
Finally, one useful trick is to plot not the number of observations per bin, but rather the logarithm of the number of observations, like so:
%same plot as above but with a logarithmic color axis
[mat,xmid,ymid]=...
twodhist(real(rcv(:,4)),imag(rcv(:,4)),[-50:50],[-50:50]); %the last two arguments set the x- and y- bin sizes
jpcolor(xmid,ymid,log10(mat));hold on
mat(mat==0)=nan; %swap zero values in the histogram for NaNs, so they don't appear with a color
colormap lansey, axis equal, axis(49*[-1 1 -1 1])
vlines(0,'k:'),vlines(0,'k:'),hlines(0,'k:'),h=colorbar;
h.Label.String='Log10 Observations per bin';
plot([0+0*1i mean(rcv(:,4))],'color','w','linewidth',3) %plotting a white edge around the line for visibility
plot([0+0*1i mean(rcv(:,4))],'color','k','linewidth',2)
plot(mean(rcv(:,4)),'wo','markerfacecolor','k')
xtick([-40:10:40]),ytick([-40:10:40])