# Introduction to Spectral Analysis¶

In this assignment, we will look at the basics of spectral analysis. As complex-valued or bivariate data is quite common in the earth sciences, we will work with horizontal velocity data from an oceanographic current meter.

We'll be working with data from the m1244 mooring in the Labrador Sea, which you can get in NetCDF form here. (This is included in the full distribution of the course materials.)

Many thanks to Jan-Adrian Kallmyr for helping with translating the Matlab tutorial into Python.

Let's take a quick look at the dataset.

# The Periodogram, a.k.a. the Naive Spectral Estimate¶

The first point to know is that the spectrum is not something that can be computed. Instead, it is estimated. The true spectrum is a theoretical object that can't be computed unless you have an infinite amount of time and perfect sampling. The things that can be computed are called spectral estimates or estimated spectra.

Firstly, we will look at the modulus-squared Fourier transform. It is common for this to be referred to as ‘the spectrum’. However, this terminology is incorrect and misleading. Instead, the modulus-squared Fourier transform is a type of spectral estimate called the periodogram. It is known, actually, to be a very poor spectral estimate for reasons we will learn about in the course notes.

The x-axis is simply the index number of the terms in the squared discrete Fourier transform.

Here we have set the y-axis to be logarithmic. This is often useful in dealing with spectra. Note also that we have removed the mean prior to taking the fft, which minimizes broadband bias from the zero frequency.

In the array output by fft, positive frequencies (circles rotating in a counterclockwise sense) are on the left, while negative frequencies (circles rotating in a clockwise sense) are on the right, appearing in reverse order as we have discussed in class. Thus the highest resolvable frequency, called the Nyquist frequency, is in the middle.

You can see clearly that the periodogram roughly has a certain symmetry if you reflect it about the middle. However, it is not completely symmetric because, with a complex-valued signal such as velocity, the twin frequencies (positive and negative rotations at the same absolute frequency) are not required to cancel.

We will discuss the nature of the features we see here shortly. For the moment, we compare this with the periodogram of just the eastward velocity, the real part of our complex-valued velocity time series:

Now, the periodogram is perfectly symmetric about the center or Nyquist frequency. As discussed in the course notes, twin frequencies occur in conjugate pairs, with the same magnitudes but reversed phase. Because of this, two complex exponentials rotating in opposite directions cancel to yield a real-valued, phase-shifted sinuosoid. The symmetry we see in the periodogram is a reflection of the fact that the time series is real-valued.

Adding two oppositely-rotating circles in general leads to an ellipse. When the magnitude of one circle vanishes the result is a circle. When the magnitude of both circles are identical, the result is a line, and this is the case for a real-valued time series.

# One-Sided Spectra¶

There are several inconvenient aspects of presenting the periodogram in this way. Firstly, it is a little bit of a hassle to figure out where the frequencies are. Secondly, it is a bit odd to see the positive and negative frequencies meet in the middle at the Nyquist. Thirdly, the y-axis should be normalized in a meaningful way.

We'll make a new plot that addresses these issues.

Here the two-sided periodgram has been split into two one-sided portions, called the positive and negative rotary spectra, each containing roughly half as many data points as the original time series. Their structure is still hard to see at the moment; we will get to that later.

We have also normalized the y-axis in a sensible way. The spectrum integrates to the variance, so we would like our spectral estimate to do so, too. For the two-sided periodgram with an odd number of points, using radian frequency, the correct formula to recover the variance is

where the omega-omega is a differential, as would appear in an intergal. When we compare this to the directly computed variance, we find

verifying that our spectral estimate is correctly normalized to obtain the variance as computed in the time domain. These numbers imply a standard deviation of $\sigma=\sqrt{94.5}=9.7$ cm/s.

This explains the units of the spectrum. Its units must the square of whatever the units are of your time series, in this case cm/s, divided by the frequency units, in this case 1/days (recall that radians are dimensionless). This is because the spectrum is a partitioning of variance across frequencies.

Next we put down some markers of some meaningful frequencies: the Coriolis frequency at the latitude of the mooring location---around which the oceanic internal wave field is concentrated---and eight prominent tidal frequencies.

The tidal frequencies appear in three groups. From right to left, there are four semidiurnal tidal frequencies, three diurnal frequencies, and one low-frequency tide, the Mf lunal fortnightly tide at about one cycle per 13.6 days.

As we have discussed, when a frequency is expressed in units of radians per unit time, it is called a radian or angular frequency. When a frequency is expressed in units of cycles per unit time, it is called a cyclic frequency. This is more easily seen in how one writes sinusoid or a complex exponential:

Thus the relationship between the radian frequency $\omega$ and the cyclic frequency $f$ is $f=\omega/2\pi$.