Contents
Guidelines
1 Classical and Modern Spectrum Estimation
1.1 Properties of Power Spectral Density (PSD)
1.2 Periodogram-based Methods Applied to Real–World Data
1.3 Correlation Estimation
1.4 Spectrum of Autoregressive Processes
1.5 Real World Signals: Respiratory Sinus Arrhythmia from RR-Intervals
1.6 Robust Regression
Guidelines
The coursework comprises four assignments, whose individual scores yield 80% of the final mark. The remaining 20%
accounts for presentation and organisation. Students are allowed to discuss the coursework but must code their own
MATLAB scripts, produce their own figures and tables, and provide their own discussion of the coursework assignments.
General directions and notation:
◦ The simulations should be coded in MATLAB, a de facto standard in the implementation and validation of signal
processing algorithms.
◦ The report should be clear and well-presented, and should include the answers to the assignments in chronological
order and with appropriate labelling. Students are encouraged to submit through Blackboard (in PDF format only),
although a hardcopy submission at the undergraduate office will also be accepted.
◦ The report should document the results and the analysis in the assignments, in the form of figures (plots), tables,
and equations, and not by listing MATLAB code as a proof of correct implementation.
◦ The students should use the following notation: boldface lowercase letters (e.g. x) for vectors, lowercase letters
with a (time) argument (x(n)) for scalar realisations of random variables and for elements of a vector, and uppercase
letters (X) for random variables. Column vectors will be assumed unless otherwise stated, that is, $\mathbf{x} \in \mathbb{R}^{N \times 1}$.
◦ In this Coursework, the typewriter font, e.g. mean, is used for MATLAB functions.
Presentation:
◦ The length limit for the report is 42 pages. This corresponds to ten pages per assignment, plus one page for the
front cover and one page for the table of contents. There are no page restrictions per assignment; the limit applies only
to the full report (42 pages).
◦ The final mark also reflects the presentation of the report, which includes legible and correct figures, tables, and
captions, appropriate titles, a table of contents, and a front cover with student information.
◦ The figures and code snippets (only if necessary) included in the report must be carefully chosen, for clarity and to
meet the page limit.
◦ Do not insert unnecessary MATLAB code or the statements of the assignment questions in the report.
◦ For figures, (i) decide which type of plot is the most appropriate for each signal (e.g. solid line, non-connected
points, stems), (ii) export figures in a correct format: without grey borders and with legible captions and lines, and
(iii) avoid the use of screenshots when providing plots and data, use figures and tables instead.
◦ Avoid terms like good estimate, is (very) close, somewhat similar, etc. Use formal language and quantify your
statements (e.g. in dB, seconds, samples).
◦ Note that you should submit two files to Blackboard: the report in PDF format and all the MATLAB code files
compressed in a ZIP/RAR format. Name the MATLAB script files according to the part they correspond to (e.g.
SEASP_Part_X-Y-Z.m).
Honour code:
Students are strictly required to adhere to the College policies on student responsibilities. The College has zero tolerance for plagiarism. Any suspected plagiarism or cheating
(prohibited collaboration on the coursework, including posting your solutions on internet
domains and similar) will lead to a formal academic dishonesty investigation. Being found
responsible for an academic dishonesty violation results in a discipline file for the student
and penalties, ranging from a severe reduction in marks to expulsion from the College.
1 Classical and Modern Spectrum Estimation
Aims: Students will learn to:
• Understand the challenges in spectrum estimation of real–valued data.
• Consider spectral estimation as a dimensionality reduction problem and learn how to benefit from a Machine Intelligence approach to spectral estimation.
• Perform practical spectrum estimation using parametric models, understand the issues with the resolution, bias and
variance, so as to motivate modern subspace approaches.
• Understand the pivotal role of correct estimates of second order statistics in applications of spectral estimation,
and the effects that biased vs. unbiased estimates of the correlation function have on practical estimates of power
spectra.
• Understand the benefits and drawbacks of model–based parametric and line spectra. Learn how these spectra
mitigate the problems with the bias and variance of classic spectral estimators.
• Use dimensionality reduction techniques to resolve the time–frequency uncertainty issues in frequency–based Machine Intelligence.
• Learn how to deal with the problems of ill–conditioning of data correlation matrices and perform advanced Principal
Component Regression type of estimation.
• Verify these concepts on real world examples in Brain Computer Interface, and in the conditioning and estimation
parameters of your own Electrocardiogram (ECG) and respiration signals.
• Understand how the above concepts can be utilised in future eHealth applications.
Background. For a discrete-time deterministic sequence $\{x(n)\}$ with finite energy, $\sum_{n=-\infty}^{\infty} |x(n)|^2 < \infty$, the Discrete
Time Fourier Transform (DTFT) is defined as
$$X(\omega) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n} \qquad \text{(DTFT)}. \tag{1}$$
We often use the symbol $X(\omega)$ to replace the more cumbersome $X(e^{j\omega})$. The corresponding inverse DTFT is given by
$$x(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(\omega)\, e^{j\omega n}\, d\omega \qquad \text{(inverse DTFT)}. \tag{2}$$
This can be verified by substituting (2) into (1). The energy spectral density is then defined as
$$S(\omega) = |X(\omega)|^2 \qquad \text{(Energy Spectral Density)}. \tag{3}$$
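A straightforward calculation gives
$$\begin{aligned}
\frac{1}{2\pi}\int_{-\pi}^{\pi} S(\omega)\, d\omega
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} x(n)\,x^*(m)\, e^{-j\omega(n-m)}\, d\omega \\
&= \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} x(n)\,x^*(m) \left[\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-j\omega(n-m)}\, d\omega\right] \\
&= \sum_{n=-\infty}^{\infty} |x(n)|^2 .
\end{aligned} \tag{4}$$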
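In the process, we have used the equality $\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-j\omega(n-m)}\, d\omega = \delta_{n,m}$ (the Kronecker delta). Equation (4) can now be
restated as
$$\sum_{n=-\infty}^{\infty} |x(n)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(\omega)\, d\omega \qquad \text{(Parseval's theorem)}. \tag{5}$$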
For random sequences we cannot guarantee finite energy for every realisation, and hence the DTFT may not exist. However, a random
signal usually has finite average power, and can therefore be characterised by an average power spectral density (PSD). We
assume zero-mean data, $E\{x(n)\} = 0$, so that the autocovariance function (ACF) of a random signal $x(n)$ is defined as
$$r(k) = E\{x(n)\, x^*(n-k)\} \qquad \text{(Autocovariance function, ACF)}. \tag{6}$$
The Power Spectral Density (PSD) is defined as the DTFT of the ACF in the following way
$$P(\omega) = \sum_{k=-\infty}^{\infty} r(k)\, e^{-j\omega k} \qquad \text{(Definition 1 of Power Spectral Density)}. \tag{7}$$
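The inverse DTFT of $P(\omega)$ is given by $r(k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(\omega)\, e^{jk\omega}\, d\omega$, and it is readily verified that
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P(\omega)\, e^{jk\omega}\, d\omega = \sum_{l=-\infty}^{\infty} r(l)\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j(k-l)\omega}\, d\omega\right] = r(k).$$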
Observe that
$$r(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(\omega)\, d\omega. \tag{8}$$
Since from (6) $r(0) = E\{|x(n)|^2\}$ measures the (average) signal power, the name PSD for $P(\omega)$ is fully justified, as from
(8) it represents the distribution of the (average) signal power over frequencies. The second definition of PSD is given by
$$P(\omega) = \lim_{N\to\infty} E\left\{\frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\, e^{-jn\omega}\right|^2\right\} \qquad \text{(Definition 2 of Power Spectral Density)}. \tag{9}$$
1.1 Properties of Power Spectral Density (PSD)
Approximation in the definition of PSD.
Show analytically and through simulations that the definition of PSD in (7) is equivalent to that in (9) under the mild [5]
assumption that the covariance sequence r(k) decays rapidly, that is,
$$\lim_{N\to\infty} \frac{1}{N}\sum_{k=-(N-1)}^{N-1} |k|\,|r(k)| = 0. \tag{10}$$
Provide a simulation for the case when this equivalence does not hold. Explain the reasons.
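One possible way to set up the simulation is sketched below. It is only an illustration under stated assumptions: the AR(1) test process with coefficient 0.5 and unit-variance driving noise, the record length, the number of realisations and all variable names are choices made here, not part of the coursework. For this process r(k) decays geometrically, so (10) holds and Definition 1 gives P(ω) = 1/|1 − 0.5e^{−jω}|². Definition 2 is approximated by averaging (1/N)|Σ x(n)e^{−jnω}|² over M independent realisations.

```matlab
rng(0);                                   % reproducible sketch
N = 512;  M = 500;  nfft = 512;  a = 0.5; % assumed values, chosen for illustration
w = 2*pi*(0:nfft-1)/nfft;                 % frequency grid in rad/sample

% Definition 1: DTFT of the ACF of the AR(1) process, in closed form
Ptrue = 1 ./ abs(1 - a*exp(-1j*w)).^2;

% Definition 2: expectation of (1/N)|DTFT of N samples|^2, approximated
% by an ensemble average over M independent realisations
Pavg = zeros(1, nfft);
for m = 1:M
    x = filter(1, [1 -a], randn(1, N + 200));
    x = x(201:end);                       % discard the filter transient
    Pavg = Pavg + abs(fft(x, nfft)).^2 / N;
end
Pavg = Pavg / M;

relErr = norm(Pavg - Ptrue) / norm(Ptrue);  % relative mismatch between the definitions
plot(w/pi, 10*log10(Ptrue), w/pi, 10*log10(Pavg));
xlabel('Frequency [\pi radians]'); ylabel('PSD (dB)');
legend('Definition 1', 'Definition 2 (ensemble average)');
```

A signal whose ACF does not decay, for example a sinusoid with random phase, violates (10) and is a natural candidate for the counter-example.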
1.2 Periodogram-based Methods Applied to Real–World Data
Now consider two real–world datasets: a) the sunspot time series (see footnote 1)
and b) an electroencephalogram (EEG) experiment.
a) Apply one periodogram-based spectral estimation technique (possibly after some preprocessing) to the sunspot time [10]
series. Explain what aspect of the spectral estimate changes when the mean and trend are removed from the data
(use the MATLAB commands mean and detrend). Explain how the perception of the periodicities in the data
changes when the data are transformed by first applying the logarithm to each data sample and then subtracting the
sample mean from the logarithmic data.
The basis for brain computer interface (BCI).
b) The electroencephalogram (EEG) signal was recorded from an electrode located at the posterior/occipital (POz) [10]
region of the head. The subject observed a flashing visual stimulus (flashing at a fixed rate of X Hz, where X is
some integer value in the range [11, . . . , 20]). This induced a response in the EEG, known as the steady state visual
evoked potential (SSVEP), at the same frequency. Spectral analysis is required to determine the value of ‘X’. The
recording is contained in the EEG_Data_Assignment1.mat file, which contains the following elements:
◦ POz – Vector containing the EEG samples (expressed in Volts) obtained from the POz location on the scalp,
◦ fs – Scalar denoting the sampling frequency (1200 Hz in this case).
Read the readme_Assignment1.txt file for more information.
Apply the standard periodogram approach to the entire recording, as well as the averaged periodogram with different window lengths (10 s, 5 s, 1 s), to the EEG data. Can you identify the peaks in the spectrum corresponding
to SSVEP? There should be a peak at the same frequency as the frequency of the flashing stimulus (integer X in
the range [11, . . . , 20]), known as the fundamental frequency response peak, and at some integer multiples of this
value, known as the harmonics of the response. It is important to note that the subject was tired during the recording,
which induced a strong response within 8-10 Hz (the so-called alpha-rhythm); this is not the SSVEP. Also note that
power-line interference was induced in the recording apparatus at 50 Hz, and this too is not the SSVEP. To enable
a fair comparison across all spectral analysis approaches, you should keep the number of frequency bins the same.
Hint: It is recommended to have 5 DFT samples per Hz.
How does the standard periodogram approach compare with the averaged periodogram of window length 10 s?
Hint: Observe how straightforward it is to distinguish the estimated SSVEP
peaks from other spurious EEG activity in the surrounding spectrum.
In the case of averaged periodogram, what is the effect of making the window size very small, e.g. 1 s?
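The averaged periodogram with a fixed number of frequency bins can be sketched as follows. This is a minimal illustration on synthetic data, not the EEG recording: the 13 Hz component, the amplitudes, the 60 s duration and the variable names are assumptions made here. It uses pwelch from the Signal Processing Toolbox with a rectangular window and no overlap, and nfft = 5*fs so that there are 5 DFT samples per Hz for every window length.

```matlab
rng(1);                                   % reproducible sketch
fs = 1200;                                % sampling frequency from the text (Hz)
T  = 60;                                  % assumed duration of the synthetic record (s)
t  = (0:T*fs-1)/fs;
f0 = 13;                                  % hypothetical stimulus frequency (Hz)
x  = 1e-6*sin(2*pi*f0*t) + 5e-6*randn(size(t));   % synthetic stand-in for POz

nfft = 5*fs;                              % 5 DFT samples per Hz (0.2 Hz bin spacing)

% Averaged periodograms: rectangular window, no overlap, same nfft for both
[P5, f] = pwelch(x, rectwin(5*fs), 0, nfft, fs);  % 5 s windows
[P1, ~] = pwelch(x, rectwin(1*fs), 0, nfft, fs);  % 1 s windows, zero-padded to nfft

% Locate the strongest peak within the candidate SSVEP band
band = (f >= 11) & (f <= 20);
fb = f(band);
[~, iPeak] = max(P5(band));
fPeak = fb(iPeak);

plot(f, 10*log10(P5), f, 10*log10(P1)); xlim([0 60]);
xlabel('Frequency (Hz)'); ylabel('PSD (dB)');
legend('5 s windows', '1 s windows');
```

Shorter windows give more segments to average, which lowers the variance of the estimate, at the cost of a wider main lobe around each spectral peak.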
Footnote 1: Included in MATLAB; use load sunspot.dat
1.3 Correlation Estimation
Unbiased correlation estimation and preservation of non-negative spectra. Recall that the correlation-based definition
of the PSD leads to the so-called correlogram spectral estimator given by
$$\hat{P}(\omega) = \sum_{k=-(N-1)}^{N-1} \hat{r}(k)\, e^{-j\omega k} \tag{11}$$
where the estimated autocorrelation function $\hat{r}(k)$ can be computed using the biased or unbiased estimators given by
$$\text{Biased:} \quad \hat{r}(k) = \frac{1}{N}\sum_{n=k+1}^{N} x(n)\, x^*(n-k) \tag{12}$$
$$\text{Unbiased:} \quad \hat{r}(k) = \frac{1}{N-k}\sum_{n=k+1}^{N} x(n)\, x^*(n-k) \qquad (0 \le k \le N-1). \tag{13}$$
Although it may seem that the unbiased estimate is more appropriate, as its mean matches the true ACF,
observe that this estimate (despite being unbiased) can be highly erratic for larger lags k (close to N), where fewer samples
are available to estimate the ACF. As a consequence, the estimated ACF may not be positive definite, resulting in negative PSD
values.
a) Write a MATLAB script which calculates both biased and unbiased ACF estimates of a signal and then use these [10]
ACF estimates to compute the corresponding correlogram in Eq. (11). Validate your code for different signals,
e.g. WGN, noisy sinusoidal signals and filtered WGN. Explain how the spectral estimates based on (12)-(13) differ
from one another. In particular, how does the correlogram corresponding to the unbiased ACF estimates behave
for large lags (i.e. k close to N)? Does the unbiased ACF estimate result in negative values for the estimated PSD?
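A possible starting point for the correlogram is sketched below, assuming a real-valued WGN test signal of length 256 (an arbitrary choice). It uses xcorr from the Signal Processing Toolbox for the two ACF estimates and evaluates (11) on a grid of 2N − 1 frequencies with fft; ifftshift moves lag 0 to the first sample so that the transform of the symmetric ACF is real. Variable names are illustrative.

```matlab
rng(2);                                   % reproducible sketch
N = 256;                                  % assumed record length
x = randn(1, N);                          % WGN test signal

% ACF estimates over lags -(N-1):(N-1)
[rb, lags] = xcorr(x, 'biased');          % Eq. (12), extended to negative lags
ru         = xcorr(x, 'unbiased');        % Eq. (13), extended to negative lags

% Correlogram, Eq. (11): DFT of the ACF with lag 0 placed first
Pb = real(fft(ifftshift(rb)));
Pu = real(fft(ifftshift(ru)));

f = (0:2*N-2)/(2*N-1);                    % normalised frequency (cycles/sample)
subplot(2,1,1); plot(lags, rb, lags, ru); xlabel('Lag k'); legend('biased', 'unbiased');
subplot(2,1,2); plot(f, Pb, f, Pu); xlabel('Normalised frequency'); legend('biased', 'unbiased');

minBiased   = min(Pb);                    % non-negative up to rounding error
minUnbiased = min(Pu);                    % inspect the sign of this value
```

With the biased estimate, the correlogram on this grid coincides with the periodogram |X(ω)|²/N, which is why it cannot become negative.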
Plotting the PSD in dB. Depending on the estimation approach, the spectral estimate $\hat{P}(\omega)$ can be asymptotically
unbiased with variance $\mu P^2(\omega)$, where $\mu > 0$ is a constant. When several realisations of a random signal are available,
it is possible to present the estimated PSD as a confidence interval defined by $\hat{P}(\omega) \pm \mu\hat{\sigma}_P(\omega)$, where $\hat{P}(\omega)$ and $\hat{\sigma}_P(\omega)$
are respectively the mean and standard deviation of the estimated PSDs of the available observations. A drawback of
this approach is that, as stated earlier, the standard deviation is proportional to the value of the PSD and therefore the
confidence interval widens in zones where the PSD increases, and it is these parts that we are particularly interested in.
Fig. 1 shows an overlay plot of 100 realisations of the PSD of two sinusoids immersed in i.i.d. WGN showing the mean
(top), and the standard deviation of the set (bottom).
For ease of presentation, the PSD estimates can be plotted in decibels; the realisations then appear more condensed due to
the contraction property of the logarithm.
b) Use your code from the previous section (only the biased ACF estimator) to generate the PSD estimate of several [5]
realisations of a random process and plot them as in Fig. 1. Generate different signals composed of sinusoids
corrupted by noise and elaborate on how dispersed the different realisations of the spectral estimate are. Hint: use the
fft and fftshift commands in MATLAB.
c) Plot your estimates in dB, together with their associated standard deviation (again as in Fig. 1 for comparison). [5]
How spread out are the estimates now? Comment on the benefits of this representation.
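The ensemble plots for b) and c) could be organised along the following lines. This is a sketch under assumptions: two real sinusoids at normalised frequencies 0.1 and 0.27 in unit-variance WGN, 100 realisations of 128 samples, and illustrative variable names. It relies on the fact that, on a DFT grid, the correlogram built from the biased ACF equals the periodogram |X(ω)|²/N.

```matlab
rng(3);                                   % reproducible sketch
N = 128;  M = 100;  nfft = 256;           % assumed sizes
n = 0:N-1;
P = zeros(M, nfft);
for m = 1:M
    x = sin(2*pi*0.1*n) + sin(2*pi*0.27*n) + randn(1, N);
    % Biased-ACF correlogram evaluated on the DFT grid (equals the periodogram)
    P(m, :) = fftshift(abs(fft(x, nfft)).^2 / N);
end
f = (-nfft/2:nfft/2-1)/nfft;              % normalised frequency axis after fftshift

PdB  = 10*log10(P);
sLin = std(P,   0, 1);                    % spread on a linear scale
sdB  = std(PdB, 0, 1);                    % spread on a dB scale

subplot(2,2,1); plot(f, P.', 'c'); hold on; plot(f, mean(P, 1), 'b');
title('PSD estimates (linear)');
subplot(2,2,3); plot(f, sLin); title('Standard deviation (linear)');
subplot(2,2,2); plot(f, PdB.', 'c'); hold on; plot(f, mean(PdB, 1), 'b');
title('PSD estimates (dB)');
subplot(2,2,4); plot(f, sdB); title('Standard deviation (dB)');
```

On the linear scale the spread follows the level of the PSD, whereas on the dB scale a multiplicative spread becomes an additive offset that no longer grows at the spectral peaks.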
Frequency estimation by MUSIC. In order to accurately estimate the spectrum of closely-spaced sine waves using
the periodogram, a large number of samples N is required since the frequency resolution of the periodogram is proportional to 1/N. On the other hand, subspace methods assume a harmonic model consisting of a sum of sine waves, possibly
complex, in additive noise. In this setting, the noise is also complex-valued.
For illustration, consider a complex-valued signal of 31 samples in length (n = 0, . . . , 30), generated using the following code:
n = 0:30;
noise = 0.2/sqrt(2)*(randn(size(n))+1j*randn(size(n)));
x = exp(1j*2*pi*0.3*n)+exp(1j*2*pi*0.32*n)+ noise;
The signal consists of two complex exponentials (sine waves) with frequencies of 0.3 Hz and 0.32 Hz and additive
complex white Gaussian noise. The noise has zero mean and a standard deviation of 0.2 (variance 0.04).
The spectral estimate using the periodogram (rectangular window, 128 frequency bins and unit sampling rate) is shown
in Fig. 2. Observe that the periodogram was not able to identify the two lines in the spectrum; this is because the resolution
of the periodogram is proportional to 1/N, which here is greater than the separation between the two frequencies.
[Figure 1. Top panel: "PSD estimates (different realisations and mean)". Bottom panel: "Standard deviation of the PSD estimate". Horizontal axes: Frequency [π radians].]
Figure 1: PSD estimates of two sinusoids immersed in noise. Top: An overlay plot of 100 realisations and their mean.
Bottom: Standard deviation of the 100 estimates.
[Figure 2. Plot title: "Periodogram Power Spectral Density Estimate". Horizontal axis: Frequency (mHz). Vertical axis: Power/frequency (dB/Hz).]
Figure 2: Periodogram of two complex exponentials with closely-spaced frequencies.
d) Familiarise yourself with the generation of complex exponential signals, and generate signals of different frequencies [5]
and length. Verify that by considering more data samples the periodogram starts showing the correct line
spectra.
e) Use the following code to find the desired line spectra using the MUSIC method.
[X,R] = corrmtx(x,14,'mod');
[S,F] = pmusic(R,2,[],1,'corr');
plot(F,S,'linewidth',2); set(gca,'xlim',[0.25 0.40]);
grid on; xlabel('Hz'); ylabel('Pseudospectrum');
Explain the operation of the first three lines in the code using the MATLAB documentation and the lecture notes. [10]
What is the meaning of the input arguments for the functions corrmtx and pmusic? Does the spectrum estimated
using the MUSIC algorithm provide more detailed information? State briefly the advantages and disadvantages of
the periodogram and the MUSIC algorithms and comment on the bias and variance. How accurate would a general
spectrum estimate be when using MUSIC?
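The effect of the record length on the periodogram in d) can be checked with a sketch such as the one below. It reuses the signal model from the text; the longer record length of 256 samples, the 5000-point frequency grid and the "dip" measure (power at the midpoint 0.31 relative to the smaller of the powers at 0.30 and 0.32) are assumptions introduced here for illustration.

```matlab
rng(4);                                   % reproducible sketch
f1 = 0.3;  f2 = 0.32;                     % frequencies from the text
nfft = 5000;                              % grid on which 0.30, 0.31 and 0.32 fall exactly
idx = round([f1, (f1 + f2)/2, f2]*nfft) + 1;   % bins of f1, the midpoint and f2
Ns  = [31 256];                           % short record and an assumed longer record
dip = zeros(size(Ns));
f   = (0:nfft-1)/nfft;

for i = 1:numel(Ns)
    n = 0:Ns(i)-1;
    noise = 0.2/sqrt(2)*(randn(size(n)) + 1j*randn(size(n)));
    x = exp(1j*2*pi*f1*n) + exp(1j*2*pi*f2*n) + noise;
    P = abs(fft(x, nfft)).^2 / Ns(i);     % periodogram, rectangular window
    dip(i) = P(idx(2)) / min(P(idx([1 3])));   % midpoint power relative to the peaks
    subplot(numel(Ns), 1, i); plot(f, 10*log10(P)); xlim([0.25 0.40]);
    xlabel('Hz'); ylabel('dB'); title(sprintf('N = %d', Ns(i)));
end
```

A ratio well below 1 indicates two separate peaks with a clear valley between them; a ratio near or above 1 indicates that the two main lobes have merged.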
1.4 Spectrum of Autoregressive Processes
In many spectrum estimation applications, only short data lengths are available; thus, classical spectrum estimation techniques based on the Fourier transform will not be able to resolve frequency elements spaced close to one another. In order
to solve this problem, we can use modern spectrum estimation methods based on the pole-zero modelling of the data.
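Consider a general ARMA(p, q) process given by
$$y(n) = a_1 y(n-1) + \cdots + a_p y(n-p) + w(n) + b_1 w(n-1) + \cdots + b_q w(n-q)$$
The power spectrum of y has the form
$$P_y(e^{j\omega}) = \sigma_w^2\, \frac{\left|1 + \sum_{k=1}^{q} b_k e^{-jk\omega}\right|^2}{\left|1 - \sum_{k=1}^{p} a_k e^{-jk\omega}\right|^2}$$
where $\sigma_w^2$ is the variance of the driving white noise w(n).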
Thus, the power spectrum can be estimated through the parameters (a1, ..., ap, b1, .., bq). The assumption of an underlying model for the data is the key difference between classical and modern spectrum estimation methods.
For an AR process in particular, the power spectrum is the output of an all-pole filter given by
$$P_y(e^{j\omega}) = \frac{\sigma_w^2}{\left|1 - \sum_{k=1}^{p} a_k e^{-jk\omega}\right|^2}$$
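The parameters $\sigma_w^2$ and $\mathbf{a} = [a_1, \ldots, a_p]^T$ can be estimated from a set of (p + 1) linear equations
$$\begin{bmatrix}
r_x(0) & r_x(1) & \cdots & r_x(p) \\
r_x(1) & r_x(0) & \cdots & r_x(p-1) \\
\vdots & \vdots & \ddots & \vdots \\
r_x(p) & r_x(p-1) & \cdots & r_x(0)
\end{bmatrix}
\begin{bmatrix} 1 \\ -a_1 \\ \vdots \\ -a_p \end{bmatrix}
= \sigma_w^2 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
(the minus signs follow from the sign convention of the AR model above, in which the denominator is $1 - \sum_{k=1}^{p} a_k e^{-jk\omega}$),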
where $r_x(k)$ could be calculated using the biased autocorrelation estimate
$$r_x(k) = \frac{1}{N}\sum_{n=0}^{N-1-k} x(n+k)\, x(n)$$
or the unbiased autocorrelation estimate
$$r_x(k) = \frac{1}{N-k}\sum_{n=0}^{N-1-k} x(n+k)\, x(n)$$
a) Based on your answers in Section 1.3, elaborate on the shortcomings of using the unbiased ACF estimate when [5]
finding the AR parameters. [see Eq. (13)]
b) Generate 1000 samples of data in MATLAB, according to the following equation [10]
x(n) = 2.76x(n − 1) − 3.81x(n − 2) + 2.65x(n − 3) − 0.92x(n − 4) + w(n)
where w ∼ N (0, 1) and discard the first 500 samples (x=x(501:end)) to remove the transient output of the filter.
Estimate the power spectral density of the signal using model orders p = 2, ..., 14 and comment on the effects
of increasing the order of the (assumed) underlying model by comparing the estimate to the true Power Spectral
Density. Only plot the results of the model orders which produced the best results.
c) Repeat the experiment in b) for a data length of 10,000 samples. What happens to the PSD when the chosen model [5]
order is lower (under-modelling) or higher (over-modelling) than the correct AR(4) model order?
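The AR spectrum estimate for a single model order could be computed as in the sketch below. It assumes the Signal Processing Toolbox functions aryule and freqz, generates 500 extra samples so that 10,000 remain after the transient is discarded, and fixes the model order at p = 4; the variable names and the frequency grid size are illustrative choices. Note that aryule returns the denominator polynomial, so its output is expected to approximate [1, −a1, ..., −ap] in the notation of this section.

```matlab
rng(5);                                   % reproducible sketch
a = [2.76 -3.81 2.65 -0.92];              % AR(4) coefficients from the text
N = 10000;                                % samples kept after discarding the transient

x = filter(1, [1 -a], randn(N + 500, 1)); % w ~ N(0,1) driving the all-pole filter
x = x(501:end);                           % discard the first 500 samples

nfft = 1024;
[Htrue, w] = freqz(1, [1 -a], nfft);      % frequency response on [0, pi)
Ptrue = abs(Htrue).^2;                    % true PSD (unit noise variance)

p = 4;                                    % assumed model order for this sketch
[ahat, ehat] = aryule(x, p);              % Yule-Walker estimate: ahat = [1, ...], ehat = noise variance
Pest = ehat * abs(freqz(1, ahat, nfft)).^2;

plot(w/pi, 10*log10(Ptrue), w/pi, 10*log10(Pest));
xlabel('Frequency [\pi radians]'); ylabel('PSD (dB)');
legend('True PSD', sprintf('AR(%d) estimate', p));
```

Wrapping the last four lines in a loop over p = 2, ..., 14 gives the comparison across model orders asked for in b) and c).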
1.5 Real World Signals: Respiratory Sinus Arrhythmia from RR-Intervals
Important change to Section 1.5 of the coursework:
• If you took Adaptive Signal Processing last year, then you already have your own ECG data recorded from the
wrists; please proceed with this assignment.
• If you do not have your own ECG recordings, then we will provide the data. We will send an email to the class
with the data and explanations.
Respiratory sinus arrhythmia (RSA) refers to the modulation of cardiac function by respiratory effort. This can be readily
observed by the speeding up of heart rate during inspiration (“breathing in”) and the slowing down of heart rate during
expiration (“breathing out”). The strength of RSA in an individual can be used to assess cardiovascular health. Breathing
at regular rates will highlight the presence of RSA in the cardiac (ECG) data.
a) Apply the standard periodogram as well as the averaged periodogram with different window lengths (e.g. 50 s, 150 s) [10]
to obtain the power spectral density of the RRI data. Plot the PSDs of the RRI data obtained from the three trials
separately.
b) Explain the differences between the PSD estimates of the RRI data from the three trials. Can you identify the peaks [5]
in the spectrum corresponding to frequencies of respiration for the three experiments?
c) Plot the AR spectrum estimate for the RRI signals for the three trials (see footnote 3). To find the optimal AR model order, [10]
experiment with your model order until you observe a peak in the spectrum (approximately) corresponding to
the theoretical respiration rate. List the differences you observe between your estimated AR spectrum and the
periodogram estimate in Part a).
1.6 Robust Regression
Load the file PCAPCR.mat, which includes the data matrices $X \in \mathbb{R}^{N \times d_x}$ and $Y \in \mathbb{R}^{N \times d_y}$, described below.
Training Data | Testing Data | Note
X | - | Input variables, some of which are collinear. Each column represents N measurements of an input variable.
Xnoise | Xtest | Noise-corrupted input matrix Xnoise = X + N_X, where the elements of N_X were drawn from a zero-mean Gaussian distribution.
Y | Ytest | Output variables obtained from Y = XB + N_Y, where the coefficient matrix B is unknown. Each column in Y represents N measurements of an output variable. The elements of N_Y were drawn from a zero-mean Gaussian distribution.
Figure 3: Principle of PCA: Illustration of the signal and noise subspaces.
Using the MATLAB command svd, obtain the singular value decomposition of the matrices X and Xnoise.
a) Plot the singular values of X and Xnoise (hint: use the stem command), and identify the rank of the input data [5]
X. Plot the squared error between each singular value of X and Xnoise. Explain the effect of noise on the singular
values, and state at what point it would become hard to identify the rank of the matrix Xnoise.
b) Using only the r most significant principal components (as determined by the identified rank), create a low-rank [5]
approximation of Xnoise, denoted by $\tilde{X}_{\text{noise}}$. Compare the difference (error) between the variables (columns) of the
noiseless input matrix, X, and those in the noise-corrupted matrix Xnoise and the denoised matrix $\tilde{X}_{\text{noise}}$.
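The low-rank approximation in b) can be sketched on synthetic data as follows. The matrices below are stand-ins generated here, not the contents of PCAPCR.mat: a rank-3 matrix with 10 collinear columns, noise of standard deviation 0.1, and illustrative variable names are all assumptions.

```matlab
rng(6);                                   % reproducible sketch
N = 200;  d = 10;  r = 3;                 % assumed sizes and rank
X = randn(N, r) * randn(r, d);            % rank-r input with collinear columns
Xnoise = X + 0.1*randn(N, d);             % noise-corrupted input

sX = svd(X);                              % singular values of the clean matrix
[U, S, V] = svd(Xnoise, 'econ');          % economy-size SVD of the noisy matrix
sN = diag(S);

% Rank-r approximation: keep the r largest singular values and vectors
Xden = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';

errNoisy = norm(X - Xnoise, 'fro')^2;     % squared error before denoising
errDen   = norm(X - Xden,   'fro')^2;     % squared error after denoising

subplot(2,1,1); stem(sX); hold on; stem(sN, 'r'); legend('X', 'Xnoise');
title('Singular values');
subplot(2,1,2); stem((sX - sN).^2); title('Squared error between singular values');
```

Per-column errors, as asked for in b), follow from sum((X - Xden).^2, 1) and sum((X - Xnoise).^2, 1).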
The output data are obtained as Y = XB + N_Y. The ordinary least squares (OLS) estimate for the unknown
regression matrix, B, is then given by
$$\hat{B}_{\text{OLS}} = (X^T X)^{-1} X^T Y \tag{14}$$
Since the matrix $X^T X$, which is calculated from the original data X, is rank-deficient, it is not invertible and the OLS solution in (14)
cannot be computed. On the other hand, for the noisy data Xnoise, the term $X_{\text{noise}}^T X_{\text{noise}}$ is full-rank and therefore admits the OLS
solution; however, this may introduce spurious correlations in the calculation of the regression coefficients.
[Figure 3 diagram: X written as a sum of rank-one terms t_i p_i^T, with terms 1 to r labelled "Signal Subspace" and terms r+1 to m labelled "Noise Subspace".]
Footnote 3: Use the MATLAB function aryule to estimate the AR coefficients for your RRI signal.
To circumvent this issue in the estimation of B, the principal component regression (PCR) method first applies principal component analysis (PCA) on the input matrix Xnoise. Specifically, the SVD of Xnoise is given by $X_{\text{noise}} = U \Sigma V^T$.
By retaining the r largest principal components (r singular values and the associated singular vectors), the PCR solution
is given by
$$\hat{B}_{\text{PCR}} = V_{1:r} (\Sigma_{1:r})^{-1} U_{1:r}^T Y$$
where the subscript (1 : r) denotes the r largest singular values and the corresponding singular vectors. In this way, the
PCR solution avoids both the problem of collinearity and noise in the input matrix. Figure 4 illustrates the difference
between the OLS and PCR methods.
[Figure 4 diagram; labels: X, X_noise, X̃_noise, B̂_OLS, B̂_PCR, Ŷ_OLS, Ŷ_PCR.]
Figure 4: Comparing OLS and PCR solutions.
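The two regression solutions can be computed as in the following sketch, again on synthetic stand-in data rather than PCAPCR.mat. The sizes, the rank r = 3, the noise level and the variable names are assumptions made for illustration; the OLS line implements (14) on Xnoise and the PCR line implements the truncated-SVD expression above.

```matlab
rng(7);                                   % reproducible sketch
N = 200;  dx = 10;  dy = 4;  r = 3;       % assumed sizes and rank
X = randn(N, r) * randn(r, dx);           % rank-deficient clean input
B = randn(dx, dy);                        % "unknown" coefficients, known only in this sketch
Y = X*B + 0.1*randn(N, dy);               % outputs with additive noise
Xnoise = X + 0.1*randn(N, dx);            % noisy (full-rank) input

% OLS on the noisy input, Eq. (14); backslash avoids forming an explicit inverse
Bols = (Xnoise'*Xnoise) \ (Xnoise'*Y);

% PCR: retain the r largest singular values and vectors of Xnoise
[U, S, V] = svd(Xnoise, 'econ');
Ur = U(:, 1:r);  Sr = S(1:r, 1:r);  Vr = V(:, 1:r);
Bpcr = Vr * (Sr \ (Ur'*Y));
Xden = Ur * Sr * Vr';                     % denoised input matrix

% Training-set errors of the two fits
errOls = norm(Y - Xnoise*Bols, 'fro')^2;
errPcr = norm(Y - Xden*Bpcr,   'fro')^2;
```

On the training set OLS cannot do worse than PCR, since it projects Y onto the full column space of Xnoise, so the comparison becomes informative only on data that were not used to compute the coefficients.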
c) Calculate the OLS and PCR solutions for the parameter matrix B, which relates Xnoise and Y. Next, compare the [5]
estimation error between Y and $\hat{Y}_{\text{OLS}} = X_{\text{noise}} \hat{B}_{\text{OLS}}$ and $\hat{Y}_{\text{PCR}} = \tilde{X}_{\text{noise}} \hat{B}_{\text{PCR}}$. Explain what happens when you
estimate the data from the test set using the regression coefficients computed from the training set, and quantify the
performance by comparing $Y_{\text{test}}$ and $\hat{Y}_{\text{test-OLS}} = X_{\text{test}} \hat{B}_{\text{OLS}}$ with $\hat{Y}_{\text{test-PCR}} = \tilde{X}_{\text{test}} \hat{B}_{\text{PCR}}$.
In real-world machine intelligence applications, a model is trained with a finite set of data which is referred to as the
training set. After training, the model is expected not only to be a good fit to the training data, but also
to model out-of-sample data. Any model which fits the training data well but has poor out-of-sample performance is
said to be “over-fitted”. Therefore, it is important to validate the regression model computed in this section on a test set,
which is another realisation of the signal drawn from the statistical distribution of the training set. For this task, the file
PCAPCR.mat contains both the training data and test data, which should be used to validate the effectiveness of the
regression model derived from the OLS and PCR solutions.
d) The best way to assess the effectiveness of the PCR compared to the OLS solution is by testing the estimated [5]
regression coefficients, $\hat{B}$, over an ensemble of test data. The file PCR.zip contains the script regval, which takes
the regression coefficients as its input and returns a new realisation of the test data, $Y$, and its estimate, $\hat{Y}$.
The function syntax is:
$[\hat{Y}, Y] = \texttt{regval}(\hat{B})$.
Using the same PCR and OLS regression coefficients as in (c), compute and compare the mean square error estimates for the PCR and OLS schemes, $\text{MSE} = E\{\|Y - \hat{Y}\|_2^2\}$, based on the realisations of $Y$ and $\hat{Y}$ provided by
the function regval. Comment on the effectiveness of these schemes.