Description
1. For f = [f1, f2], g = [g1, g2] : [0, 1] → R², with fj, gj ∈ L²[0, 1], define an inner product
   ⟨f, g⟩ = Σ_{j=1}^{2} ∫₀¹ fj(t) gj(t) dt.
This inner product allows us to define an FPCA for a bivariate stochastic process X(t) =
[X1(t), X2(t)], t ∈ [0, 1], as described in Section 8.5 of Ramsay and Silverman (2005)
(attached). Implement this FPCA in code, and apply it to analyze the gait cycle data
fda::gait.
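For part 1, a minimal sketch of the bivariate FPCA of Section 8.5 in Python (the gait data live in R's fda package, so the interface here is an assumption: the two coordinate curves are assumed sampled on a common uniform grid, and all function names are mine):

```python
import numpy as np

def bivariate_fpca(X1, X2, t):
    """Bivariate FPCA by concatenation, as in Ramsay & Silverman, Sec. 8.5.

    X1, X2 : (n, m) arrays holding the two coordinate curves on the grid t.
    Returns eigenvalues and the two blocks of each eigenfunction.
    """
    n, m = X1.shape
    w = (t[-1] - t[0]) / (m - 1)            # quadrature weight, uniform grid
    Z = np.hstack([X1, X2])                 # string the two functions together
    Zc = Z - Z.mean(axis=0)                 # subtract the mean function
    G = Zc.T @ Zc / (n - 1) * w             # discretized covariance operator
    lam, vecs = np.linalg.eigh(G)
    order = np.argsort(lam)[::-1]           # decreasing eigenvalues
    lam, vecs = lam[order], vecs[:, order]
    phi = vecs / np.sqrt(w)                 # normalize so sum_j w*phi_j^2 = 1
    return lam, phi[:m, :], phi[m:, :]      # split back into the two parts

# toy usage on simulated "gait-like" curves
t = np.linspace(0, 1, 50)
rng = np.random.default_rng(0)
a, b = rng.normal(size=(30, 1)), rng.normal(size=(30, 1))
X1 = 30 + 10 * a * np.sin(2 * np.pi * t)    # "hip" curves
X2 = 40 + 10 * b * np.cos(2 * np.pi * t)    # "knee" curves
lam, phiH, phiK = bivariate_fpca(X1, X2, t)
```

For the real data one would export the hip and knee curves of fda::gait from R to arrays and pass them in the same way; since the toy curves above have rank-2 variation, only the first two eigenvalues are materially nonzero.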
2. Simulate a sample of n = 20 realizations from (a) a Gaussian process, and (b) a non-Gaussian process. Display the simulated processes.
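For instance, both samples can be generated from a truncated Karhunen–Loève expansion, with Gaussian scores for (a) and centered exponential scores for (b); the Fourier basis and the score distributions below are my own choices, not prescribed by the problem:

```python
import numpy as np

def simulate_kl(n, t, lambdas, rng, gaussian=True):
    """Simulate n paths X_i(t) = sum_k xi_ik * phi_k(t), Fourier basis phi_k.

    The scores xi_ik have mean 0 and variance lambdas[k]; they are Gaussian
    for case (a) and centered exponential (skewed, non-Gaussian) for (b).
    """
    K = len(lambdas)
    phi = np.stack([np.sqrt(2) * np.sin((k + 1) * np.pi * t) for k in range(K)])
    if gaussian:
        xi = rng.normal(size=(n, K)) * np.sqrt(lambdas)
    else:
        xi = (rng.exponential(size=(n, K)) - 1.0) * np.sqrt(lambdas)
    return xi @ phi                          # (n, len(t)) array of sample paths

t = np.linspace(0, 1, 101)
rng = np.random.default_rng(1)
lambdas = np.array([1.0, 0.5, 0.25])
X_gauss = simulate_kl(20, t, lambdas, rng, gaussian=True)
X_nongauss = simulate_kl(20, t, lambdas, rng, gaussian=False)
```

The curves can then be displayed with matplotlib, e.g. `plt.plot(t, X_gauss.T)` in one panel and `plt.plot(t, X_nongauss.T)` in another.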
3. Implement an FPCA for the yeast data (available under the Misc section of the Class
Notebook). Describe the first few modes of variation, and discuss whether the variation
in the dataset is properly summarized by the first few eigenfunctions using FPCA.
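One way to quantify "properly summarized" in part 3 is the cumulative fraction of variance explained by the first k eigenvalues; a small helper (the eigenvalues below are hypothetical, not from the yeast data):

```python
import numpy as np

def variance_explained(eigenvalues, k):
    """Cumulative fraction of total variance captured by the first k components."""
    lam = np.asarray(eigenvalues, dtype=float)
    return lam[:k].sum() / lam.sum()

lam = np.array([5.0, 2.0, 1.0, 0.5, 0.3, 0.2])   # hypothetical FPCA eigenvalues
fve = variance_explained(lam, 2)                 # share held by first two PCs
```

A sharp drop in the scree of eigenvalues, together with a high fraction here, suggests that the first few eigenfunctions summarize the variation well.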
4. Let X1, . . . , Xn be independent realizations of an L² stochastic process X on T = [0, 1]. Let (λ̂k, φ̂k) be the kth eigenvalue–eigenfunction pair of the sample covariance function Ĝ, and let ξ̂ik denote the corresponding principal component scores. Let Zk = n⁻¹ Σ_{i=1}^{n} ξ̂ik. Show that
   (a) Zk = 0 for k = 1, . . . , n − 1;
   (b) (n − 1)⁻¹ Σ_{i=1}^{n} (ξ̂ik − Zk)² = λ̂k;
   (c) (n − 1)⁻¹ Σ_{i=1}^{n} (ξ̂ik − Zk)(ξ̂ik′ − Zk′) = 0 if k ≠ k′.
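The identities (a)–(c) can be sanity-checked numerically on a grid before proving them; the discretization and all names below are my own (an illustration, not a proof):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 15, 200
t = np.linspace(0, 1, m)
w = 1.0 / (m - 1)                                  # quadrature weight
basis = np.stack([np.sin(np.pi * t), np.sin(2 * np.pi * t), np.sin(3 * np.pi * t)])
X = rng.normal(size=(n, 3)) @ basis                # rank-3 smooth sample paths
Xc = X - X.mean(axis=0)                            # center at the sample mean
G = Xc.T @ Xc / (n - 1) * w                        # discretized Ghat
lam, v = np.linalg.eigh(G)
lam, v = lam[::-1], v[:, ::-1] / np.sqrt(w)        # decreasing order, L2-normalized
xi = Xc @ v[:, :3] * w                             # scores xihat_ik = <Xi - Xbar, phihat_k>

Zbar = xi.mean(axis=0)                             # (a): should be ~0
svar = ((xi - Zbar) ** 2).sum(axis=0) / (n - 1)    # (b): should equal lamhat_k
cross = ((xi[:, 0] - Zbar[0]) * (xi[:, 1] - Zbar[1])).sum() / (n - 1)  # (c): ~0
```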
5. Let x1, . . . , xn be n linearly independent (nonrandom) functions in a Hilbert space H. Show that Σ_{i=1}^{n} (xi − x̄) ⊗ (xi − x̄) ∈ B(H) has rank n − 1.
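A quick numerical illustration with vectors standing in for the functions (discretization and names are mine; the mean subtraction is what costs one dimension):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 6, 40
X = rng.normal(size=(n, m))        # n generic vectors: linearly independent a.s.
Xc = X - X.mean(axis=0)            # centering introduces one linear relation
T = Xc.T @ Xc                      # matrix form of sum_i (x_i - xbar) outer (x_i - xbar)
rank = np.linalg.matrix_rank(T)    # expected: n - 1
```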
6. (Reproducing Kernel Hilbert Space) Let K : [0, 1] × [0, 1] → R be a continuous function
(Mercer’s kernel). Define
   HK = { f = Σ_{j=1}^{∞} aj λj ej : Σ_{j=1}^{∞} aj² λj < ∞ },
where the (λj, ej) are the eigenvalue–eigenfunction pairs of the covariance operator associated with K. For f = Σ_{j=1}^{∞} aj λj ej and g = Σ_{j=1}^{∞} bj λj ej in HK, define the inner product
   ⟨f, g⟩K = Σ_{j=1}^{∞} aj bj λj.
Then HK is a Hilbert space with this norm.
(a) Show that f(x) = Σ_{j=1}^{∞} aj λj ej(x), where the sum converges absolutely.
(b) Show that K(·, t) ∈ HK for t ∈ [0, 1].
(c) Show that ⟨K(·, t), f⟩K = f(t) for t ∈ [0, 1]. This means the evaluation functional δt : HK → R, δt(f) = f(t), is a continuous linear functional.
(Remark: A Hilbert space of functions in which point evaluation is a continuous linear functional is called a Reproducing Kernel Hilbert Space (RKHS).)
166 8. Principal components analysis for functional data
[Figure 8.7 panels: PC 1 (31.2%), PC 2 (20.9%), PC 3 (17.5%).]
Figure 8.7. The solid curve in each panel is the mean acceleration in height
in cm/year² for girls in the Zurich growth study. Each principal component is
plotted in terms of its effect when added (+) and subtracted (−) from the mean
curve.
marker events. Full details of this process can be found in Ramsay, Bock
and Gasser (1995). The curves are for 112 girls who took part in the Zurich
growth study (Falkner, 1960).
Figure 8.7 shows the first three eigenfunctions or harmonics plotted as
perturbations of the mean function. Essentially, the first principal component reflects a general variation in the amplitude of the variation in
acceleration that is spread across the entire curve, but is particularly
marked during the pubertal growth spurt lasting from 10 to 16 years of
age. The second component indicates variation in the size of acceleration
only from ages 4 to 6, and the third component, of great interest to growth
researchers, shows a variation in intensity of acceleration in the prepubertal
period around ages 5 to 9 years.
8.5 Bivariate and multivariate PCA
We often wish to study the simultaneous variation of more than one function. The hip and knee angles described in Chapter 1 are an example; to
understand the total system, we want to know how hip and knee angles
vary jointly. Similarly, the handwriting data require the study of the simultaneous variation of the X and Y coordinates; there would be little point
in studying one coordinate at a time. In both these cases, the two variables
being considered are measured relative to the same argument, time in both
cases. Furthermore, they are measuring quantities in the same units (degrees in the first case and cm in the second). The discussion in this section
is particularly aimed towards problems of this kind.
(from Ramsay and Silverman (2005), Functional Data Analysis, Second Edition)
8.5.1 Defining multivariate functional PCA
For clarity of exposition, we discuss the extension of the PCA idea to deal
with bivariate functional data in the specific context of the hip and knee
data. Suppose that the observed hip angle curves are Hip1, Hip2,..., Hipn
and the observed knee angles are Knee1, Knee2,..., Kneen. Let Hip̄ and
Kneē be estimates of the mean functions of the Hip and Knee processes.
Define vHH to be the covariance operator of the Hipi, vKK that of the Kneei,
vHK the cross-covariance function, and vKH(t, s) = vHK(s, t).
A typical principal component is now defined by a 2-vector ξ = (ξH, ξK)
of weight functions, with ξH denoting the variation in the Hip curve and ξK
that in the Knee curve. To proceed, we need to define an inner product on
the space of vector functions of this kind. Once this has been defined, the
principal components analysis can be formally set out in exactly the same
way as previously.
The most straightforward definition of an inner product between bivariate functions is simply to sum the inner products of the two components.
Suppose ξ1 and ξ2 are both bivariate functions each with hip and knee
components. We then define the inner product of ξ1 and ξ2 to be
   ⟨ξ1, ξ2⟩ = ∫ ξ1^H ξ2^H + ∫ ξ1^K ξ2^K. (8.20)
The corresponding squared norm ‖ξ‖² of a bivariate function ξ is simply
the sum of the squared norms of the two component functions ξH and ξK.
What all this amounts to, in effect, is stringing two (or more) functions together to form a composite function. We do the same thing with
the data themselves: define Anglesi = (Hipi, Kneei). The weighted linear
combination (8.4) becomes
   fi = ⟨ξ, Anglesi⟩ = ∫ ξ^H Hipi + ∫ ξ^K Kneei. (8.21)
We now proceed exactly as in the univariate case, extracting solutions of
the eigenequation system V ξ = ρξ, which can be written out in full detail
as
   ∫ vHH(s, t) ξ^H(t) dt + ∫ vHK(s, t) ξ^K(t) dt = ρ ξ^H(s)
   ∫ vKH(s, t) ξ^H(t) dt + ∫ vKK(s, t) ξ^K(t) dt = ρ ξ^K(s). (8.22)
In practice, we carry out this calculation by replacing each function Hipi
and Kneei with a vector of values at a fine grid of points or coefficients
in a suitable expansion. For each i these vectors are concatenated into a
single long vector Zi; the covariance matrix of the Zi is a discretized version
of the operator V as defined in (8.7). We carry out a standard principal
components analysis on the vectors Zi, and separate the resulting principal
component vectors into the parts corresponding to Hip and to Knee. The
[Figure 8.8 panels: hip and knee angle (degrees) vs. proportion of gait cycle, for the hip and knee curves of PC 1 and PC 2.]
Figure 8.8. The mean hip and knee angle curves and the effects of adding and
subtracting a multiple of each of the first two vector principal components.
analysis is completed by applying a suitable inverse transform to each of
these parts if necessary.
If the variability in one of the sets of curves is substantially greater
than that in the other, then it is advisable to consider downweighting the
corresponding term in the inner product (8.20), and making the consequent
changes in the remainder of the procedure. In the case of the hip and knee
data, however, both sets of curves have similar amounts of variability and
are measured in the same units (degrees) and so there is no need to modify
the inner product.
8.5.2 Visualizing the results
In the bivariate case, the best way to display the result depends on the
particular context. In some cases it is sufficient to consider the individual
parts ξ^H_m and ξ^K_m separately. An example of this is given in Figure 8.8, which
displays the first two principal components. Because ‖ξ^H_m‖² + ‖ξ^K_m‖² = 1 by
definition, calculating ‖ξ^H_m‖² gives the proportion of the variability in the
mth principal component accounted for by variation in the hip curves.
For the first principal component, this measure indicates that 85% of
the variation is due to the hip curves, and this is borne out by the presentation in Figure 8.8. The effect on the hip curves of the first combined
principal component of variation is virtually identical to the first principal
component curve extracted from the hip curves considered alone. There is
also little associated variation in the knee curves, apart from a small associated increase in the bend of the knee during the part of the cycle where
all the weight is on the observed leg. The main effect of the first principal
component remains an overall shift in the hip angle. This could be caused
by an overall difference in stance; some people stand up more straight than
others and therefore hold their trunks at a different angle from the legs
through the gait cycle. Alternatively, there may simply be variation in the
angle of the marker placed on the trunk.
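In a discretized implementation, the proportion discussed above is just the hip block's share of the component's squared norm; a small sketch (names and numbers are illustrative, not the book's):

```python
import numpy as np

def component_share(phi_part, phi_full):
    """Fraction of a principal component's squared norm held by one block.

    phi_part : one block (say hip) of a discretized eigenfunction;
    phi_full : the full concatenated eigenfunction. On a common uniform
    grid the quadrature weights cancel in the ratio.
    """
    return np.sum(phi_part ** 2) / np.sum(phi_full ** 2)

# toy eigenfunction whose hip block carries 85% of the squared norm
m = 50
hip = np.sqrt(0.85 / m) * np.ones(m)
knee = np.sqrt(0.15 / m) * np.ones(m)
share = component_share(hip, np.concatenate([hip, knee]))
```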
For the second principal component, the contributions of both hip and
knee are important, with somewhat more of the variability (65%) due to
the knee than to the hip. We see that this principal component is mainly a
distortion in the timing of the cycle, again correlated with the way in which
the initial slight bend of the knee takes place. There is some similarity to
the second principal component found for the hip alone, but this time there
is very substantial interaction between the two joints.
A particularly effective method for displaying principal components in
the bivariate case is to construct plots of one variable against the other.
Suppose we are interested in displaying the mth principal component
function. For equally spaced points t in the time interval on which the
observations are taken, we indicate the position of the mean function values (Hip̄(t), Kneē(t)) by a dot in the (x, y) plane, and we join this dot
by an arrow to the point (Hip̄(t) + C ξ^H_m(t), Kneē(t) + C ξ^K_m(t)). We
choose the constant C to give clarity. Of course, the sign of the principal
component functions, and hence the sense of the arrows, is arbitrary, and
plots with all the arrows reversed convey the same information.
This technique is displayed in Figure 8.9. The plot of the mean cycle
alone demonstrates the overall shape of the gait cycle in the hip-knee plane.
The portion of the plot between time points 11 and 19 (roughly the part
where the foot is off the ground) is approximately half an ellipse with axes
inclined to the coordinate axes. The points on the ellipse are roughly at
equal angular coordinates — somewhat closer together near the more
highly curved part of the ellipse. This demonstrates that in this part of
the cycle, the joints are moving roughly in simple harmonic motion but
with different phases. During the other part of the cycle, the hip angle is
changing at an approximately constant rate as the body moves forward with
the leg approximately straight, and the knee bends slightly in the middle.
Now consider the effect of the first principal component of variation.
As we have already seen, this has little effect on the knee angle, and all
the arrows are approximately in the x-direction. The increase in the hip
angle due to this mode of variation is somewhat larger when the angle
itself is larger. This indicates that the effect contains an exaggeration (or
diminution) in the amount by which the hip joint is bent during the cycle,
and is also related to the overall angle between the trunk and the legs.
[Figure 8.9 panels, hip angle vs. knee angle (degrees): Mean curve (numbered along cycle); PC 1 (44.5% of variability); PC 2 (19% of variability); PC 3 (12.3% of variability).]
Figure 8.9. A plot of 20 equally spaced points in the average gait cycle, and the
effects of adding a multiple of each of the first three principal component cycles
in turn.
The second principal component demonstrates an interesting effect.
There is little change during the first half of the cycle. However, during
the second half, individuals with high values of this principal component
would traverse roughly the same cycle but at a roughly constant time ahead.
Thus this component represents a uniform time shift during the part of the
cycle when the foot is off the ground.
A high score on the third component indicates two effects. There is some
time distortion in the first half of the cycle, and then a shrinking of the
overall cycle; an individual with a high score would move slowly through
the first part of the cycle, and then perform simple harmonic motion of
knee and hip joints with somewhat less than average amplitude.
8.5.3 Inner product notation: Concluding remarks
One of the features of the functional data analysis approach to principal components analysis is that, once the inner product has been defined
appropriately, principal components analysis looks formally the same,
whether the data are the conventional vectors of multivariate analysis,
scalar functions as considered in Section 8.2.2, or vector-valued functions
as in Section 8.5.1. Indeed, principal component analyses for other possible
forms of functional data can be constructed similarly; all that is needed
is a suitable inner product, and in most contexts the definition of such an
inner product will be a natural one. For example, if our data are functions
defined over a region S in two-dimensional space, for example temperature
profiles over a geographical region, then the natural inner product will be
given by
   ∫_S f(s) g(s) ds,
and the principal component weight functions will also be functions defined
over s in S.
Much of our subsequent discussion of PCA, and of other functional data
analysis methods, will use univariate functions of a single variable as the
standard example. This choice simplifies the exposition, but in most or all
cases the methods generalize immediately to other forms of functional data,
simply by substituting an appropriate definition of inner product.
8.6 Further readings and notes
An especially fascinating and comprehensive application of functional principal components analysis can be found in Locantore, Marron, Simpson,
Tripoli, Zhang and Cohen (1999). These authors explore abnormalities in
the curvature of the cornea in the human eye, and along the way extend
functional principal components methodology in useful ways. Since the
variation is over the spherical or elliptical shape of the cornea, they use
Zernike orthogonal basis functions. Their color graphical displays and the
importance of the problem make this a showcase paper.
Viviani, Grön and Spitzer (2005) apply PCA to repeated fMRI scans of
areas in the human brain, where each curve is associated with a specific
voxel. They compare the functional and multivariate versions, and find
that the functional approach offers a rather better image of experimental
manipulations underlying the data. They also find that the use of the GCV
criterion is particularly effective in choosing the smoothing parameter prior
to applying functional PCA.
While most of our examples have time as the argument, there are many
important problems in the physical and engineering sciences where spectral
analysis is involved. An example involving elements of both registration and
principal components analysis is reported in Liggett, Cazares and Semmes
(2003). Kneip and Utikal (2001) apply functional principal components
analysis to the problem of describing a set of density curves where the
argument variable is log income.
Besse, Cardot and Ferraty (1997) studied the properties of estimates of
curves where these are assumed to lie within a finite-dimensional subspace,
and where principal components analysis is used in the estimation process,
and Cardot (2004) extended this work.