1 Introduction

Recent advances in medical image analysis allow researchers to track an individual subject’s development with multiple repeated observations [4]. Longitudinal regression analysis, which adequately accounts for intra-subject correlation, is essential to estimate unknown relationships between subject-specific temporal changes and characteristics of individuals via repeated observations [2].

Data derived from medical images, such as diffusion tensors, diffeomorphic deformations, or geometric shapes, are to be analyzed on their natural nonlinear spaces, e.g. Riemannian manifolds. For longitudinal analysis of nonlinear data on a Riemannian manifold, hierarchical geodesic models were suggested as extensions of a linear mixed effects model on Euclidean space [6, 7, 9]. These methods estimate the same model for both subject and population levels which is a direct reformulation of a linear mixed effects model on a Riemannian manifold. The previous methods face challenges from too few observations per subject to hierarchically solve a multivariate model at a subject level even though the number of subjects may be sufficient. Despite the importance of including subject-specific characteristics for longitudinal analysis, a hierarchical model that analyzes the relationship between subject-wise morphological changes over time and multiple covariates on a Riemannian manifold has not been shown before.

We propose a novel hierarchical multi-geodesic model for longitudinal analysis of subject-specific morphological changes of anatomical structures on a Riemannian manifold. Each subject-wise morphological change over time is modeled by a univariate geodesic model which only requires a minimum of two observations per subject. The effects of subject-specific characteristics, represented by subject-specific covariates, on subject-wise trajectories are modeled by a multi-geodesic model. The covariates are fixed for an individual subject but varying across a population. The multi-geodesic model consists of multivariate intercept and slope models which account for the effects of the subject-specific characteristics to subject-wise baselines and developments over time, respectively. It is worth noting that the proposed method is different from the direct extension of a linear mixed effects model because the relationship between subject-specific characteristics and the slopes of subject-wise temporal trajectories cannot be modeled by linearly adding the covariates as additional explanatory variables.

A synthetic example on a \(S^2\) manifold with a comparison to the ground truth showed the feasibility of the proposed method. Experimental validation on 72 corpus callosum shapes represented on a nonlinear shape space from 24 autism spectrum disorder subjects with different autism severity scores demonstrate the capability of the proposed method for longitudinal analysis of the relationship between subject-specific morphological changes and multiple covariates.

2 Method

Background on Riemannian Geometry: A geodesic is a zero-acceleration curve on an n-dimensional Riemannian manifold M. It has the minimizing property that there is no curve shorter than a geodesic between any two points within a small neighborhood. An exponential map \(Exp( p, v ) = q\) is a mapping of \(p \in M\) to \(q \in M\) along a geodesic going out from p in the direction and magnitude of v. Its inverse, \(Log( p, q ) = v\), is defined onto a neighborhood U(p) of p. The Riemannian distance between p and q is the length of a geodesic between the two, \(d(p, q ) = ||Log(p,q)||\). Parallel transport \(\psi _{p \rightarrow q}(u)\) of a tangent vector \(u \in T_pM\) along a differentiable curve \(c(t) :I \rightarrow M\) from p to q is defined by a unique parallel vector field V(t) along c(t) where \(I=[ 0, 1 ]\), \(c(0) = p\), and \(c(1) = q\). V(t) satisfies \(V(t_0) = u\) and \(\frac{DV}{dt} = 0\), where \(\frac{DV}{dt}\) is a covariant derivative of a vector field V. Parallel transport has the following angle and scale preserving properties: the angle of u along c(t) does not change, \(\frac{DV}{dt} = 0\), and the scale of u also does not change, \(||u|| = ||V(t)||\), at any \(t \in I\) [1]. \(\psi _{p \rightarrow q}(u)\) is not unique and depends on a curve c that connects p and q. We will only use \(\psi _{p \rightarrow q}\) along with the unique geodesic between p and q to guarantee the transport to be unique.

Subject-Wise Trajectory Estimation: Let \(y_{ij} \in M\) be the jth observation of the ith subject associated with time \(t_{ij} \in \mathbb {R}\), \(i=1,.., N_s\), \(j=1,..., N^i_{obs}\). \(N_s\) and \(N^i_{obs}\) are the number of subjects and the number of observations of the ith subject, respectively. We estimate a subject-wise trajectory \(Y_i\) by the least squares geodesic regression model [3].

$$\begin{aligned} ( \hat{a}_i, \hat{b}_i ) = \mathop {\mathrm {argmin}}\limits _{ a_i, b_i } \sum ^{N_{obs}^i}_{j=1} d^2( y_{ij}, Exp( a_i, b_i t_{ij} ) ), \qquad Y_i = Exp( \hat{a}_i, \hat{b}_i t), \end{aligned}$$
(1)

where \(a_i \in M\) and \(b_i \in T_{a_i}M\) are an intercept and a slope tangent vector of the geodesic model \(Y_i\) for the ith subject’s trajectory as shown in Fig. 1(a). It is worth noting that Eq. 1 only requires the minimum of two observations per subject to solve for two coefficients, an intercept \(a_i\) and a slope \(b_i\).

Hierarchical Multi-geodesic Model: Let the ith subject be associated with two sets of subject-specific covariates for intercepts \(\pmb {\eta }^i = \{ \eta ^i_{1}, ..., \eta ^i_{N_\eta } \}\) and slopes \(\pmb {\theta }^i = \{ \theta ^i_{1}, ..., \theta ^i_{N_\theta } \}\), where \(N_\eta \) and \(N_\theta \) are the numbers of the covariates.

We aim to model the effects of \(\pmb {\eta }\) and \(\pmb {\theta }\) on subject-wise trajectories. The proposed Hierarchical Multi-Geodesic model (HMG) consists of an intercept model \(f(\pmb {\eta })\) and a slope model \(g( \pmb {\theta })\),

$$\begin{aligned} Y = Exp( Exp( f( \pmb {\eta } ), g( \pmb {\theta } ) t ), \epsilon ). \end{aligned}$$
(2)

Intercept Model: The intercept model \(f(\pmb {\eta })\) is formulated as a multivariate geodesic model with a base intercept \(\beta _0 \in M\) and tangent vectors \(\beta _k \in T_{\beta _0}M\) [5],

$$\begin{aligned} f(\pmb {\eta }) = Exp( \beta _0, \beta _1 \eta _{1} + ... + \beta _{N_\eta } \eta _{N_\eta } ). \end{aligned}$$
(3)

\(\beta _k\), \(k=1,..., N_\eta \), are coefficients that represent the effects of subject-specific covariates \(\eta _k\) to the intercepts of subject-wise trajectories. For example, a hypothesis that a subject diagnosed with autism and a healthy subject might have different baseline corpus callosum shapes at 3-month after birth can be modeled by the intercept model f. The coefficients \(\beta _k\) can be estimated by a least squares formulation of the multivariate geodesic model,

$$\begin{aligned} ( \hat{\beta }_0, \hat{\beta }_1, ..., \hat{\beta }_{N_\eta } ) = \mathop {\mathrm {argmin}}\limits _{\beta _0, ..., \beta _{N_\eta }} \sum ^{N_{s}}_{i=1} d^2( f( \pmb {\eta }^i ), \hat{a}_i ). \end{aligned}$$
(4)

We optimize Eq. 4 by a Euclideanized optimization scheme similar to [5] with an iterative update of an anchor point to the estimated intercept, instead of fixing it at the intrinsic mean of given data. The estimated intercept model is given as \(\hat{f}(\pmb {\eta }) = Exp( \hat{\beta }_0, \hat{\beta }_1 \eta _1 + ... + \hat{\beta }_{N_\eta } \eta _{N_\eta } )\) after optimization.

Fig. 1.
figure 1

Illustration of the proposed method. (a) Subject-wise geodesic trajectory estimation. (b) The example of an intercept model f(c) with a single covariate c. (c) Stop-over parallel transport \(\phi \) from \(\hat{a}_2\) to \(\hat{\beta }_0\).

The least squares formulation Eq. 4 assumes that the distribution of subject-wise intercepts around \(\hat{f}(\pmb {\eta })\) is the generalized normal distribution [3],

$$\begin{aligned} \hat{a}_i = Exp( \hat{f}( \pmb {\eta }^i ), \epsilon _a ), \end{aligned}$$
(5)

where \(\epsilon _a \sim N( 0, \sigma _{a}^2 )\). \(\epsilon _a\) can be directly interpreted as random effects of subject-wise intercepts that indicate longitudinal effects of repeated observations of individual subjects [2]. Figure 1(b) shows the illustration of the concept of the intercept model f with an example model with one continuous covariate c.

Tangent Vector Space of Slope Model: The effects of covariates on the slopes of subject-wise trajectories are modeled as a linear model \(g_0(\pmb {\theta })\) on a tangent vector space from a set of subject-wise slope tangent vectors \(\hat{b}_i\),

$$\begin{aligned} g_0(\pmb {\theta }) = \gamma _0 + \gamma _1 \theta _1 + ... + \gamma _{N_\theta } \theta _{N_\theta }, \end{aligned}$$
(6)

where \(\gamma _k\), \(k=0,...,N_\theta \), are coefficient tangent vectors associated with \(\theta _k\). The model can explain a hypothesis on the relationship between covariates and the slopes. For example, a corpus callosum may develop differently from a baseline for a subject diagnosed with autism versus a healthy subject.

There are two problems that we need to consider to model subject-wise slopes. First, \(\gamma _k\) must be on a single tangent vector space to be linearly combined as in Eq. 6. For consistent modeling and comprehensible interpretation, we set \(\gamma _k\) to be on a tangent vector space of the base intercept of the intercept model, \(T_{\hat{\beta }_0}M\), which makes \(g_0(\pmb {\theta }) \in T_{\hat{\beta }_0}M\). Second, \(\hat{b}_i\) are not directly comparable to each other because they are on different tangent vector spaces \(T_{\hat{a}_i}M\). Therefore, we need to properly transport \(\hat{b}_i\) to \(T_{\hat{\beta }_0}M\) to estimate \(\gamma _k\).

Stop-Over Parallel Transport \(\phi \): Recall that subject-wise intercepts \(\hat{a}_i\) are the combination of the fixed effects intercept model \(\hat{f}(\pmb {\eta }^i)\) and the random effects \(\epsilon _a\) on subject-wise intercepts as explained in Eq. 5. In other words, each \(\hat{a}_i\) is randomly distributed in the normal distribution of the random effects centered at \(\hat{f}(\pmb {\eta }^i)\) [8]. Therefore, we need to transport a tangent vector on \(T_{\hat{a}_i}M\) to \(T_{\hat{f}(\pmb {\eta }^i)}M\) first to account for the random effects and then from \(T_{\hat{f}(\pmb {\eta }^i)}M\) to \(T_{\hat{\beta }_0}M\) to account for the fixed effects as shown in Fig. 1(c) by a stop-over parallel transport \(\phi \),

$$\begin{aligned} \tilde{b}_i = \phi ( \hat{b}_i ) = \psi _{ \hat{f}( \pmb {\eta }^i ) \rightarrow \hat{\beta }_0 } ( \psi _{ \hat{a}_i \rightarrow \hat{f}( \pmb {\eta }^i )} ( \hat{b}_i ) ), \end{aligned}$$
(7)

where \(\tilde{b}_i \in T_{\hat{\beta }_0}M\). \(\psi \) is a parallel transport along with a geodesic between two points with the angle and scale preserving properties that suit our need to consistently transport \(\hat{b}_i\) at each stage of the stop-over transport. Figure A1 in the appendix shows the effect of the stop-over parallel transport with a synthetic experiment on a \(S^2\) manifold. The direct parallel transportation to the intercept point may arbitrarily rotate tangent vectors when the stop-over transport preserves the directions of the tangent vectors consistently.

Slope Model Estimation: With \(\tilde{b}_i\) transported to \(T_{\hat{\beta }_0}M\) by \(\phi \), we estimate slope coefficients \(\gamma _k\), \(k=0,..., N_\theta \), in \(g_0(\pmb {\theta })\). It can be formulated as the least squares formulation of a standard multivariate linear regression problem [2],

$$\begin{aligned} (\hat{\gamma }_0, \hat{\gamma }_1, ..., \hat{\gamma }_{N_\theta } ) = \mathop {\mathrm {argmin}}\limits _{\gamma _0, \gamma _1, ..., \gamma _{N_\theta } } \sum ^{N_s}_{i=1} || \gamma _0 + \gamma _1 \theta _1^i + ... + \gamma _{N_\theta } \theta _{N_\theta }^i - \tilde{b}_i ||^2. \end{aligned}$$
(8)

Equation 8 is optimized by the closed-form solution of a multivariate linear regression problem with the assumption of no correlation in \(\gamma _k\). The estimated slope model \(\hat{g}_0( \pmb {\theta }) = \hat{\gamma }_0 + \hat{\gamma }_1 \theta _1 + ... + \hat{\gamma }_{N_\theta } \theta _{N_\theta }\) is transported to the respective tangent vector space of the intercept model \(\hat{f}( \pmb {\eta } )\), \(\hat{g} ( \pmb {\theta } ) = \psi _{ \hat{\beta }_0 \rightarrow \hat{f}(\pmb {\eta })} ( \hat{g}_0( \pmb {\theta } ) )\).

The least squares formulation in Eq. 8 is related to the random effects of subject-wise slopes, similar to Eq. 5 for the intercept model. The complete estimated multi-geodesic model is then formulated as \(y = Exp( \hat{f}( \pmb {\eta } ), \hat{g}( \pmb {\theta } ) t )\).

Fig. 2.
figure 2

Synthetic example on a \(S^2\) manifold. (a) Generated data points colored by associated values of a covariate c (blue to green) with subject-wise trajectories (white lines). (b) Estimated single geodesic model (SG, magenta). Data points are colored by time (black to white). (c) Estimated hierarchical single geodesic model (HSG, brown). (d) Estimated hierarchical multi-geodesic model (HMG) and the ground truth (yellow) visualized with selected values of c. (Color figure online)

3 Experiments

Synthetic Example on \(S^2\) Manifold: We tested our method with a synthetic example on a \(S^2\) manifold. We used the exponential map, log map, and parallel transport of \(S^2\) manifold as in [7]. 3527 points of 1000 subjects were generated by the following model with time \(t \in (20, 70)\) and a continuous covariate \(c \in (0, 5 )\),

$$\begin{aligned} y = Exp( Exp( f( c ), g( c ) t ) ), \ f( c ) = Exp( \beta _0, \beta _1 c ), \ g( c ) = \psi _{ \beta _0 \rightarrow f( c ) }( \gamma _0 + \gamma _1 c ), \end{aligned}$$

where we assigned \(\beta _0 = [ 1, 0, 0 ]\), \(\beta _1 = [ 0, 0, 0.4 ]\), \(\gamma _0 = [0, 0.01, 0.02 ]\), and \(\gamma _1 = [ 0, 0.002, -0.003 ]\) with random effects on intercepts \(\epsilon _{a} \sim N( 0, 0.05^2)\) and slopes \(\epsilon _{b} \sim N( 0, 0.001^2)\) and the data observation noise \(\epsilon \sim N( 0, 0.001^2 )\).

Figure 2(a) shows the synthetic data colored by c from blue to green. The estimated subject level geodesic models are plotted as translucent white curves. Figure 2(b) shows the estimated geodesic of a geodesic regression model (SG, magenta) with data points colored by t from black to white [3]. Figure 2(c) illustrates the results of a Hierarchical Single Geodesic model (HSG, brown) [5, 7]. The results of the proposed Hierarchical Multi-Geodesic model (HMG) and the ground truth (yellow curves) are displayed in Fig. 2(d) with uniformly selected values of c with a unit interval from 0.5 to 4.5. The generalized \(R^2\) with respect to individual data of SG, HSG, and the proposed HMG were 0.31, 0.29, and 0.96, respectively [3]. The \(R^2\) with respect to subject-wise intercepts \(R_a^2\) and slopes \(R_b^2\) of the proposed HMG were 0.98 and 0.90. \(R_a^2\) and \(R_b^2\) of HSG are zero since the HSG is the average trajectory of the subject-wise trajectories [7]. Validation with respect to subject-wise trajectories is not available for non-longitudinal SG. Standard deviations of the random effects estimated by HMG of subject-wise intercepts and slopes are \(\sigma _a = 0.09\) and \(\sigma _b = 0.002\), respectively.

Longitudinal Corpus Callosum Shape Changes: Previous research demonstrated differences of corpus callosum (CC) size in autism [10], stating that it is thicker in infants later diagnosed with autism by multi-level linear analysis of derived features from shapes, such as mean thickness. Applying the proposed framework, quantitative exploration of longitudinal shape changes as a function of diagnostic scores becomes possible, resulting in population level and subject-specific models. We modeled HMG with sex s and Autistic Diagnostic Observation Schedule (ADOS) severity score AS, which combines symptoms related to a social interaction and a repetitive behavior with scores ranging from 4 to 10 for autism spectrum disorder (ASD) subjects. Larger AS indicate higher severity of autism. Seventy-two CC shapes from 24 ASD subjects (9 females and 15 males) from the ACE-IBIS study were used for the experiment [10]. Each subject was repeatedly scanned three times. Because the development of CC is known to be asymptotic to a logarithm [10], we reparametrized time t by taking the natural log to model the asymptotic shape changes over time.

Table 1. Quantitative comparisons of SG, HSG, and the proposed HMG. \(R^2\) with respect to observations (\(R^2\)), subject-wise intercepts (\(R_a^2\)) and slopes (\(R_b^2\)), the standard deviations of subject-wise intercepts (\(\sigma _a\)) and slopes (\(\sigma _b\)), and the root mean squared boundary landmark distances (RMSE) are listed.
Fig. 3.
figure 3

Estimated longitudinal corpus callosum (CC) shape changes of 15 male subjects (A: Anterior, P: Posterior). (a) Population level longitudinal shape trends of subjects with the lowest and the highest ADOS scores. (b) Shape changes with varying ADOS scores of baseline shapes at 3-month and end point shapes at 24-month. (c) The observed shape sizes with estimated subject-level trends. (d) Population level longitudinal size trends with varying ADOS scores.

$$\begin{aligned}&y = Exp( Exp( f( s, AS ), g( s, AS ) \log (t) ) ), \\&f( s, AS ) = Exp( \beta _0, \beta _1 s + \beta _2 AS ), \ g( s, AS ) = \psi _{ \beta _0 \rightarrow f( s, AS ) }( \gamma _0 + \gamma _1 s + \gamma _2 AS ), \end{aligned}$$

with sex s, 0 for male and 1 for female, age(month) \(t=(6, 25)\), and \(AS=(4,10)\).

CC shapes were represented on a product manifold \(\mathbf {M} = \mathbb {R} \times \mathbb {C}P^{k-2}\) of a scale \(\rho \in \mathbb {R}\), which represents a shape size, and a Kendall shape \(\mathbf {z}\) with k points in 2D Kendall shape space \(\mathbb {C}P^{k-2}\), where translation, rotation, and scale of shapes are removed. The squared distance between \(p = (\rho _p, \mathbf {z}_p)\) and \(q = (\rho _q, \mathbf {z}_q)\) on \(\mathbf {M}\) is a weighted sum of the distances on the element spaces. The distance on \(\mathbb {C}P^{k-2}\) is normalized by the ratio of variances of data distributions of scales \(\rho \) and Kendall shapes \(\mathbf {z}\), \(d^2( p, q ) = \small {||} Log( \rho _{p}, \rho _{q} ) \small {||}^2 + \frac{ \sigma _\rho ^2 }{\sigma _\mathbf {z}^2 } \small {||} Log( \mathbf {z}_{p}, \mathbf {z}_{q} ) \small {||}^2\), where \(\sigma _\rho ^2\) and \(\sigma _\mathbf {z}^2\) are the variances of the input data of scales \(\rho \) and shapes \(\mathbf {z}\), respectively [8]. One hundred landmark points were sampled at corresponding locations from each shape boundary. We used the exponential map, log map, and parallel transport of \(\mathbb {C}P^{k-2}\) as in [3]. The exponential and log maps of \(\mathbb {R}\) are addition and subtraction, respectively. The parallel transport of \(\mathbb {R}\) is an identity function.

Figure 3 displays the estimated longitudinal trends with fixed \(s =0\) of 15 male subjects. Figure 3(a) shows the estimated longitudinal shape trends of the lowest and the highest ADOS scores over time. The difference of the shape changes is more evident in Fig. 3(b) illustrating estimated baseline and end point shapes with varying ADOS scores at 3 and 24-month of age. One can observe the increased expansion of the anterior CC (the genu and rostral body) for subjects with higher ADOS scores which confirms previous clinical finding that subjects diagnosed with autism tend to have larger corpus callosum [10]. The population level longitudinal trends of shape sizes estimated from the subject-wise trends in Fig. 3(c) quantitatively show increasing trends of shape sizes with higher ADOS scores in Fig. 3(d). The overall \(R^2\) values of SG, HSG, and HMG were 0.23, 0.23, and 0.27, respectively. The root mean squared error measured by average surface boundary landmark distances of SG, HSG, and HMG were 0.287, 0.287, and 0.277 (mm), respectively. Table 1 summarizes the quantitative evaluations. Mean and standard deviation of \(R^2\) values of the subject-wise trends were \(0.89 \pm 0.04\).

4 Discussion

The proposed hierarchical multi-geodesic model is a novel method for longitudinal analysis of subject-specific anatomical shape changes on a Riemannian manifold. It enables longitudinal analysis with multiple covariates directly on a nonlinear shape space which has not yet been possible for clinical studies. The application to subject-specific corpus callosum shape changes demonstrated promising results that confirmed clinical finding of the relationship between the anatomical development and diagnostic scores of individual subjects. We will focus on a hypothesis testing framework for the proposed model to further explore relationships between temporal change of anatomical structures and covariates.