A Canonical Trajectory of Executive Function Maturation From Adolescence To Adulthood

Article https://doi.org/10.1038/s41467-023-42540-8

A canonical trajectory of executive function maturation from adolescence to adulthood

Brenden Tervo-Clemmens1,2,3, Finnegan J. Calabro4,5, Ashley C. Parr4, Jennifer Fedor4,6, William Foran4 & Beatriz Luna3,4,5

Received: 27 January 2023
Accepted: 13 October 2023

Theories of human neurobehavioral development suggest executive functions mature from childhood through adolescence, underlying adolescent risk-taking and the emergence of psychopathology. Investigations with relatively small datasets or narrow subsets of measures have identified general executive function development, but the specific maturational timing and independence of potential executive function subcomponents remain unknown. Integrating four independent datasets (N = 10,766; 8–35 years old) with twenty-three measures from seventeen tasks, we provide a precise charting, multi-assessment investigation, and replication of executive function development from adolescence to adulthood. Across assessments and datasets, executive functions follow a canonical non-linear trajectory, with rapid and statistically significant development in late childhood to mid-adolescence (10–15 years old), before stabilizing to adult-levels in late adolescence (18–20 years old). Age effects are well captured by domain-general processes that generate reproducible developmental templates across assessments and datasets. Results provide a canonical trajectory of executive function maturation that demarcates the boundaries of adolescence and can be integrated into future studies.

Adolescence is a unique period of the lifespan, initiated by puberty and characterized by the maturation of cognitive, affective, and social processes that culminate in a transition to independence and adulthood1–3. Among maturational processes, theories from neuroscience and psychology have placed a particular emphasis on the development of goal-directed cognitive abilities (e.g., response inhibition, working memory, task-switching, and planning behaviors) that are hypothesized to index a common process of executive function or cognitive control4–6. In parallel to socioemotional development and environmental influences, a protracted maturation7,8 and/or stabilization of executive function9 into adulthood has been suggested to contribute to lifespan peaks in risk-taking behaviors (e.g., substance use initiation10; though see also refs. 11,12) and increased vulnerability to psychiatric disorders13 during adolescence. Ongoing executive function changes during adolescence have been used in colloquial, legal9,14, and scientific (see ref. 4 for review) contexts to differentiate adolescents from adults and clarify adolescence as a period of continued development.

Adolescent executive function development has been studied in relatively small (N's ~200; refs. 15,16) independent investigations using a broad range of tasks or in relatively large studies (N's ~1000; refs. 17,18) using very narrow assessments of executive function. No large-scale, multi-assessment, multi-dataset reproducibility investigations of adolescent executive function development have been performed. Further, common analytic approaches do not quantitatively define maturational timing and/or plateaus toward adult-levels of performance. The magnitude of executive function changes during adolescence, the precise timing of when adolescents reach adult-levels, and the

1 Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, MN, USA.
2 Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA.
3 Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA.
4 Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA.
5 Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA.
6 Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA.
e-mail: btervocl@umn.edu

Nature Communications | (2023)14:6922 1



potential diversity of processes assessed by varying executive function tasks, thus remain widely debated.

Empirical research suggests that while adolescents can perform complex, goal-directed behaviors that rely on executive functions, their performance is not as accurate or as fast as adults5,15,16,19–24. Age-related increases in correct response rates (accuracy) and decreases in response time (i.e., latency/reaction time) have been observed for a broad range of laboratory-based and neuropsychological executive function tasks (e.g., working memory, response inhibition, switching, planning) during adolescence (see refs. 1,25 for reviews). Theoretical models built from these observations and related observations in animal studies26, as well as broader27 and historical perspectives of psychological development28, have led to hypotheses suggesting that cognitive development continues through adolescence and may reach maturity in the second decade of life (e.g., by 20 years old9,29) or later (e.g., ≥ 25 years old8,27) in humans. A range of methodological, analytic, and data availability challenges, however, have thus far prevented direct and comprehensive testing of the maturational timing of adolescent executive function development and the specific age when executive functions reach adult-levels. Nevertheless, understanding not just whether behaviors are changing with age, but also their shape and form, is fundamental to developmental science15,30–32 and corresponding health policies and intervention/prevention strategies for adolescents. Defining the shape and form of cognitive development likewise has key implications for research on mechanisms of ongoing (potentially critical period) plasticity and factors influencing the opening and closing of the adolescent period3.

There are unique challenges to defining the normative maturational timing of adolescent executive function development that arise from multiple sources, including inter-individual differences among participants and across datasets, difficulties in designing analytic frameworks that directly assess maturational timing33, and potential variability among the many tasks designed to assess executive function5,34. The first of these challenges is beginning to be addressed through larger study designs (e.g., Nathan Kline Institute-Rockland Sample35 [NKI], National Consortium on Alcohol and Neurodevelopment in Adolescence36 [NCANDA], Philadelphia Neurodevelopmental Cohort37 [PNC]) and data aggregation techniques, as increasing dataset sizes and the inclusion of multiple datasets can better overcome sampling variability38,39 to estimate generalizable normative developmental trajectories. Addressing the latter challenges, however, requires conceptual and methodological advancements.

Initial investigations in adolescent research have often relied on a fixed-developmental shape (e.g., linear, inverse linear, quadratic regression models) or categorical comparisons (e.g., adolescents versus children/adults) to identify age-related differences32,33. While essential to establish that age-related differences in executive function generally occur during adolescence, such fixed-developmental, parametric comparisons prevent the systematic investigation of the relative rate and timing of maturation that is essential for precise developmental science. Such approaches likewise have prevented resolution of foundational theories of adolescent neurobehavioral development, where distinct linear and non-linear shapes have been proposed4. Therefore, while prominent theories suggest adolescents may reach adult-levels of executive function between 20 and 25 years old, such plateaus in developmental change have not been investigated in most empirical research and are not testable within commonly used analytic frameworks. This lack of resolution on the maturational timing of adolescent executive function also poses challenges for related lifespan research, where a potentially distinct developmental concept of emerging adulthood (~18–25 years old40) has likewise been justified, in part, by potential ongoing cognitive changes. New methods (e.g., general additive models41) that can quantitatively define the potentially non-linear developmental trajectory of executive function during adolescence, as well as multiple large publicly available datasets, now allow for precise estimation of the maturational timing of executive function.

A further challenge to defining the maturational timing of executive function development arises from the potential variability among the many tasks designed to assess executive function. Empirical studies have often focused on an individual task, or a relatively narrow subset of tasks (see for example discussion in refs. 5,42). Fewer studies have therefore investigated the developmental similarity among potential subprocesses indexed by the dozens of laboratory-based and neuropsychological executive function measures used in the broader literature. While there is a growing use of standardized tasks (e.g., Delis-Kaplan Executive Function System43 [D-KEFS], Cambridge Neuropsychological Test Automated Battery44 [CANTAB], Penn Computerized Neurocognitive Battery45 [Penn CNB]), systematic comparisons across these instruments are similarly limited. Many neurodevelopmental and psychological theories4,7–9 emphasize a broad unitary process of executive function development, suggesting the maturation of performance on any one of these tasks may generalize to broader executive function development. However, alternative perspectives have also been proposed. Prior work in adults (both healthy college students42, as well as patients with frontal lobe damage46), for example, has suggested that executive function tasks support a unity/diversity framework, where commonality and correlation are observed amongst all executive function measures (unity), but individual aspects of executive function maintain a degree of separability (diversity). Owing to the focus on individual functions and tasks or narrow subsets in most adolescent research, it nevertheless remains unclear whether adolescent executive function development is driven by multiple, independent processes (diversity) and/or the maturation of a more common domain-general process (unity).

Here we aggregate four large-scale, independent datasets to construct a comprehensive set of executive function data spanning the entire adolescent period as well as the relative transitional periods of late childhood and early adulthood (total age range: 8–35, total N = 10,766, total visits = 13,817) that includes 23 executive function measures from 17 distinct tasks. In addition to large-scale replication, we directly address prior challenges in defining the maturational timing and domain-generality of adolescent executive function development with multiple large independent cohorts (two longitudinal, two cross-sectional), non-linear modeling approaches that directly define maturational timing, and the inclusion of a broad executive function battery that permits the investigation of both potential unitary and diversity processes. Taken together, this work identifies a canonical non-linear developmental trajectory of executive function maturation that generalizes across datasets and assessments, with rapid age-related change from late childhood to early adolescence (10–15 years old), small but significant changes in mid-adolescence (15–18 years old), before stabilizing to adult-levels in late adolescence (18–20 years old). The similarity in developmental trajectories is well accounted for by domain-general processes consistent with theories of unitary executive function and fluid cognition. The insights and data developed here can inform neuroscientific and psychological theories of the adolescent period and guide future translational research in health and disease.

Results

Executive function development follows a canonical trajectory across datasets and tasks

Participants ranging from 8–35 years old (Supplementary Fig. S1) were drawn from two large longitudinal studies of executive function development, including data collected by our group (Luna, N = 196, total visits = 666) and data collected as part of the National Consortium on Alcohol and Neurodevelopment (NCANDA36, N = 831, total visits = 3412), as well as two large cross-sectional studies, including


data collected as part of the Nathan Kline Institute-Rockland Sample (NKI35, N = 588), and data from the Philadelphia Neurodevelopmental Cohort (PNC37, N = 9151). Studies relied on community-based samples from across the United States (see Methods) that were balanced for biological sex at birth and, in the aggregate, were consistent with national patterns of race and ethnicity (Supplementary Table S1). Family income varied both within and between datasets but, as in previous reports across behavioral sciences47, was generally higher than national averages (Supplementary Table S1). Secondary analyses, however, suggested the sample composition of included datasets well approximated broader population patterns for primary results (see Supplementary Methods, Supplementary Fig. S2). Across the studies, participants performed a variety of executive function tasks (twenty-three measures from seventeen distinct EF tasks; Supplementary Table S2), including those designed to measure processes of response inhibition (e.g., Antisaccade, Stroop), working memory (e.g., Spatial Span), planning (e.g., Stockings of Cambridge), as well as those from standard computerized neurocognitive batteries (Penn Computerized Neurocognitive Battery45) that include tasks designed to measure executive function (e.g., Conditional Exclusion Test, N-Back Test, Continuous Performance Test34) and a neuropsychological executive function battery (Delis-Kaplan Executive Function System [D-KEFS]: Tower, Trails, Design Fluency, Color-Word Interference). Response types included button presses, eye movements, and experimenter-administered neuropsychological performance (e.g., D-KEFS). For most tasks, both latency (speed of responses) and accuracy (e.g., correct response rate) measures were available (see also Methods).

We first examined the developmental trajectory of each executive function measure independently using non-linear regression models with penalized splines (general additive mixed models (GAMM) for longitudinal data; general additive models (GAM) for cross-sectional data: see Methods). Unlike the fixed-developmental shape approaches that are typically used in adolescent research, this allowed us to estimate flexible, data-driven trajectories and explore the shape of development (functional form of age) for each executive function measure. These analyses revealed that nearly all executive function measures (20/23 measures) had corrected significant (corrected p's < 0.004, [two-sided], calculated via default procedures of GAM that perform an equality test of parameters of the smoothed term to zero48; see Supplementary Table S3 for full statistics as well as reproducible variable names from public datasets) age-related differences and followed a similar non-linear developmental trajectory, with rapid development in late childhood to mid-adolescence (10–15 years old), smaller changes through mid-adolescence (15–18 years old), before stabilizing to adult-levels in late adolescence (18–20 years old) (Fig. 1A–D). Critically, age-related differences were observed across nearly all tasks from all four independent datasets, with accuracy measures showing significant age-related increases and latency measures showing parallel significant age-related decreases (Fig. 1A–D). The average total age-related change (max-min of GAM/GAMM fits) was large based on conventional effect size standards (mean across all measures from all datasets in standard deviation [z] units: 1.38; Fig. 1A–D). Overlapping visualization of all measures with significant age-related differences from all datasets further highlights a potential canonical shape of normative adolescent executive function development (Fig. 1E).

Executive function significantly develops through late adolescence

To precisely quantify periods of significant developmental differences and estimate when measures reached adult-levels, we next examined the local slope (first derivative) of age-related differences across all ages in 1/10th-of-a-year intervals for all non-linear GAMM/GAM models. As in prior developmental research in other domains49,50, a simulation approach (10,000 iterations) was used to construct confidence intervals for the first derivative of the fitted models to assess statistically significant age-related differences at each age (p < 0.05 [two-sided] via simultaneous confidence intervals51 to account for multiple tests across ages: see Methods). Age-ranges in which the simultaneous 95% confidence interval of the first derivative of the GAM/GAMM fits did not include zero (p < 0.05, two-sided) were classified as statistically significant. We note that a thresholded 95% confidence interval (an unthresholded version can be viewed in full in Supplementary Fig. S3), instead of, for example, exact p-values, is provided here, as in previous work, to highlight age ranges of statistical significance49,50. Consistent with theoretical models of adolescence, significant (p < 0.05 [two-sided] via simultaneous confidence intervals) age-related changes in executive function accuracy (increases) and latency (decreases) were observed during early to middle adolescent periods (10–15 years old) for nearly all measures (Fig. 2A–D). Effect size benchmarks do not yet exist for short-timescale developmental changes; however, given the short span of age examined via the derivative (units scaled to per-year change) and the total age-related changes (Fig. 1A–E), local effect sizes are judged to be large (e.g., mean z unit change from 10–15 years old: .142 per-year [accuracy]; −.175 per-year [latency], Fig. 2E; see Fig. 2A–D for z unit scaling for all measures). From middle to late adolescent periods (15–18 years old), smaller but still statistically significant (p < 0.05 [two-sided] via simultaneous confidence intervals) changes were observed for several measures (Fig. 2A–D). After late adolescence (>18 years old), very few measures exhibited statistically significant (p < 0.05 [two-sided] via simultaneous confidence intervals) age-related change (Fig. 2A–D).

Aggregate analysis across measures and tasks (three-level pointwise meta-analysis: see Methods) supports the inference from individual measures and datasets (Fig. 1A–E; see also Supplementary Fig. S4), with statistically significant (p < 0.05 [two-sided] via simultaneous confidence intervals) age-related differences detected throughout early to late adolescent periods (10–18 years old) for both accuracy and latency measures (Fig. 2E). While statistically significant (p < 0.05 [two-sided] via simultaneous confidence intervals) age-related differences could also be observed in this highly powered aggregate analysis until 20 years old for accuracy measures (Fig. 2E), the absolute magnitude of these effects was very small after 18 years old (mean z unit change in accuracy per-year between 18–20 years old: .023 [~1/5th the average change observed between 10 and 15 years old]; Fig. 2E). A parallel analysis examining the magnitude of change among those measures with statistically significant overall age effects (corrected p's < 0.004, [two-sided]; see also Fig. 1, Supplementary Table S3) likewise demonstrates that, on average, over 95.0 and 99.7% of the total detectable age-related change between 8–35 years old occurs prior to 18 years old for accuracy and latency, respectively (Supplementary Fig. S5). These results provide robust and reproducible evidence of statistically significant and developmentally specific changes in executive function during early through mid-adolescence that reach maturity between 18 years old and 20 years old and reinforce that adolescence is a period of ongoing development of goal-directed cognition and executive function. A normative maturational stability towards adult-levels of executive function by late adolescence (18 to 20 years old) is highly consistent with what has been theorized in heuristic models of adolescence (~20 years old), but notably earlier than lifespan accounts suggesting executive function changes continue to occur during emerging adulthood (18–25 years old).

Adolescent executive function development is predominantly domain general

Building from the observation that nearly all executive function measures showed the same developmental trajectory and relative maturational timing, we next examined the potential shared information across measures at the per-participant level using between-person (all datasets) and within-person (Luna, NCANDA) correlations and


[Figure 1 image. Panels: A. Luna; B. NCANDA; C. NKI; D. PNC; E. All Significant Measures. Each panel plots Executive Function (z-score) against Age (years, 10–35), with accuracy (acc) above and latency (lat) below. Legend: in A and B, line style distinguishes corrected p < .05 from corrected p > .05; in C and D, all corrected p's < .05.]

factor analysis (see Methods). Composite metrics were not used here, as they are by construction (linear sums of original measures) correlated with multiple measures. Consistent with a domain general, unity process of executive function, per-participant scores across nearly all measures were moderately correlated (see Methods) in all datasets in both between-person (cross-sectional) and within-person (longitudinal) analyses (Fig. 3A; mean linear, bivariate correlation from data aggregation ("all measures") |r| = 0.261; Supplementary Table S4 for correlation matrices). Exploratory factor analysis likewise demonstrated that a single domain general factor (via bifactor rotation) explains over 20% (21.9%) of total executive function variance on average, across datasets (Fig. 3B). There was no systematic evidence that the total executive function variance explained by a single domain general factor varied by age (Supplementary Fig. S6). While certain data-driven thresholds to determine the number of supported latent factors (parallel analysis, optimal coordinate, acceleration factor, and a factor analytic Kaiser rule; see Methods, Supplementary Fig. S7) suggest the inclusion of a second or third factor across datasets (Fig. 3B), these factors account for very small amounts of executive function variance (on average, ~6 and 2% respectively, see Fig. 3B; Supplementary Fig. S7 for individual datasets) after accounting for the domain general factor (via bifactor rotation). Visual inspection of loadings for secondary and tertiary factors demonstrates that these factors tended to capture residual effects from specific, single measures or methods


Fig. 1 | Age trajectories of executive function measures. All measures scaled to per-dataset standard deviation (z) units. A Non-linear fits from the Luna dataset (N = 196; 666 total visits) of general additive mixed model (GAMM; multilevel penalized spline regression) for Antisaccade (ANTI), Fixation Breaks (FIX), Mixed Antisaccade/Visually Guided Saccade (MIX), Spatial Span (SSP), Delayed Matching to Sample (DMS), Memory Guided Saccade (MGS), Stockings of Cambridge (SOC), and equally weighted composite metrics (z score sum of all accuracy, latency measures; COMP) as a function of age for accuracy measures (acc; top) and latency (lat; bottom). All models covaried for a smoothed effect of visit number. Solid line indicates models with Bonferroni corrected significance (corrected p < 0.05 [two-tailed], unadjusted p < 0.003 [two-tailed]). Dashed indicates models that do not surpass this threshold. B Non-linear, GAMM fits from NCANDA dataset (N = 831; 3412 total visits) for Penn Conditional Exclusion (PCET), Penn Continuous Performance (PCTP), Penn N Back (PNBK), and Stroop (STRP) tests and equally weighted accuracy, latency composite metrics (COMP) as a function of age. All models covaried for a smoothed effect of visit number. Solid line indicates models with Bonferroni corrected significance (corrected p < 0.05 [two-tailed], unadjusted p < 0.006 [two-tailed]). Dashed indicates models that do not surpass this threshold. C Non-linear fits from NKI dataset (N = 588) of general additive models (GAM; penalized spline regression) for D-KEFS Color-Word Interference (CWI), Penn Conditional Exclusion (PCET), Penn N Back (PNBK), D-KEFS Tower (TOW), D-KEFS Design Fluency (DFL), DKEF Penn Continuous Performance (PCTP), and D-KEFS Trails (TMT) tests and equally weighted accuracy and latency composite metrics (COMP) as a function of age. All models were corrected, significant (corrected p's < 0.001 [two-tailed]). D Non-linear, GAM fits from PNC dataset (N = 9151) for Penn Conditional Exclusion (PCET), Penn Continuous Performance (PCTP), and Penn N Back (PNBK) and equally weighted accuracy and latency composite metrics (COMP) as a function of age. All models were corrected, significant (corrected p's < 0.001 [two-tailed]). E Fits from all corrected significant models from A–D. See also Supplementary Table S3 for accompanying statistics.
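
The per-dataset z-scaling, the equally weighted composite (a sum of z-scored measures), and the Bonferroni threshold mentioned in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the toy scores are hypothetical, re-standardizing the summed z-scores is an assumption made so the composite is also in z units, and the count of 16 tests is chosen only because 0.05/16 is approximately 0.003.

```python
from statistics import mean, stdev


def z_scores(values):
    """Scale raw scores to standard deviation (z) units within one dataset."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(measures):
    """Equally weighted composite: sum each participant's z-scored measures.

    Assumption: the summed scores are re-standardized so the composite is
    itself expressed in z units.
    """
    scaled = [z_scores(vals) for vals in measures.values()]
    sums = [sum(per_person) for per_person in zip(*scaled)]
    return z_scores(sums)


def bonferroni_threshold(alpha, n_tests):
    """Unadjusted p-value a single test must beat to be significant at alpha."""
    return alpha / n_tests


# Hypothetical toy data: two accuracy measures for four participants.
accuracy = {
    "task_a": [0.60, 0.70, 0.80, 0.90],
    "task_b": [10.0, 12.0, 15.0, 19.0],
}
comp = composite(accuracy)

# Hypothetical number of tests, for illustration only.
threshold = bonferroni_threshold(0.05, 16)
```

Latency measures would be combined the same way in a separate composite, since the caption reports accuracy and latency composites separately.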

(e.g., eye-tracking) or similar, broad domain general patterns (see Supplementary Fig. S7). Additional factors beyond these (4 or more factors) were not suggested for any dataset, under any data-driven threshold (Fig. 3B; Supplementary Fig. S7). Combined, these results provide evidence across studies for a single domain general factor, or unity framework, of executive function that accounts for variance across tasks (see Fig. 3C), although further work with expanded measures can help clarify potential diversity and domain-specific executive function performance (see Discussion).

Beyond the general dimensionality of participant-level, individual differences, a primary goal of the current work was to determine the timing and complexity of age-related differences in executive function from adolescence to adulthood. Therefore, we next tested the extent to which age-related, developmental differences in any one specific executive function measure could be explained by the general executive function processes supported in our previous analyses. Through nested model comparisons (see Methods), we determined the percentage of age-related differences on each specific executive function measure explained by a single domain general composite metric of the accuracy and latency metrics from the remaining tasks in the dataset ("leave one task out" composite metric; see Supplementary Fig. S8 for visualization of this procedure) versus a measure and/or task-specific process. As the broadest test of such a domain-general executive function process and consistent with prior suggestions from related literature in aging52,53, in datasets (Luna, NKI) where multiple measures had the same putative, primary executive function subdomain (see first listed Domain in Supplementary Table S2), these measures were likewise left out of the composite metric ("leave out all measures from the same domain"). To further maximize comparability across studies and to prevent bias from shared, non-executive function visit effects (e.g., practice effects; see Sensitivity Analyses and Supplementary Fig. S9), analyses here were performed with the larger cross-sectional data, but were consistent with longitudinal data (cf., within-person factor structure in Fig. 3B, D).

Results demonstrated that a general component of executive function (as a single composite metric) often explained more than half of age-related information (via deviance testing in model comparison; see Supplementary Fig. S8) in individual executive function measures, with age effects for several measures nearly fully explained by a general executive function process (Fig. 4A–D). Aggregate analysis (three-level meta-analysis) revealed that on average, close to three-fourths (i.e., 75%) of age-related information in any one executive function measure could be explained by a domain-general executive function process (via a single composite metric of [equally weighted] out-of-domain measures; percentage of explained age-related deviance by common executive function for accuracy measures: 79.3%, latency measures: 70.6%; Fig. 4E). There was, however, notable variability in the proportion of variance explained by common executive function across datasets. One possible explanation for these differences is that the datasets (NCANDA, PNC) with fewer executive function measures have less precision to estimate a domain general executive function process. Consistent with this, the percentage of age-related information explained by a common executive function process decreased and became more variable across measures in Luna and NKI datasets in simulations that used iteratively smaller numbers of variables to estimate an executive function composite (see Supplementary Fig. S10). Combined, these results provide the strongest evidence for a core domain general or unitary process related to observed age-related differences in executive function that is reproducible across measures and datasets. Together with our previous analyses, these results support adolescence as a potentially specific period of the lifespan of ongoing executive function development, where a core unitary maturational process may give rise to improvements across related but distinct assessments.

Scaled domain general executive function scores generate reproducible normative maturational templates across datasets and tasks

Having established that executive function measures follow a canonical developmental trajectory during adolescence and age-related changes are well captured by domain-general processes, our final analyses sought to build upon these results to create normative maturational templates applicable across datasets and tasks. That is, if a substantial portion of executive function development follows the same trajectory (Figs. 1, 2) and is driven by a common, domain-general process (Figs. 3, 4), we tested whether a simplified normative template of change would be representative across new datasets and tasks and could be used to quantitatively guide future research.

A standard growth chart54 constructs a normative template of developmental change and inter-individual variability (e.g., percentile) for a single assessment with a single scale of measurement (e.g., height in inches). Executive function, however, is assessed with dozens of different measurements5 and, owing to the potential range of participant ages included in any one developmental dataset, the total extent of observed individual variability may substantially differ across datasets, even if developmental change proceeds according to the shape of the canonical executive function trajectory. In the current datasets, scaling to adult performance (standard deviation units based on performance of 20–30-year-olds in each dataset; see Methods) to approximate a common scale provides a further robust demonstration of the shape of the canonical executive function trajectory for domain general accuracy (Fig. 5A) and latency (Fig. 5B) across datasets and tasks (given each dataset includes different measures; see Supplementary Table S2). Differences in the precise scaling (absolute y values at each age) persist, however, as is to be expected for datasets taken from different age ranges with different tasks. Furthermore, such universal scaling to adult performance, while potentially useful for creating a common metric across measures and tasks, would not be possible for


[Figure 2: raster plots of accuracy growth rate (left) and latency growth rate (right), both in sd units, against age (years; 10–35), with one row per measure in panels A. Luna, B. NCANDA, C. NKI, D. PNC, and E. All Measures.]

Fig. 2 | Developmental periods with significant age-related change in executive function. Age-ranges in which the simultaneous (to account for multiple testing) 95% confidence interval (generated via posterior simulation50 with 10,000 iterations) of the first derivative of the GAM/GAMM fits did not include zero (p < 0.05, two-sided) were classified as statistically significant. Using this method, raster plots display color (red: age-related increases; blue: age-related decreases) when the derivative is statistically significant (p < 0.05, two-sided) and white when the derivative is not statistically significant (p > 0.05, two-sided). Vertical black lines in each bar denote the minimum and maximum age of the included dataset. Gray bars indicate no data within the specified age range for that analysis dataset. Accuracy measures for all datasets are shown on the left and latency on the right. A Luna dataset (N = 196; 666 total visits) measures from equally weighted composite metrics (z score sum of all accuracy, latency measures; COMP), Antisaccade (ANTI), Delayed Matching to Sample (DMS), Fixation Breaks (FIX), Memory Guided Saccade (MGS), Mixed Antisaccade/Visually Guided Saccades (MIX), Stockings of Cambridge (SOC), and Spatial Span (SSP). B NCANDA dataset (N = 831; 3412 total visits) measures from equally weighted accuracy, latency composite metrics (COMP), Penn Conditional Exclusion (PCET), Penn Continuous Performance (PCTP), Penn N Back (PNBK), and Stroop (STRP). C NKI dataset (N = 588) measures from equally weighted accuracy and latency composite metrics (COMP), D-KEFS Color-Word Interference (CWI), D-KEFS Design Fluency (DFL), Penn Conditional Exclusion (PCET), Penn Continuous Performance (PCTP), Penn N Back (PNBK), D-KEFS Trails (TMT), and D-KEFS Tower (TOW). D PNC dataset (N = 9151) measures from equally weighted accuracy and latency composite metrics (COMP), Penn Conditional Exclusion (PCET), Penn Continuous Performance (PCTP), and Penn N Back (PNBK). E An aggregate analysis (pointwise three-level meta-analysis), incorporating all measures from all datasets was performed using the metafor package51 with effects nested in measure and study (see also Methods) and thresholded in the same manner as the other bars (red or blue denoting statistically significant [p < 0.05, two-sided; via simultaneous confidence intervals] age-related increases or decreases, respectively).
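The thresholding rule in this caption can be sketched in a few lines. The Python sketch below is illustrative only and is not the authors' code: it assumes the derivative curves have already been simulated (here they are generated as a hypothetical decelerating curve plus Gaussian noise rather than drawn from a fitted GAM posterior), and it builds the simultaneous interval from the maximum absolute standardized deviation across the age grid, which is one common construction. All names (`simultaneous_sign`, `draws`, `true_derivative`) and all numbers are assumptions.

```python
import math
import random
import statistics


def simultaneous_sign(draws, level=0.95):
    """Classify each grid point as increasing (1), decreasing (-1), or not significant (0).

    draws: list of simulated first-derivative curves, each a list over the same age grid.
    """
    n_grid = len(draws[0])
    means = [statistics.fmean(d[j] for d in draws) for j in range(n_grid)]
    sds = [statistics.stdev([d[j] for d in draws]) for j in range(n_grid)]
    # For every draw, the largest standardized deviation anywhere on the grid.
    max_dev = sorted(
        max(abs(d[j] - means[j]) / sds[j] for j in range(n_grid)) for d in draws
    )
    # Critical value: the `level` quantile of those maxima (wider than a pointwise 1.96).
    crit = max_dev[min(len(max_dev) - 1, int(level * len(max_dev)))]
    signs = []
    for m, s in zip(means, sds):
        lower, upper = m - crit * s, m + crit * s
        signs.append(1 if lower > 0 else -1 if upper < 0 else 0)
    return signs, crit


# Hypothetical example: a growth rate that is large at age 8 and decays toward zero.
random.seed(0)
ages = list(range(8, 36))
true_derivative = [0.3 * math.exp(-(a - 8) / 4) for a in ages]
draws = [[v + random.gauss(0, 0.02) for v in true_derivative] for _ in range(2000)]

signs, crit = simultaneous_sign(draws)
significant_ages = [a for a, s in zip(ages, signs) if s != 0]
print(f"critical value: {crit:.2f}; significant growth at ages {significant_ages}")
```

Because the critical value is taken over the whole grid, it exceeds the pointwise value of 1.96, so fewer ages are flagged than a pointwise interval would flag; this is the sense in which the interval accounts for multiple testing.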

future studies that only assessed a narrower age range (e.g., 10–18 years old).
We sought to establish a procedure for constructing normative maturational templates applicable to all relevant ages (8–35 years old) that utilizes a linear scaling of the canonical executive function trajectory to a specific measure (via basis function regression; see Methods). Unlike the GAM/GAMMs used to initially derive the canonical executive function trajectory or alternative, multiparameter non-linear models of age, we tested a procedure that only requires a simple linear transformation of the age variable in each dataset (via linear interpolation to the canonical trajectory [estimated out-of-dataset]; see Methods) and is then fit as a single parameter in a general linear model/general linear mixed effects model. This data-driven basis function process (see analogous ideas in functional brain imaging55) is therefore the same as what occurs with standard parametric functional forms of age (e.g., linear, inverse linear age [1/age], quadratic polynomial age [age + age²]), but would have the added benefit of its shape/functional form being directly informed by prior developmental data. We tested this procedure to directly assess whether the insights generated in the current work regarding a canonical executive function trajectory could quantitatively guide future research allowing for simplified modeling approaches that are developmentally informed


[Figure 3: A. Executive Function Measure Correlations (measure correlation r for acc w/ acc, lat w/ acc, and lat w/ lat, by dataset). B. Factor Analysis: Variance Explained, N Factors Supported (total EF variance explained and % inclusion across thresholds, by factor #). C. Domain General (Factor 1) Loadings (per measure, by dataset).]

and computationally efficient. To mirror the use of this approach in future developmental research with new datasets and new measures, we tested the generalizability of this procedure through cross-validation (“leave one dataset out”) and compared performance to standard functional forms of age used in developmental research (linear age, inverse linear age [1/age], quadratic polynomial age [age + age²]) that may otherwise be used to understand age-related executive function change and deviations from normative development.
A canonical executive function trajectory, estimated out-of-sample (“leave one dataset out”) and used as a single parameter basis function (e.g., shape of age model for Luna dataset determined by NCANDA, NKI, and PNC datasets; see Supplementary Fig. S11 for visualization of workflow), generally outperformed standard functional forms of age (linear age, inverse linear age [1/age], quadratic polynomial age [age + age²]) during model comparison testing that aggregated multiple metrics of model fit and complexity (Fig. 5C, D). Following model selection criteria based on all metrics across accuracy and latency measures, the simplified, single parameter basis function was the most selected model (55.6% of the time; compare to quadratic [age + age²]: 37.3%; inverse linear age [1/age]: 7.03%; linear age: 0%), which was significantly higher than all other age models (vs. inverse linear age [1/age]: χ² = 21.6, p < 0.001; vs. linear age χ² = 30.6, p < 0.001; all p values two-sided; chi-square test with Yates’ correction for continuity) other than the quadratic model (vs. age + age² (best model 37.3%), χ² = 2.29, p [two-sided] = 0.130). Results were further unchanged when specifically looking at generalizability between Luna and NKI datasets that do not share any measures (data-driven age basis model was best age model overall 69.2%). Consistent with the strength of the basis function being derived from its developmentally precise shape, offsetting the basis function with respect to age led to lower and more variable model performance (Supplementary Fig. S12). Combined, these results establish a simplified, single parameter data-driven basis function version of the canonical executive function trajectory as an alternative, developmentally informed functional form of age that is superior or highly competitive with standard, parametric functional forms of age when applied to new datasets and new measures. Therefore, we suggest that, along with full, multi-parameter complex spline models (GAM/GAMMs used throughout the rest of the manuscript) and standard functional forms of age (e.g., linear, inverse linear, quadratic), such a simplified, developmentally informed basis function may quantitatively (see Data and Code Availability) inform future research on normative development and deviations from normative development in health and disease (see Discussion).
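The single parameter basis function approach described here can be illustrated as follows. This is a minimal Python sketch, not the released implementation (see Data and Code Availability for the authors' materials): the `canonical_ages` and `canonical_values` lists are made-up placeholders standing in for a trajectory estimated from the held-out datasets, the new measure is simulated without noise, and ordinary least squares stands in for the general linear (mixed effects) model.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Linear interpolation of x on the knots (xs, ys); clamps outside the knot range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Placeholder canonical trajectory (adult z-score units); NOT the published values.
canonical_ages = [8, 10, 12, 15, 18, 20, 25, 30, 35]
canonical_values = [-3.0, -2.2, -1.4, -0.6, -0.1, 0.0, 0.0, 0.0, 0.0]

# Hypothetical new dataset: a measure on its own scale (e.g., percent correct).
ages = [9, 11, 13, 14, 16, 17, 19, 22, 27, 33]
age_basis = [interpolate(a, canonical_ages, canonical_values) for a in ages]
scores = [85.0 + 6.0 * b for b in age_basis]  # noise-free for illustration

intercept, slope = fit_line(age_basis, scores)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

In this sketch the slope is the single age parameter: it rescales the canonical shape to the new measure, while the intercept gives the level at which the placeholder trajectory plateaus (where the basis equals zero).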
Fig. 3 | Correlation and factor structure of executive function measures. A Linear, bivariate correlation (r) for Luna (N = 196; 666 total visits), NCANDA (N = 831; 3412 total visits), NKI (N = 588), and PNC (N = 9151) datasets among accuracy measures (acc w/ acc), accuracy with latency measures (acc w/ latency), and among latency measures (lat w/ lat). For longitudinal datasets (Luna, NCANDA), baseline correlations were calculated from first visit, longitudinal correlations were calculated from disaggregation analysis (see Methods). “All measures” indicates estimate (black dot; measure of center) and 95% confidence interval (± 2 standard errors) from three-level meta-analysis (correlation pairs nested in task pairs and datasets). B Top panel displays total executive function (EF) variance explained as a function of extracted factor using a bifactor rotation for each dataset (maximum number of factors extracted per dataset based on total measures per dataset [Luna, 12 measures/factors, NCANDA 7 measures/factors, NKI 10 measures/factors, PNC 6 measures/factors]). Black line indicates mean across datasets. Bottom panel displays factor inclusion across thresholds (parallel analysis, optimal coordinate, acceleration factor, and a factor analytic Kaiser rule; see Methods) and datasets (e.g., 100% indicates factor included across all thresholds and datasets; see Supplementary Fig. S7 for individual datasets). C Loadings for domain general factor (factor 1 via bifactor rotation) for each dataset and by baseline and longitudinal (via disaggregation; see Methods) for Luna and NCANDA.

Sensitivity analyses
Sensitivity analyses demonstrated that primary results concerning the magnitude and timing of executive function accuracy and latency development were consistent across males and females (Supplementary S13). Additional sensitivity analyses demonstrated that our primary results did not change when covarying for socioeconomic indicators (parental education and family income, Supplementary S14, S15) and assessments of culturally acquired knowledge (verbal reasoning and vocabulary, Supplementary S16), and remained consistent across mental health inclusion/exclusion thresholds (Supplementary S17). This suggests that mathematically holding these factors constant did not change the current results that focused on aggregate and average executive function changes during adolescence. Thus, our results do not speak to, for example, past findings suggesting economic disparities impact cognitive measures, and variability between individuals (cf.,56,57). However, the tools and insights from the current work can be used for future studies focused on relationships between these factors and executive function in more detail (see Discussion). As in previous longitudinal investigations of computerized and neuropsychological performance58, age-independent visit effects (e.g. practice effects) on cognitive testing were observed for many


[Figure 4: bar charts of the % of age-related EF deviance attributed to Domain General EF versus Specific EF for each accuracy (acc) and latency (lat) measure in panels A. Luna, B. NCANDA, C. NKI, D. PNC, and E. All Measures.]

Fig. 4 | Contributions from domain-general versus specific processes to age-related differences in executive function. For each measure in each dataset (A Luna; B NCANDA; C NKI; D PNC), model (GAM) comparisons were used to identify the percentage of age-related deviance attributed to a measure and domain-specific process (hashed bars) from that of a domain-general process (filled bars). Specific effects represent the incremental deviance attributed to that measure over a single, equally weighted composite metric of measures from the held-out domain (e.g., for inhibition and inhibition/switching measures in the Luna sample [Antisaccade, Fixation, Mixed Antisaccade] the composite metric was composed of working memory and planning measures; see Supplementary Fig. S8 for workflow visualization, Methods). Through model comparison, nonspecific effects are attributed to domain general effects. E All Measures effects are estimated via three-level meta-analysis nesting effects in measures and datasets.

executive function tasks in longitudinal samples (Supplementary S9). However, all longitudinal analyses (Luna, NCANDA samples) covaried for a non-linear effect of visit number (see Methods) and we demonstrate replication to two cross-sectional datasets (NKI, PNC) where visit effects could not have occurred, indicating that our primary results are likewise robust to practice effects on cognitive testing.

Discussion
Defining the adolescent period through a reproducible, canonical trajectory of executive function and significant periods of development
The development of executive function has been studied in relatively small (N’s ~200; refs. 15,16) independent investigations using a broad range of tasks or in relatively large studies (N’s ~1000; refs. 17,18; although still smaller than the total sample used here: N = 10,766, total visits = 13,817) with few, very narrow assessments of executive function in intelligence testing. Collectively, prior work demonstrates significant improvements from childhood through adolescence5,15,16,19–24, but the precise magnitude, maturational timing, and significant periods of development in executive function during the transition from adolescence to adulthood have not been defined. With four, large independent datasets, and non-linear modeling techniques to identify specific periods of significant development, we provide reproducible and direct evidence that executive functions continue to develop into late adolescence, which has been widely suggested by theory4,7–9 but has rarely been directly tested in empirical research. Building from prominent neurodevelopmental4,7–9 and psychological27,28 theories, these results highlight adolescence as an essential period of transition during which individuals reach maturity in goal-directed cognition. This suggests that while adolescents clearly possess complex cognitive abilities, including the ability to inhibit prepotent responses, maintain and update information in memory, and abstractly plan for future events, such abilities do not reach their full potential until 18–20 years old (late adolescence). Adolescent periods prior to this age-range (i.e., early to mid-adolescence ~10–15 years old, and mid to late adolescence ~15–18 years old) are therefore likely critical final stages of this type of cognitive development, where deviations from normative development may lead to poorer outcomes in adulthood. Identifying these sensitive, or even critical3, periods of cognitive development is essential for advancing neurocognitive growth-charting to determine normative development and deviations from this normative development in health and disease19,45, in designing developmentally informed interventions/preventions for youth59–62, and policy concerning adolescents9,14.
Given the reproducible and converging evidence for adolescence as a distinct period of the lifespan, and one now better conceptualized as a period of normative closure in goal-directed cognitive development prior to the establishment of adult-level trajectories, the current results support a broader understanding of the neurobehavioral basis for the adolescent period. Together with essential additional historical and sociocultural frameworks27, such charting of neurobehavioral processes throughout adolescence emphasizes the importance of developmentally relevant considerations for adolescents across research and clinical care. Thus, our identification of the maturational


[Figure 5: Age Effects Across Datasets. A. Executive Function Accuracy (adult z-score) and B. Executive Function Latency (adult z-score) against age (years; 10–35) for the Luna, NCANDA, NKI, and PNC datasets and All. Out-of-sample Performance of Data-Driven Age Basis Function: C. (accuracy) and D. (latency) Age Model Performance Score (Percentile Among Models) for the data-driven age basis (out-of-sample; 1 age parameter), age + age² (2), 1/age (1), age (1), and no age (0) models.]

Fig. 5 | Scaled domain general executive function scores generate reproducible adolescent growth charts across datasets and tasks. Accuracy (A) and latency (B) composite (z score sum of all accuracy, latency measures; see Supplementary Table S2) executive function scores for Luna (N = 196; 666 total visits), NCANDA (N = 831; 3412 total visits), NKI (N = 588), and PNC (N = 9151) datasets. Each measure within each dataset is z scored to the performance of adults (participants 20–30 years old). Fit lines are from GAM/GAMM models. Error bars represent two times the standard error added above and below these fits (measure of center). C, D Out-of-dataset performance as boxplots (center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers) of the single parameter data-driven age basis function for accuracy (C) and latency (D) measures relative to typical age models (quadratic [age + age²], inverse age [1/age], linear age [age]) and an intercept only (no age) model. One dot per measure from all datasets (N = 22 accuracy; N = 21 latency). Cross-validation (“leave one dataset out”) was used to validate the age basis function derived from the canonical executive trajectory. See Supplementary Fig. S11 for diagram of procedure. Potential age models were evaluated with multiple metrics of model fit and complexity (see Methods and Supplementary Fig. S11). Using the performance package (rank function) in R55, model fit metrics were scaled 0 (worst model on that fit metric) to 1 (best model on that fit metric, accounting for the directionality of improved fit for each metric [e.g., R² larger values, RMSE lower values]) across candidate age models and the mean value across all model fit metrics was taken for each candidate age model to create an overall performance score (y-axis; C, D). Pie charts indicate the percent of times that each age model was the top ranked according to this procedure; color in pie chart corresponds to age models color from boxplots. Number of age parameters (# age params.) specifies the number of age variables used in each candidate age model (see also Supplementary Fig. S11).
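Two computations described in this caption lend themselves to a short illustration: scaling each measure to adult performance, and averaging rescaled fit metrics into a single performance score. The Python sketch below is an approximation under stated assumptions and does not reproduce the R performance package: it uses the sample standard deviation of 20–30-year-olds, plain min-max rescaling of each metric, and invented data and metric values for three hypothetical candidate models.

```python
import statistics


def adult_z(scores, ages, lo=20, hi=30):
    """Z score every observation against participants aged lo..hi (inclusive)."""
    adults = [s for s, a in zip(scores, ages) if lo <= a <= hi]
    mean, sd = statistics.fmean(adults), statistics.stdev(adults)
    return [(s - mean) / sd for s in scores]


def performance_scores(metrics, higher_is_better):
    """Rescale each metric to 0 (worst model) .. 1 (best model), then average per model."""
    models = list(metrics)
    totals = {m: 0.0 for m in models}
    for name, better_high in higher_is_better.items():
        values = [metrics[m][name] for m in models]
        low, high = min(values), max(values)
        for m in models:
            scaled = (metrics[m][name] - low) / (high - low)
            totals[m] += scaled if better_high else 1.0 - scaled
    return {m: totals[m] / len(higher_is_better) for m in models}


# Invented data: three adults (ages 22, 25, 28) define the reference distribution.
z = adult_z(scores=[10, 12, 14, 16, 18], ages=[9, 14, 22, 25, 28])

# Invented fit metrics for three hypothetical candidate age models.
metrics = {
    "age_basis": {"R2": 0.30, "RMSE": 0.80},
    "quadratic": {"R2": 0.28, "RMSE": 0.82},
    "linear": {"R2": 0.20, "RMSE": 0.90},
}
scores_by_model = performance_scores(metrics, {"R2": True, "RMSE": False})
best_model = max(scores_by_model, key=scores_by_model.get)
print(z, scores_by_model, best_model)
```

The rescaling assumes the candidate models differ on every metric; a metric on which all models tie would need to be skipped or handled separately to avoid dividing by zero.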

timing of executive function, in combination with similar investigations of affective and social processes2,63 may guide further discussion on how to define the adolescent period and demarcate its boundaries27, essential for basic and translational developmental research. To assist in this pursuit, we have made available summary data (note participant-level data is also available with necessary data use agreements; see Data availability) for the canonical executive function trajectory, with the goal that subsequent work may utilize and


continue to refine empirically defined normative maturational templates in executive function research. While such refinement should include ongoing model comparison of other candidate functional forms of age (cf., Fig. 5), we suggest sharing of reproducible and well-powered adolescent trajectories of executive function can be directly integrated in future analysis (e.g., basis function regression) in methods that mirror the development, refinement, and use of summary statistics in other fields (e.g., polygenic risk scores64). As in these and related fields65, large-scale reproducible normative templates of change can be leveraged to better understand risk factors or consequences of mental and physical health conditions related to executive function during adolescence and across a range of experimental conditions.
Three of the four datasets used here (NCANDA, NKI, PNC), as community samples, did not exclude participants on the basis of mental health presentations. However, our sensitivity analyses demonstrated that our approach (used in an effort to maximize generalizability; cf.,66,67) did not bias our results that focused on aggregate and average executive function changes during adolescence. The tools and insights developed here can support future studies of executive function differences in psychopathology both in new datasets, as well as targeted investigations within the current datasets. Normative templates of age-related differences in executive function derived here may also be useful for future research to disambiguate developmental effects and non-developmental visit effects (e.g., practice effects) that, consistent with prior reports58, we observed in longitudinal executive function data. Future work may also use these insights towards optimizing developmental study designs with respect to the number of participants, construct breadth of assessments, and the number of longitudinal time points.
The results of the current work provide support for prior theoretical and quantitative work suggesting non-linear developmental trajectories of cognition during adolescence15,20,32. Updating theoretical models requires broad conceptual consideration; nevertheless, the clear presence of non-linearity in age-related executive function differences from late childhood through adulthood can directly help refine neurodevelopmental models of adolescence. Our results, for example, provide less support for linear increases of executive function development throughout adolescence8 as well as maturational timing of this process after twenty years old27. Instead, our results clearly support a reproducible, canonical non-linear trajectory of executive function development from adolescence to adulthood. The shape and timing of this canonical trajectory are consistent with prior theories of adolescence9 and empirical work with fewer executive function assessments and/or smaller samples that suggest non-linear cognitive development processes17,32. The robust, large-scale multi-dataset replication here provides key advances towards formalizing such a non-linear trajectory, and through the employed data-driven modeling approaches, explicitly defines significant periods of executive function development that identify the potential closing of the adolescent period for this process between 18–20 years old. Such distinctions on the relative bounds of the adolescent period are not only essential for psychological and neuroscientific theories, but also for clinical care and policy. Our work also sets key areas for future work regarding maturational timing in more fundamental studies of executive function development (e.g., disambiguating age-related change from pubertal development, generalizability to populations outside of the United States, targets for brain imaging, and considerations for affective versus nonaffective executive function tasks: see Considerations for Future Work).

Domain general executive function development
While prior work in adults52,68 and younger children69–71 has provided evidence for a potential unity/diversity framework of executive function, the relative domain-generality versus specificity of executive function has largely not been examined with respect to changes during adolescence. The strongest evidence across the large-scale data aggregated here suggests that age-related differences and longitudinal changes across executive function tasks are driven predominantly by a domain-general process. This indicates that across executive processes (e.g., inhibitory control, attention, working memory, planning) there is a common system of goal-directed cognition that may lead to developmental improvements across multiple contexts. Such domain-general executive function development may help explain, for example, widespread differences across executive function tasks in clinical72,73 and/or population research (e.g., social determinants of health71,74,75), as well as the tendency for many executive function tasks to engage common neural circuitry76,77. Domain-general executive function development during adolescence also provides support for general heuristic perspectives of adolescence that emphasize a core set of cognitive development4,7–9. The current work that focused on multi-assessment and multi-dataset reproducibility of trajectories of adolescent executive function across large-scale cross-sectional and longitudinal data further sets priorities for additional within-person modeling (e.g., multivariate sparse functional principal components analysis78, multivariate growth curve modeling79) in future targeted investigations.
Although we found a considerable degree of commonality in adolescent executive function development, as in related work from adults42, current measures and methods do not rule out additional executive function variance relevant to development (even if such domain/measure-specific variance is less prominent than domain-general processes). Our analyses were generally well accounted for by a domain general perspective of executive function, and further exploring this allowed us to examine multi-assessment multi-dataset estimates towards reproducibility and generalizability. However, as in other reports42,69–71, executive function variance was not entirely captured by a single factor. Future work, including using the tools and insights developed here, may address these questions in multi-dataset reproducibility and generalizability investigations. With respect to potential distinction among other cognitive processes, our sensitivity analyses did however demonstrate that the canonical executive function trajectory was robust to individual differences in measures of culturally acquired knowledge (see Supplementary S16). The results here nevertheless raise further questions regarding the conceptual distinction of executive function performance and development from that of related domain-general concepts like fluid cognition that are theorized to account for the coherence of performance-based cognitive abilities (and the distinction from culturally acquired knowledge) in the context of general ability testing (see ref. 80 for additional discussion). Future empirical and theoretical work, to add to existing frameworks, will be required to rectify these related but often historically distinct accounts. From either account, we suggest that commonality across measures, while essential for basic and translational research and practical demarcations of adolescent development, be expanded to consider broader sociocultural and historical perspectives as well. The increasing availability of future large-scale population-level cohorts (e.g., Adolescent Brain Cognitive Development [ABCD] Study81), together with the methods used and developed in the current work, can facilitate future empirical investigations into these areas.

Common driver of executive function development
Conceptually, the potential cognitive and psychological mechanisms of such domain general executive function development remain somewhat of an open question. As inhibitory control tasks (antisaccade, color-word interference, trail-making-test) often had both the highest loadings on domain-general factors observed here (which is consistent with similar prior work in adults68) and amongst the largest developmental effects, it is possible a global inhibitory control process provides the most parsimonious explanation for domain-general/


unitary executive function. If, as has been suggested, executive function tasks often fail to solely isolate a specific cognitive process (the so-called task “impurity problem”5,68,82), global inhibitory control processes may give rise to broad executive function changes through adolescence across diverse tasks, each of which requires some level of global, goal-directed inhibition. Nevertheless, we suggest that future work determining the common driver of executive function changes will benefit most from novel dense longitudinal study designs (e.g., repeated ambulatory smartphone/web-based assessment of cognition83) and/or further multi-method investigations (e.g., fMRI76) that provide a means to understand temporal processes and/or correlated neurobiology, respectively. This would help protect against the possible circularity of descriptions of a common driver of executive function that are limited to functions assessed contemporaneously and/or with the same methodology. As demonstrated in the current work, however, even without a clear narrative description of the origins of domain-general executive function, the maturation of domain-general executive function provides a means to qualitatively understand the adolescent period and quantitatively guide future work. In pursuit of these goals, the current results emphasize the utility of research designs that include not just large sample sizes and/or longitudinal data, but also multiple measures within a broader construct (executive function). Our results suggest the utility of shared information and/or the potential utility of convergent validity from multiple executive function indicators in outcome research when such construct depth is available. Even when more domain-specific effects are of interest, our results suggest that the estimation of domain-general executive function via a broad battery is optimal, as developmental differences on nearly all measures had sizeable influences from a more general process.

in the context of affective stimuli. That is, the included measures focused on what have been considered affectively neutral cognitive measures. This allowed us to specifically isolate fundamental properties of executive function development as typically understood, but future work with more diverse cognitive batteries should examine whether affective manipulations likewise follow the canonical executive function trajectory established here. Another potential limitation is that the current work did not try to disambiguate age-related changes from pubertal development, given challenges in independently estimating these effects in the presence of large cross-sectional age effects (cf.,86). However, it will be important for future work, particularly when focusing on early periods of adolescence to likewise seek large-scale multi-assessment, multi-dataset reproducibility for the specific role of pubertal status in driving executive function development. A further potential limitation arises from our general focus on the average executive function trajectory during adolescence. While we determined that our results were generally robust to multiple participant-level factors, the results of the current work should be interpreted as a normative template and individual and dataset-level variability is expected. Relatedly, while the aggregated datasets and inferences drawn here appear to approximate population patterns from the United States, further work with multinational and multicultural samples is required to determine the generalizability of these results to other countries and cultures. The tools and data developed here can nevertheless provide resources for additional research on deviations from this normative trajectory, promote improved estimates of uncertainty, and ultimately support potential translational efforts seeking to identify clinically relevant executive function-related processes during adolescence.
Methods
Considerations for future work Participants
The identification of common adolescent executive function devel- Data for this project were provided from participants of four existing
opment may guide future translational and multidisciplinary research. projects (all with publicly available data). One internal dataset (Luna
For example, our results suggest that neuroimaging research of ado- Dataset) and three external datasets (National Consortium on Alcohol
lescent executive function may be well served by leveraging multiple & Neurodevelopment in Adolescence36 [NCANDA], Nathan Kline
executive function tasks to examine shared information in association Institute-Rockland Sample35 [NKI], Philadelphia Neurodevelopmental
with brain structure/function or to better isolate domain-specific Cohort37 [PNC]) were included based on (1) their inclusion of executive
effects. Likewise, as has become increasingly common72, translational function tasks performed in a developmental or lifespan dataset
research aiming to uncover adolescent executive function as a possible spanning the entirety of the adolescent period and (2) to aggregate the
predictor or consequence of clinical presentations and/or as a target largest possible dataset to explore the aims of this project. The primary
for intervention, may be best suited to approach executive function focus of the current work was on the adolescent period. To explicitly
from a unitary, domain-general process that follows the canonical capture transitions into and out of adolescence as well as the entire
executive function trajectory revealed here. Methodologically, com- adolescent period33, we included participants ranging from late
mon metrics of domain-general executive function, and normative childhood to adulthood (8–35 years old). Lower (8 years old) and
templates of change (even in scaled units: basis functions) may serve to upper (35 years old) age ranges were selected to be as inclusive as
increase reproducibility by facilitating overlap and replication efforts possible, given the overarching goal of capturing non-linear develop-
across instruments and datasets. mental trajectories, while also ensuring that at least two separate
The current project leveraged multiple large independent data- datasets had participants in each age range. This meant that only
sets, developed methodological improvements permitting the identi- participants from 8–35 years old from the NKI lifespan dataset were
fication of maturational timing of executive function, and investigated included (Full NKI Rockland Sample Range: 6–85). No participants
both common and specific components of executive function pro- were excluded based on age from the other datasets (Luna, NCANDA,
cesses, but nevertheless potential limitations and explicit suggestions PNC), which were designed to assess childhood to adolescence/
for future work should be considered. First, although this investigation adulthood and fully fell within this age range. In order to maximize
used a comprehensive approach to characterizing executive function, generalizability and representation within the datasets (see refs. 66,67
these analyses focused on the most prominent outcome measures for relevant discussion concerning neurodevelopmental studies), no
from these tests. This approach had the advantage of aligning the other participant-level demographic exclusion criteria were applied to
current analyses with predominant practices in the literature and the the datasets. Instead, we thoroughly examined the potential impact of
level of granularity supported by large-scale, public datasets, but such factors in a series of sensitivity analyses (see Supplementary
future work would benefit from alternative and/or model-based, Figs. S2; S13-S17).
computational parameterizations of behavioral performance84,85. Fur- One dataset was drawn from Dr. Beatriz Luna’s longitudinal study
thermore, the breadth of executive functions indexed by these tasks of neurocognitive development (Luna Dataset). From this dataset, the
was not exhaustive, and other domains of individual differences in current project included 196 participants (baseline age-range: 8–30
cognition were not explored. For example, by design, this study, and years old; 101 female participants, 92 male, 2 participants both sexes
many of the original datasets, did not examine executive function tasks were reported, 1 participant unknown/not reported) in an

accelerated longitudinal/cohort sequential design, with participants impaired cognition or motility (see ref. 37 for detailed inclusion
completing a range of follow-up visits (total participant visits = 666, information). Given the large community-based sampling procedure
median number of visits per-participant = 3; range of visits per- of the PNC, this dataset included participants with psychiatric dis-
participant = 1–10; median months between visits = 13.3; range of orders that may be associated with neurocognitive performance. The
months between visits = 5.97–81.73; see Supplementary Fig. S1 for current project followed previous work with this dataset87 regarding
graphical depiction of dataset by visit structure). Exclusion criteria for data inclusion (see below) and sensitivity analyses examined the
this dataset were medical conditions or medications known to affect influence of these participants on the current project’s analyses (see
eye movements and a history of psychiatric disorders, developmental Supplementary Fig. S17).
cognitive disorders, or learning disabilities, in either the participant or In all four datasets, research protocols were approved by the
a first-degree relative, and IQ scores at baseline below 80. Participants relevant institutional review boards (Luna Dataset: University of
were recruited from the community surrounding the University of Pittsburgh; NCANDA: Duke University, Oregon Health and Sciences
Pittsburgh Medical Center. University, SRI International, University of Pittsburgh, University of
The second dataset was drawn from the multi-site, National California San Diego; NKI: Nathan Kline Institute; PNC: The University
Consortium on Alcohol & Neurodevelopment in Adolescence of Pennsylvania and Children’s Hospital of Philadelphia) and partici-
(NCANDA) (see ref. 36 for detailed sampling strategy and recruitment pants over 18 provided informed consent, while participants younger
information). The current project used data from 831 participants than 18 provided written assent and parental consent. To our knowl-
(baseline age-range: 12–21 years old, 423 female participants, 408 male) pants over 18 provided informed consent, while participants younger
in the first five visits of the accelerated longitudinal design (total par- the current analyses, no statistical method was used to predetermine
ticipant visits = 3412, median number of visits per-participant = 5; the included sample size. All four datasets were included in their
range of visits per-participant = 1–5; median months between visits = entirety, apart from analysis-specific exclusions detailed below (Data
12.17; range of months between visits = 4.98–23.97; see Supplemen- Processing). As observational studies, the included experiments were
tary Fig. S1 for graphical depiction of dataset age by visit structure). not randomized. Likewise, no blinding procedures were employed.
Exclusion criteria for NCANDA were Magnetic Resonance Imaging
(MRI) contraindications (e.g., claustrophobia, non-removable metal in Executive function measures
the body), head injury with a significant loss of consciousness, psy- Data from Luna, NCANDA, NKI, and PNC datasets were used in the
chiatric disorders that might influence study completion (e.g., psy- current project based on their inclusion of executive function tasks
chosis), and psychiatric medication (see ref. 36). A central goal of the performed in a developmental or lifespan dataset that spanned the
NCANDA study was to examine the transition to significant substance adolescent period. Classification of executive function tasks was based
use during adolescence and as a result, approximately 50% of the on prior theoretical5 and empirical work15,34,42,88, with a general oper-
dataset was recruited based on subclinical factors thought to increase ationalization of goal-directed cognitive behaviors that encompassed
the likelihood of alcohol use disorder (AUD; see ref. 36). The inclusion processes of inhibition, attention, working memory, switching, or
of participants with psychiatric conditions however was shown to not planning. Where possible, prior work with the included tasks and
substantively influence the current project’s analyses through sensi- datasets and/or test authors34 was used to define whether specific tasks
tivity analyses (see Supplementary Fig. S17). indexed executive function. To avoid potential influences of verbal
The third dataset was drawn from the lifespan Nathan Kline skills potentially related to educational attainment, measures relying
Institute-Rockland Sample (NKI) (see ref. 35 for detailed sampling heavily on reading and language skills were not included (e.g., DKEFS-
strategy and recruitment information). The current project used data Twenty Questions, DKEFS-Proverb Test) as primary executive function
from 588 participants (age range of participants within the included assessments, but the influence of culturally acquired knowledge was
dataset [see above for age rationale]: 8–35 years old; 284 female par- shown to not influence primary results in a sensitivity analysis (Sup-
ticipants, 304 male). The NKI-Rockland Sample includes longitudinal plementary Fig. S16). Wherever possible, both accuracy and latency
follow-up data (up to two visits) on the included tasks here for a very measures were selected, except when precedent from research or
small number of participants (n = 10) within our specified age range. clinical assessment was clear on a predominant use of accuracy (e.g.,
However, given this represented such a small percentage of partici- DKEFS Tower) or latency (e.g., DKEFS Trail Making Test) measures
pants (<2% of dataset) and only included two visits, the current ana- owing to nearly universal ceiling/floor performance of the corre-
lyses only included the first visit from these participants and thus this sponding accuracy/latency measure and/or the corresponding mea-
dataset was utilized as cross-sectional (see Supplementary Fig. S1 for sure was not collected/available. See Supplementary Table S2 for the
histogram of included ages). The NKI-Rockland Sample was recruited conceptualized subdomains of the included executive function tasks
to match the ethnic and economic demographics of Rockland County, based on author consensus and original test descriptions. See Sup-
New York. Consistent with the community sampling approach, a plementary Table S3 for reproducible variable names for public data-
moderate number of participants in the NKI dataset used here sets (NCANDA, NKI, PNC).
(n = 286) met criteria (DSM-IV TR) for at least one lifetime diagnosis of Based on the above criteria, the Luna dataset included twelve
a psychiatric disorder. These factors were shown to not substantively measures from six executive function tasks that were completed at
influence the current project’s analyses that focused on average and each visit: Antisaccade (ANTI), Memory Guided Saccade (MGS), a
aggregate developmental changes in executive function through sen- mixed (MIX) Antisaccade/Visually Guided Saccade/Fixation task,
sitivity analyses (see Supplementary Fig. S17). Cambridge Neuropsychological Test Automated Battery [CANTAB]
The fourth dataset was drawn from the Philadelphia Neurodeve- Delayed Matching to Sample (DMS), CANTAB Spatial Span (SSP),
lopmental Cohort (PNC) (see ref. 37 for detailed sampling strategy and CANTAB Stockings of Cambridge (SOC). Each of these tasks has been
recruitment information). The current project utilized data from 9151 described in detail elsewhere (see for example, refs. 15,44). Scoring
participants in the cross-sectional, PNC dataset (age range: 8–22 years procedures and outcome measures were based on previous work from
old; 4753 female participants, 4365 male, 19 participants both sexes our group and general use in the literature. Briefly, the Antisaccade
were reported, 14 participants unknown/not reported; see Supple- task required participants to inhibit a prepotent response (saccade) to
mentary Fig. S1 for histogram of ages). Exclusion criteria for PNC were a peripheral stimulus (in four possible locations along the horizontal
being non-ambulatory and not in stable health, non-proficiency in meridian) and saccade towards the opposite hemifield. Both accuracy
English, physical and cognitive challenges in participation in interviews (correct response rate across trials) and latency (median speed of
and neurocognitive assessment, and the presence of a disorder that antisaccades on correct trials) of the Antisaccade task were examined.

A second mixed version of the Antisaccade task was also performed, analytic age range (8–35) for the NKI dataset but was not used because
where participants performed an antisaccade but trials with different over two-thirds of the visits did not have this measure (66.82%),
task demands were also interleaved. Specifically, in 1/3rd of trials, whereas all other NKI measures included had at maximum <4%
participants were required to saccade towards the peripheral stimulus missingness.
(visually guided saccade) or, in 1/3rd of trials, simply maintain
fixation. Both accuracy and latency of this mixed version were exam- Data processing
ined, but only calculated for the antisaccade trials (with the same All data processing and statistical analyses were performed in R ver-
scoring procedure as above), given the visually guided saccade is not sion 4.1.2 (2021)90. Luna dataset eye-tracking data was scored with the
thought to rely on executive function (see ref. 15) and the number of same automatic scoring algorithms from our previous work85,91. Scores
fixation errors was included in a different measure that captured this for all other tasks were generated through released software from the
performance in a goal-oriented context (see below). The Memory instrument (e.g., Luna dataset CANTAB) and/or included in official data
Guided Saccade task required participants to saccade towards a per- releases (NCANDA, NKI, PNC datasets).
ipheral stimulus (in four possible locations along the horizontal mer- Aggregated data, either from distributed data releases
idian), remember its location during a subsequent fixation period, and (NCANDA, NKI, PNC) or our in-house database (Luna dataset) were
then saccade towards the remembered location when no stimulus was first screened to ensure each visit (participant at testing session) had
presented. Both accuracy (difference in degrees between initial sac- a valid age, anonymous id variable, and if longitudinal data, visit (i.e.,
cade and the most precise saccade during the final phase85, when no stimulus these variables were not missing and were within the expected range,
was presented) and latency (median speed of the initial saccade during based on the study design) and included expected data. Data that did
the final phase across trials85) of the Memory Guided Saccade task were not meet these minimum criteria were removed from all analyses. As
examined. We also calculated the number of fixation breaks (FIX) in our prior work, eye-tracking tasks in the Luna dataset (specific task
during the middle phase of the memory guided saccade task as a at specific visit) with more than 30% of trials dropped due to poor
putative measure of inhibition. In addition to the three eye movement eye-tracking or missing (i.e., early session termination; cf.,91) were
tasks, the Luna dataset also included the Delayed Matching to Sample, also removed from all analyses. Next, data inclusion criteria were
Spatial Span, and Stockings of Cambridge tasks from the CANTAB used to maximize the included dataset sizes and result general-
Battery, each of which have been broadly used and whose stimuli can izability, while also ensuring no considerable outlier (i.e., 4 standard
be found online (see www.cambridgecognition.com/cantab/). Stan- deviations and more extreme than 99.9% of the distribution) biased
dard accuracy (Delayed Matching to Sample: Percent Correct; Spatial results. Within these procedures, individual executive function
Span: Span Length; Stockings of Cambridge: Problems Solved in measures were first screened for potential univariate leverage points
Minimum Moves) and latency (Delayed Matching to Sample: Median in the association between age and each specific measure within
Correct Latency; Stockings of Cambridge: Mean Initial Thinking Time) general additive models (GAM: see below) or general additive mixed
measures from each of the three CANTAB tasks were examined. For models (GAMM: see below). Leverage points were defined as those
interpretive consistency across measures in the Luna dataset, the observations (measure for participant at testing session) with a
direction of the scoring of two accuracy measures (Memory Guided residual from this model that was four standard deviations above the
Saccade inaccuracy [see above]; Number of Fixation Breaks) was mean and removed from all subsequent analyses. Second, data were
multiplied by −1 to ensure that higher scores indexed better perfor- examined for potential multivariate outliers among all included
mance on all accuracy measures. executive function measures within each dataset using Mahalanobis
The NCANDA, PNC, and NKI datasets used versions of the Uni- distance within the psych package in R92. Sessions (all executive
versity of Pennsylvania Computerized Neurocognitive Battery (CNB; function measures for participant at testing session [i.e., study visit])
https://webcnp.med.upenn.edu/). The current project utilized data with a Mahalanobis distance four standard deviations above the
from three CNB tasks that met our operationalization of executive mean were removed from all subsequent analyses.
function and have been classified as executive by the CNB authors34,
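The univariate screening step described under Data processing can be sketched as follows. This is a minimal illustration in Python, not the R code used for the analyses: the function name flag_leverage_points and the example residuals are hypothetical, and the residuals are assumed to come from an age model that has already been fit.

```python
from statistics import mean, stdev

def flag_leverage_points(residuals, n_sd=4.0):
    """Return indices of residuals more than n_sd standard deviations
    above the mean residual (one-sided, following the wording of the text)."""
    m = mean(residuals)
    s = stdev(residuals)
    return [i for i, r in enumerate(residuals) if (r - m) / s > n_sd]

# Hypothetical residuals: 40 small values plus one extreme observation.
residuals = [0.1 * ((i % 5) - 2) for i in range(40)] + [5.0]

flagged = flag_leverage_points(residuals)
kept = [r for i, r in enumerate(residuals) if i not in flagged]
```

The same thresholding idea applies to the multivariate step, with the Mahalanobis distance of each session taking the place of the residual.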
the Penn Conditional Exclusion Test (PCET), a Penn N-Back Test (PNBK; Data analysis
NCANDA: Penn Short Fractal N-back Test [PNB-F]; PNC & NKI: Penn General additive models. General additive models (cross-sectional
Letter N-Back Test [PNB-L]), and the Penn Continuous Performance data: PNC dataset) and general additive mixed models (longitudinal
Test: Number and Letter version (PCPT). Standard outcome measures data: Luna, NCANDA, NKI datasets) with penalized thin plate
for each task were included for accuracy (PCET: calculated accuracy regression splines via the mgcv package41 were used to quantify non-
measure [PCET ACC2]; PNB-F: true positive [correct] responses for linear associations between age and executive function measures.
1-back and 2-back trials; PNB-L: true positive [correct] responses for Primary cross-sectional analyses (NKI, PNC) utilized a simple bivariate
1-back and 2-back trials; PCPT: sum of true positives for number and model examining the smoothed association between age (the inde-
letter trials) and latency (PCET: median response time for correct pendent variable) and executive function (the outcome measure).
responses; PNB-F: mean of median response time for 1-back and 2-back Primary longitudinal analyses (Luna, NCANDA) additionally included a
trials; PNB-L: mean of median responses for 1-back and 2-back trials, smoothed term for visit number to account for potential non-
PCPT: median response time for correct response to number trials and developmental visit effects (e.g., practice: see Supplementary S9)
letter trials). The NCANDA dataset also included a standard Stroop and per-participant random intercepts and age slopes via mgcv GAMM.
Test (STRP), where the primary measure of average latency over all MGCV defaults were used for all parametrization with the exception
correct trials was included. The NKI dataset also included four that the maximum basis dimension for visit number in the NCANDA
executive function tasks from the Delis-Kaplan Executive Function dataset was adjusted from 10 (the default) to 5 (given there were
System43 (D-KEFS) that were included in the current study: color-word maximally five visits in this analysis dataset). Age-related fits from
interference (CWI), design fluency (DFL), tower (TOW), and the trail- these primary GAM/GAMM models are presented in Fig. 1. Pointwise
making test (TMT). Again, standard outcome measures were used for confidence intervals (displayed in Fig. 5) were generated by multi-
these tasks (CWI latency: average of inhibition and inhibition/switch- plying standard error estimates from the mgcv GAM/GAMM predict
ing conditions; correlation amongst these measures: r = 0.806; DFL function by 2 and both adding this to and subtracting it from the predicted fit estimate. Sen-
Accuracy88,89: switching total correct; TOW: Total Achievement Score sitivity analyses (Supplementary S13–S17) examining socio-
Total Raw; TMT: Number-Letter Switching). The DKEFS Sort Task was demographic and cognitive covariates followed the same procedures,
also available for a small percentage of participant visits within our with continuous variables (e.g., parental education) modeled as

smooth terms and categorical variables (e.g., biological sex) modeled As in primary analyses, the relationship between age and each mea-
as parametric terms. sure was modeled with penalized splines. For each model (A–C), the
percent of deviance explained in age was extracted (following stan-
Periods of growth and maturational timing. As in previous develop- dard estimation in mgcv GAM model). Next, the incremental
mental research in different domains49,50,87, periods of significant age- deviance of age explained by measurex_i over composite metric M ∌ x
related change (age ranges) were defined by estimating the first deri- was computed. Finally, the resulting measure specific age-related
vative (finite differences method) in 1/10th of a year intervals of GAM deviance was scaled to the original deviance estimate for the specific
fits and performing a posterior simulation based on the GAM/GAMM measure (model A) to create a percent of the original measure’s age
model coefficients. Simultaneous (used given the multiple testing) effect. The remaining percentage of model A’s deviance was assigned
confidence intervals (CI) were generated with the gratia package51 with as the domain-general percentage. To ensure consistent interpret-
10,000 simulations. Age ranges in which the simultaneous 95% CI did ability of the directionality of composite metric M ∌ x, measures from
not include zero (p < 0.05) were classified as significant. Using this the opposing response type were sign flipped (e.g., latency sign
method, raster plots in Fig. 2 display color (red or blue) when the flipped before creating equally weighted composite with accuracy
derivative is significant and white when the derivative is not significant. measures). Sensitivity analysis examined the influence of the com-
An aggregate analysis, pointwise three-level meta-analysis, incorpor- posite measure’s precision in the estimation of domain-general
ating all measures from all datasets was performed using the metafor accounts of age-related differences in executive function (see Sup-
package93 with effects nested in measure and study. A cross-dataset plementary Fig. S10).
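One reading of the domain-general versus domain-specific partition is given below as a short arithmetic sketch. The function name partition_age_effect and the deviance values are hypothetical. The sketch assumes the incremental deviance is the difference between model C and model B, expressed relative to model A.

```python
def partition_age_effect(dev_a, dev_b, dev_c):
    """Split a measure's age effect into domain-specific and domain-general parts.

    dev_a: deviance explained by the specific measure alone (model A)
    dev_b: deviance explained by the out-of-subdomain composite alone (model B)
    dev_c: deviance explained by measure and composite together (model C)
    Returns (specific_pct, general_pct) as percentages of model A's deviance.
    """
    incremental = dev_c - dev_b          # what the measure adds beyond the composite
    specific_pct = 100.0 * incremental / dev_a
    general_pct = 100.0 - specific_pct   # remainder assigned to the domain-general process
    return specific_pct, general_pct

# Hypothetical proportions of deviance explained, for illustration only.
specific, general = partition_age_effect(dev_a=0.20, dev_b=0.30, dev_c=0.32)
```

With these illustrative inputs, most of the measure's age effect is attributed to the shared composite, which is the pattern the text describes qualitatively.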
label was used to nest measures from the same tasks (e.g., Penn CNB)
across datasets. As in prior methodological work on point-wise meta- Normative maturational templates of age-related differences in
analysis with GAMs94, meta-analytic estimates were computed across a executive function. We used basis function regression with cross-
common span of the independent variable: here, 1/10th year age bins, validation (“leave one dataset out”) to determine whether normative
following linear interpolation of GAM/GAMM first derivatives. The maturational templates of executive function could improve devel-
same pointwise, three-level meta-analytic approach was used in opmental inferences in new datasets and measures. A diagram of this
aggregate analysis of GAM/GAMM fits in Fig. 5 and Supplementary procedure is likewise presented in Supplementary Fig. S11. In each
Fig. S5. Secondary analyses that used an effect size threshold to define iteration of the procedure, three (out of four) datasets were used to
maturation scaled the GAM/GAMM fits from 0 (min) to 1 (max) to generate canonical executive function trajectories for accuracy and
determine the percentage of total age-related change that had occurred for latency measures (measures aggregated across datasets via a point-
each age (see Supplementary Fig. S5). wise three-level meta-analysis of GAM/GAMM age fits). The resulting
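The derivative and scaling steps of the maturational timing analysis can be illustrated with a short Python sketch. It is not the gratia-based procedure itself: the saturating curve stands in for a fitted GAM, the simultaneous confidence bounds are taken as given inputs rather than simulated, and all function names are hypothetical.

```python
import math

def finite_difference(ages, fit):
    """Forward finite-difference estimate of the first derivative of a fitted curve."""
    return [(fit[i + 1] - fit[i]) / (ages[i + 1] - ages[i])
            for i in range(len(fit) - 1)]

def significant(lower, upper):
    """True where a confidence interval for the derivative excludes zero."""
    return [lo > 0 or hi < 0 for lo, hi in zip(lower, upper)]

def scale_min_max(fit):
    """Rescale a fitted curve to run from 0 (minimum) to 1 (maximum)."""
    lo, hi = min(fit), max(fit)
    return [(v - lo) / (hi - lo) for v in fit]

# Hypothetical fitted curve on a 1/10th-year grid from 8 to 35 years.
ages = [8 + 0.1 * i for i in range(271)]
fit = [1 - math.exp(-(a - 8) / 4) for a in ages]

deriv = finite_difference(ages, fit)
scaled = scale_min_max(fit)

# First age at which 90% of the total age-related change has occurred
# (the 90% threshold is an arbitrary choice for this illustration).
age_90 = next(a for a, s in zip(ages, scaled) if s >= 0.9)
```

In the analysis described above, the lower and upper bounds passed to a check like significant would come from the posterior simulation rather than from a fixed margin.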
output was then smoothed (via a subsequent GAM model), inter-
Interdependence of performance across executive function tasks. polated to the ages of the test (“left out”) dataset, and fit as a single age
Cross-sectional and longitudinal correlations (linear, bivariate) were parameter to each accuracy and latency measure of the left out dataset
computed among executive function measures in each dataset and compared to typical age models (age + age², inverse age [1/age],
(Fig. 3A). For longitudinal datasets (Luna, NCANDA), baseline refers to linear age [age]) as well as an intercept only (no age) model. Potential
the first visit, longitudinal refers to the pooled within-person correla- age models were evaluated with multiple metrics of model fit and
tion via disaggregation with the statsBy function in the psych package complexity via the performance package in R96 (longitudinal models
in R. This approach was chosen to balance interpretability with model [Luna, NCANDA]: R², adjusted R², Intraclass Correlation Coefficient
complexity for the accelerated longitudinal designs of Luna and [ICC], Root Mean Square Error [RMSE], residual standard deviation
NCANDA datasets. Aggregate analysis (“all measures”) in Fig. 3A uti- [Sigma], Akaike’s Information Criterion [AIC], Bayesian Information
lized a three-level meta-analysis via metafor with correlation pairs Criterion [BIC]; cross-sectional models [NKI, PNC]: R², adjusted R²,
nested in task pairs and datasets. Exploratory factor analysis (Fig. 3B) RMSE, Sigma, AIC, BIC). An additional sensitivity analysis explored the
via maximum likelihood method and a bifactor rotation was per- influence of the exact developmental timing of the developmental
formed with the psych package in R from between- (Luna and NCANDA function with a similar procedure that offset in years (earlier or later)
baseline and NKI, PNC datasets) and within-person correlation matri- the canonical executive function trajectory (see Supplementary
ces (Luna and NCANDA longitudinal). Multiple data-driven thresholds Fig. S12).
for the number of extracted factors (Fig. 3C) were examined via par-
allel analysis and the nScree function in the nFactors R package95 (95% Reporting summary
CI from parallel analysis, factor analytic Kaiser rule, optimal coordi- Further information on research design is available in the Nature
nate, acceleration factor). Portfolio Reporting Summary linked to this article.
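The pooled within-person correlation used for the longitudinal datasets can be sketched as follows. This is an illustration of disaggregation by person-mean centering, written in Python with hypothetical names and data. It is not a reimplementation of the statsBy function, whose weighting details may differ.

```python
from collections import defaultdict
from math import sqrt

def within_person_correlation(ids, x, y):
    """Pearson correlation of x and y after removing each participant's own mean,
    so that only visit-to-visit (within-person) covariation remains."""
    groups = defaultdict(list)
    for pid, xi, yi in zip(ids, x, y):
        groups[pid].append((xi, yi))
    dx, dy = [], []
    for visits in groups.values():
        mx = sum(v[0] for v in visits) / len(visits)
        my = sum(v[1] for v in visits) / len(visits)
        dx += [v[0] - mx for v in visits]
        dy += [v[1] - my for v in visits]
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical data: two participants, three visits each. Within each person
# the two measures rise together, although person "b" scores higher on x and
# lower on y than person "a".
ids = ["a", "a", "a", "b", "b", "b"]
x = [1, 2, 3, 11, 12, 13]
y = [10, 11, 12, 0, 1, 2]

r_within = within_person_correlation(ids, x, y)
```

Centering on each participant's mean is what separates the within-person association from between-person differences, which can point in the opposite direction, as in this toy example.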
Contributions from domain-general versus specific processes to
age-related differences in executive function were determined via Data availability
model comparison that is also presented with the same description as This project used publicly available data for all analyses. Deidentified
well as additional visualization in Supplementary Fig. S8. To maximize data for all datasets used in this project are available in public reposi-
comparability across studies and to prevent bias from shared, non- tories pending appropriate data use agreements. Luna sample:
executive function visit effects (e.g., practice effects; see Sensitivity nda.nih.gov/edit_collection.html?id=2831. NCANDA: ncanda.org (Release 4Y
Analyses and Supplementary S9) analyses here were performed with V02). NKI: fcon_1000.projects.nitrc.org/indi/enhanced/. PNC:
cross-sectional data, although results are consistent with longitudinal ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v3.p2. The
data (cf., within-person factor structure in Fig. 3B, D). data supporting the individual figures are provided in the Source Data
First, three GAM models were fit for each dataset for each Files. Summary data for the canonical executive function trajectory have
measure assessing the relationship between age and the specific been made available at
measure i from subdomain x (measurex_i): model A, a composite https://github.com/tervoclemmensb/Executive_Function_Charting. Source data are provided with this paper.
metric created from all measures not in the same putative sub-
domain as measurex_i: composite metric M ∌ x, where M ∌ x represents Code availability
the set (M) of executive function measures that do not contain Analysis code for the current project is available at
measures from subdomain x: model B, and a model where age is https://github.com/tervoclemmensb/Executive_Function_Charting. Tervo-Clemmens, B., A
estimated from both measurex_i and composite metric M ∌ x: model C. Canonical Trajectory of Executive Function Maturation from

Adolescence to Adulthood, Executive Function Charting, 22. Passler, M. A., Isaac, W. & Hynd, G. W. Neuropsychological
https://doi.org/10.5281/zenodo.8302417, 2023. development of behavior attributed to frontal lobe
functioning in children. Developmental Neuropsychol. 1,
References 349–370 (1985).
1. Luna, B., Marek, S., Larsen, B., Tervo-Clemmens, B. & Chahal, R. An 23. Cromer, J. A., Schembri, A. J., Harel, B. T. & Maruff, P. The nature and
integrative model of the maturation of cognitive control. Annu. Rev. rate of cognitive maturation from late childhood to adulthood.
Neurosci. 38, 151–170 (2015). Front. Psychol. 6, 704 (2015).
2. Blakemore, S.-J. & Mills, K. L. Is adolescence a sensitive period for 24. Luciana, M., Conklin, H. M., Hooper, C. J. & Yarger, R. S. The
sociocultural processing? Annu. Rev. Psychol. 65, 187–207 (2014). development of nonverbal working memory and executive control
3. Larsen, B. & Luna, B. Adolescence as a neurobiological critical processes in adolescents. Child Dev. 76, 697–712 (2005).
period for the development of higher-order cognition. Neurosci. 25. Best, J. R. & Miller, P. H. A developmental perspective on executive
Biobehav. Rev. 94, 179–195 (2018). function. Child Dev. 81, 1641–1660 (2010).
4. Shulman, E. P. et al. The dual systems model: review, reappraisal, 26. Spear, L. P. The adolescent brain and age-related behavioral man-
and reaffirmation. Dev. Cogn. Neurosci. 17, 103–117 (2016). ifestations. Neurosci. Biobehav. Rev. 24, 417–463 (2000).
5. Jurado, M. B. & Rosselli, M. The elusive nature of executive func- 27. Sawyer, S. M., Azzopardi, P. S., Wickremarathne, D. & Patton, G. C.
tions: a review of our current understanding. Neuropsychol. Rev. 17, The age of adolescence. Lancet Child Adolesc. Health 2,
213–233 (2007). 223–228 (2018).
6. Denckla, M. B. A theory and model of executive function: a neu- 28. Dahl, R. E. & Hariri, A. R. Lessons from G. Stanley Hall: Connecting
ropsychological perspective. In: Attention, memory, and executive new research in biological sciences to the study of adolescent
function, pp. 263–278 (1996). development. J. Res. Adolesc. 15, 367–382 (2005).
7. Casey, B. J., Getz, S. & Galvan, A. The adolescent brain. Dev. Rev. 28, 29. Organization, W. H. Young people’s health-a challenge for society:
62–77 (2008). report of a WHO Study Group on Young People and” Health for All
8. Steinberg, L. A dual systems model of adolescent risk-taking. Dev. by the Year 2000”[meeting held in Geneva from 4 to 8 June 1984].
Psychobiol. 52, 216–224 (2010). (World Health Organization, 1986).
9. Luna, B. & Wright, C. Adolescent brain development: Implications 30. Wohlwill, J. F. The age variable in psychological research. Psychol.
for the juvenile criminal justice system. In: APA handbook of psy- Rev. 77, 49–64 (1970).
chology and juvenile justice, pp. 91–116 (2016). 31. Robinson, K., Schmidt, T. & Teti, D. M. Issues in the use of long-
10. Tervo-Clemmens, B., Musket, C. W., Calabro, F. J. & Luna, B. Ado- itudinal and cross-sectional designs. In: Handbook of research
lescent neurocognitive development and cannabis use. In: Factors methods in developmental science, pp 1–20 (2005).
affecting neurodevelopment, 537–550 (Elsevier, 2021). 32. Kail, R. V. & Ferrer, E. Processing speed in childhood and adoles-
11. Willoughby, T., Heffer, T., Good, M. & Magnacca, C. Is adolescence cence: longitudinal models for examining developmental change.
a time of heightened risk taking? An overview of types of risk-taking Child Dev. 78, 1760–1770 (2007).
behaviors across age groups. Dev. Rev. 61, 100980 (2021). 33. Luna, B., Tervo-Clemmens, B. & Calabro, F. J. Considerations when
12. Tervo-Clemmens, B., Quach, A., Calabro, F. J., Foran, W. & Luna, B. characterizing adolescent neurocognitive development. Biol. Psy-
Meta-analysis and review of functional neuroimaging differences chiatry 89, 96–98 (2021).
underlying adolescent vulnerability to substance use. NeuroImage 34. Gur, R. C. et al. Age group and sex differences in performance on a
209, 116476 (2020). computerized neurocognitive battery in children age 8- 21. Neu-
13. Kessler, R. C. et al. Lifetime prevalence and age-of-onset distribu- ropsychology 26, 251 (2012).
tions of DSM-IV disorders in the National Comorbidity Survey 35. Nooner, K. B. et al. The NKI-Rockland sample: a model for accel-
Replication. Arch. Gen. Psychiatry 62, 593–602 (2005). erating the pace of discovery science in psychiatry. Front. Neurosci.
14. Steinberg, L. The influence of neuroscience on US Supreme Court 6, 152 (2012).
decisions about adolescents’ criminal culpability. Nat. Rev. Neu- 36. Brown, S. A. et al. The National Consortium on Alcohol and Neu-
rosci. 14, 513–518 (2013). roDevelopment in Adolescence (NCANDA): a multisite study of
15. Luna, B., Garver, K. E., Urban, T. A., Lazar, N. A. & Sweeney, J. A. adolescent development and substance use. J. Stud. alcohol drugs
Maturation of cognitive processes from late childhood to adult- 76, 895–908 (2015).
hood. Child Dev. 75, 1357–1372 (2004). 37. Calkins, M. E. et al. The Philadelphia Neurodevelopmental Cohort:
16. Demetriou, A. et al. The development of mental processing: effi- constructing a deep phenotyping collaborative. J. Child Psychol.
ciency, working memory, and thinking. Monogr. Soc. Res. Child Psychiatry 56, 1356–1369 (2015).
Dev. 67, 1–55 (2002). 38. Schönbrodt, F. D. & Perugini, M. At what sample size do correlations
17. McArdle, J. J., Ferrer-Caja, E., Hamagami, F. & Woodcock, R. W. stabilize? J. Res. Personal. 47, 609–612 (2013).
Comparative longitudinal structural analyses of the growth and 39. Marek, S. et al. Reproducible brain-wide association studies require
decline of multiple intellectual abilities over the life span. Dev. thousands of individuals. Nature 603, 654–660 (2022).
Psychol. 38, 115 (2002). 40. Arnett, J. J. Emerging adulthood: What is it, and what is it good for?
18. Moffitt, T. E. et al. A gradient of childhood self-control predicts Child Dev. Perspect. 1, 68–73 (2007).
health, wealth, and public safety. Proc. Natl Acad. Sci. 108, 41. Wood, S. mgcv: Mixed GAM computation vehicle with GCV/AIC/
2693–2698 (2011). REML smoothness estimation. University of BATH (2012).
19. Quach, A. et al. Adolescent development of inhibitory control and 42. Miyake, A. et al. The unity and diversity of executive functions and
substance use vulnerability: a longitudinal neuroimaging study. their contributions to complex “frontal lobe” tasks: a latent variable
Dev. Cogn. Neurosci. 42, 100771 (2020). analysis. Cogn. Psychol. 41, 49–100 (2000).
20. Ordaz, S. J., Foran, W., Velanova, K. & Luna, B. Longitudinal growth 43. Delis, D. C., Kaplan, E. & Kramer, J. H. Delis-Kaplan executive
curves of brain function underlying inhibitory control through function system. (2001).
adolescence. J. Neurosci. 33, 18109–18124 (2013). 44. De Luca, C. R. et al. Normative data from the CANTAB. I: develop-
21. Anderson, V., Northam, E. & Wrennall, J. Developmental neu- ment of executive function over the lifespan. J. Clin. Exp. Neu-
ropsychology: a clinical approach. (Routledge, 2018). ropsychol. 25, 242–254 (2003).


45. Moore, T. M., Reise, S. P., Gur, R. E., Hakonarson, H. & Gur, R. C. Psychometric properties of the penn computerized neurocognitive battery. Neuropsychology 29, 235 (2015).
46. Duncan, J., Johnson, R., Swales, M. & Freer, C. Frontal lobe deficits after head injury: unity and diversity of function. Cogn. Neuropsychol. 14, 713–741 (1997).
47. Henrich, J., Heine, S. J. & Norenzayan, A. The weirdest people in the world? Behav. Brain Sci. 33, 61–83 (2010).
48. Wood, S. N. On p-values for smooth components of an extended generalized additive model. Biometrika 100, 221–228 (2013).
49. Bridgwater, M. et al. Developmental influences on symptom expression in antipsychotic-naïve first-episode psychosis. Psychol. Med. 52, 1698–1709 (2020).
50. Calabro, F. J., Murty, V. P., Jalbrzikowski, M., Tervo-Clemmens, B. & Luna, B. Development of hippocampal–prefrontal cortex interactions through adolescence. Cereb. Cortex 30, 1548–1558 (2020).
51. Simpson, G. L. & Singmann, H. R Package: gratia. Ggplot-based graphics and other useful functions for GAMs fitted using Mgcv, 0.1-0 (Ggplot-based graphics and utility functions for working with GAMs fitted using the mgcv package) (2018).
52. Tucker-Drob, E. M. Global and domain-specific changes in cognition throughout adulthood. Dev. Psychol. 47, 331 (2011).
53. Deater-Deckard, K. & Mayr, U. Cognitive change in aging: identifying gene–environment correlation and nonshared environment mechanisms. J. Gerontol. Ser. B: Psychol. Sci. Soc. Sci. 60, 24–31 (2005).
54. Kuczmarski, R. J. CDC growth charts: United States. (US Department of Health and Human Services, Centers for Disease Control and …, 2000).
55. Friston, K. J. Models of brain function in neuroimaging. Annu. Rev. Psychol. 56, 57–87 (2005).
56. Noble, K. G. et al. Family income, parental education and brain structure in children and adolescents. Nat. Neurosci. 18, 773–778 (2015).
57. Engelhardt, L. E., Church, J. A., Paige Harden, K. & Tucker-Drob, E. M. Accounting for the shared environment in cognitive abilities and academic achievement with measured socioecological contexts. Dev. Sci. 22, e12699 (2019).
58. Sullivan, E. V. et al. Effects of prior testing lasting a full year in NCANDA adolescents: contributions from age, sex, socioeconomic status, ethnicity, site, family history of alcohol or drug abuse, and baseline performance. Dev. Cogn. Neurosci. 24, 72–83 (2017).
59. Romer, D. Adolescent risk taking, impulsivity, and brain development: Implications for prevention. Dev. Psychobiol. 52, 263–276 (2010).
60. Dennis, M. et al. The Cannabis Youth Treatment (CYT) Study: main findings from two randomized trials. J. Subst. Abus. Treat. 27, 197–213 (2004).
61. Skiba, D., Monroe, J. & Wodarski, J. S. Adolescent substance use: reviewing the effectiveness of prevention strategies. Soc. work 49, 343–353 (2004).
62. Diamond, A. & Lee, K. Interventions shown to aid executive function development in children 4 to 12 years old. Science 333, 959–964 (2011).
63. Ernst, M. & Fudge, J. L. A developmental neurobiological model of motivated behavior: anatomy, connectivity and ontogeny of the triadic nodes. Neurosci. Biobehav. Rev. 33, 367–382 (2009).
64. Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
65. Marquand, A. F. et al. Conceptualizing mental disorders as deviations from normative functioning. Mol. Psychiatry 24, 1415–1424 (2019).
66. LeWinn, K. Z., Sheridan, M. A., Keyes, K. M., Hamilton, A. & McLaughlin, K. A. Sample composition alters associations between age and brain structure. Nat. Commun. 8, 1–14 (2017).
67. Cosgrove, K. T. et al. Limits to the generalizability of resting-state functional magnetic resonance imaging studies of youth: an examination of ABCD Study® baseline data. Brain Imaging Behav. 16, 1919–1925 (2022).
68. Miyake, A. & Friedman, N. P. The nature and organization of individual differences in executive functions: four general conclusions. Curr. Dir. Psychol. Sci. 21, 8–14 (2012).
69. Cirino, P. T. et al. A framework for executive function in the late elementary years. Neuropsychology 32, 176 (2018).
70. Blair, C., Zelazo, P. D. & Greenberg, M. T. The measurement of executive function in early childhood. Dev. Neuropsychol. 28, 561–571 (2005).
71. Blair, C. Developmental science and executive function. Curr. Dir. Psychol. Sci. 25, 3–7 (2016).
72. Gur, R. C. et al. Neurocognitive growth charting in psychosis spectrum youths. JAMA Psychiatry 71, 366–374 (2014).
73. Willcutt, E. G., Doyle, A. E., Nigg, J. T., Faraone, S. V. & Pennington, B. F. Validity of the executive function theory of attention-deficit/hyperactivity disorder: a meta-analytic review. Biol. Psychiatry 57, 1336–1346 (2005).
74. Hackman, D. A., Gallop, R., Evans, G. W. & Farah, M. J. Socioeconomic status and executive function: developmental trajectories and mediation. Dev. Sci. 18, 686–702 (2015).
75. Lawson, G. M., Hook, C. J. & Farah, M. J. A meta-analysis of the relationship between socioeconomic status and executive function performance among children. Dev. Sci. 21, e12529 (2018).
76. Zhang, Z. et al. Neural substrates of the executive function construct, age-related changes, and task materials in adolescents and adults: ALE meta-analyses of 408 fMRI studies. Dev. Sci. 24, e13111 (2021).
77. Fu, Z. et al. The geometry of domain-general performance monitoring in the human medial frontal cortex. Science 6, eabm9922 (2021).
78. Jiang, L. et al. Bayesian multivariate sparse functional principal components analysis with application to longitudinal microbiome multiomics data. Ann. Appl. Stat. 16, 2231–2249 (2022).
79. Tucker-Drob, E. M. et al. A strong dependency between changes in fluid and crystallized abilities in human cognitive aging. Sci. Adv. 8, eabj2422 (2022).
80. Friedman, N. P. et al. Not all executive functions are related to intelligence. Psychol. Sci. 17, 172–179 (2006).
81. Volkow, N. D. et al. The conception of the ABCD study: from substance use to a broad NIH collaboration. Dev. Cogn. Neurosci. 32, 4–7 (2018).
82. Burgess, P. W. Theory and methodology in executive function research. In: Methodology of frontal and executive function 87–121 (Routledge, 2004).
83. Germine, L., Strong, R. W., Singh, S. & Sliwinski, M. J. Toward dynamic phenotypes and the scalable measurement of human behavior. Neuropsychopharmacology 46, 209–216 (2021).
84. Weigard, A. et al. Cognitive modeling informs interpretation of go/no-go task-related neural activations and their links to externalizing psychopathology. Biol. Psychiatry. Cogn. Neurosci. Neuroimaging 5, 530–541 (2020).
85. Montez, D. F., Calabro, F. J. & Luna, B. The expression of established cognitive brain states stabilizes with working memory development. Elife 6, e25606 (2017).
86. Wierenga, L. M. et al. Unraveling age, puberty and testosterone effects on subcortical brain development across adolescence. Psychoneuroendocrinology 91, 105–114 (2018).
87. Larsen, B. et al. Longitudinal development of brain iron is linked to cognition in youth. J. Neurosci. 40, 1810–1818 (2020).
88. Callahan, B. L., Plamondon, A., Gill, S. & Ismail, Z. Contribution of vascular risk factors to the relationship between ADHD symptoms and cognition in adults and seniors. Sci. Rep. 11, 1–11 (2021).


89. Suchy, Y., Kraybill, M. L. & Larson, J. C. G. Understanding design fluency: Motor and executive contributions. J. Int. Neuropsychol. Soc. 16, 26–37 (2010).
90. Team, R. C. R: a language and environment for statistical computing. http://www.R-project.org/ (2013).
91. Tervo-Clemmens, B. et al. Neural correlates of rewarded response inhibition in youth at risk for problematic alcohol use. Front. Behav. Neurosci. 11, 205 (2017).
92. Revelle, W. & Revelle, M. W. Package ‘psych’. Compr. R. Arch. Netw. 337, 338 (2015).
93. Viechtbauer, W. & Viechtbauer, M. W. Package ‘metafor’. The Comprehensive R Archive Network. Package ‘metafor’. http://cran.r-project.org/web/packages/metafor/metafor.pdf (2015).
94. Sørensen, Ø. et al. Meta-analysis of generalized additive models in neuroimaging studies. NeuroImage 224, 117416 (2021).
95. Raiche, G., Magis, D. & Raiche, M. G. Package ‘nFactors’. Repository CRAN, 1–58 (2020).
96. Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P. & Makowski, D. performance: An R package for assessment, comparison and testing of statistical models. J. Open Source Softw. 6, 3139 (2021).

Acknowledgements
This work was supported by the National Institutes of Health: R03MH113090 (Calabro, Luna), R01MH067924 (Luna), an American Psychological Foundation Visionary Grant (Tervo-Clemmens), and the Staunton Farm Foundation (Luna).

Author contributions
Conception: B.T.-C., B.L. Design: B.T.-C., F.J.C., A.C.P., B.L. Data acquisition, analysis, and interpretation: B.T.-C., F.J.C., A.C.P., J.F., W.F., B.L. Manuscript writing, revising: B.T.-C., F.J.C., A.C.P., J.F., W.F., B.L.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41467-023-42540-8.

Correspondence and requests for materials should be addressed to Brenden Tervo-Clemmens.

Peer review information Nature Communications thanks James Ogilvie and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Reprints and permissions information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2023
