Introduction

An essential feature of human cognition is the accurate perception of the multiple sensory inputs coming from the environment, requiring our brain to encode stimuli from different sensory modalities on the basis of their spatiotemporal features (Spence, 2007; Stein & Stanford, 2008). The temporal characteristics of sensory stimuli drive multisensory integration processes, determining the functional integration of sensory inputs having the same origin, and the adaptive segregation of information coming from separate events/objects (Murray et al., 2016; Pasqualotto et al., 2016; Stevenson et al., 2013).

In the case of audiovisual (AV) temporal perception, visual and auditory sensory inputs are transmitted from the external environment to sensory cortical areas at different propagation speeds, requiring different sensory and neural processing times (for example, see Recanzone, 2009; Vroomen & Keetels, 2010). A certain degree of temporal tolerance of AV asynchrony has a functional value for constructing coherent sensory percepts (Wallace & Stevenson, 2014), reflecting a fundamental aspect of functional multisensory integration. Indeed, even when AV stimuli are not physically synchronous, they are perceptually bound when presented within a limited temporal range, a process that is operationalized in the construct of the “temporal binding window” (TBW; Colonius & Diederich, 2004; Noel et al., 2016; Stevenson et al., 2017). The TBW is a probabilistic measure, indexing the likelihood that two sensory stimuli are perceived as a single percept across a range of stimulus onset asynchronies (SOAs). Interestingly, recent evidence has shown that occipital brain oscillations in the alpha band (8–12 Hz) represent a key neural mechanism orchestrating the temporal integration/segregation of sensory stimuli, as the alpha cycle would reflect the temporal unit of AV temporal perception (Bastiaansen et al., 2020; Cecere et al., 2015; Keil & Senkowski, 2018; Cooke et al., 2019; Ronconi et al., 2023). The current idea is that two sensory inputs are integrated when they fall into the same alpha cycle. On the contrary, two sensory inputs are segregated when presented in different alpha cycles, so that faster alpha oscillations, with their shorter cycles, lead to higher temporal acuity (Samaha & Postle, 2015; Bastiaansen et al., 2020; Cecere et al., 2015; Migliorati et al., 2020).
To test the causal link between alpha oscillations and TBWs, a few studies have used noninvasive brain stimulation (NIBS) techniques, showing that stimulation at faster or slower alpha frequencies can respectively shrink or expand the width of the TBW, thereby modulating AV temporal perception (Cecere et al., 2015; Venskus et al., 2021).

Interestingly, similarly to what happens with transcranial alternating current stimulation (tACS), it has been shown that rhythmic sensory stimulation can act as “sensory entrainment”: endogenous brain rhythms align their phase to the external rhythm and start to oscillate at the stimulation frequency, leading to resonance phenomena in neural and perceptual activity (De Graaf et al., 2013; Lakatos et al., 2019; Ronconi et al., 2016a, b; Spaak et al., 2014; for a review, see Haegens & Golumbic, 2018). Regarding the use of sensory entrainment to shape temporal perception, the results are currently mixed. In a first study by Ronconi and Melcher (2017) in the unisensory (visual) domain, alpha sensory entrainment improved temporal segregation compared with entrainment at other frequencies (i.e., theta, beta), but the comparison between slower and faster alpha showed an opposite (though nonsignificant) trend compared with previous tACS findings (Cecere et al., 2015; Venskus et al., 2021). A subsequent study showed that AV stimulation at the upper boundary of the alpha band improves segregation of visual stimuli, but only for a short period after stimulation offset (Ronconi et al., 2018). Although this evidence suggests that audiovisual entrainment is at least partly effective in modulating temporal perception in the unisensory (visual) domain, whether alpha-band sensory stimulation can shape the width of the AV TBW remains unknown.

Visual and auditory signals require different neural processing times (Noel et al., 2016; Vroomen & Keetels, 2010), and these temporal differences shape the complex dynamics through which AV stimuli interact to be temporally integrated or segregated (Recanzone, 2009). Visual and auditory stimuli induce a “phase reset” of neural oscillations within (i.e., from visual stimuli to visual areas; Landau & Fries, 2012) and also across sensory modalities (i.e., from auditory stimuli to the visual areas; Fiebelkorn et al., 2013; Romei et al., 2012), enhancing the oscillatory activity in the primary sensory cortices of other sensory modalities (Ghazanfar & Schroeder, 2006; Senkowski et al., 2008; for a review see Bauer et al., 2020). Moreover, the cross-modal phase reset dynamics underlying the binding of AV stimuli are strongly influenced by the leading sense (Thorne & Debener, 2014; Cecere et al., 2016). When auditory stimuli precede visual ones (auditory leading; AL), the auditory stimulus resets the phase of the brain oscillations in the visual cortex to anticipate the imminent presentation of the visual input, ultimately increasing the efficiency of AV integration/segregation processes (i.e., a narrow and less malleable TBW). On the contrary, the visual leading (VL) condition might be driven by higher-level prediction mechanisms, resulting in an increased perception of synchrony of the AV stimulus pairs (i.e., a large and more plastic TBW; Thorne & Debener, 2014; Cecere et al., 2016, 2017).

Another critical point to consider is that, unlike previous tACS studies that investigated the “online” effects of alpha-band stimulation on AV temporal acuity (Cecere et al., 2015; Venskus et al., 2021), sensory entrainment can be administered trial by trial for a short prestimulus period before the onset of the AV target pairs, making it possible to investigate the “offline” effects on the AV TBW for a short period after stimulation offset.

In the present study, we employed an online version of a Simultaneity Judgment (SJ) task, a well-established experimental paradigm for measuring TBWs (Roach et al., 2011; Vatakis et al., 2008; Zampini et al., 2005), to investigate the potential modulations of the AV TBW induced by slower and faster alpha frequencies. Using an SJ task combined with EEG, recent studies have shown that a faster individual alpha frequency (IAF) accounted for narrower AV (Bastiaansen et al., 2020) and tactile-visual (Migliorati et al., 2020) TBWs. In our study, the SJ task was preceded, trial by trial, by AV sensory entrainment at one of two frequencies in the alpha band: (i) ~8.5 Hz (lower alpha condition); (ii) ~12 Hz (upper alpha condition). Furthermore, to control for a potential increase in temporal expectation induced by prestimulus rhythmic sequences, we implemented a nonrhythmic condition in which only the first and the last entrainers were presented. Since AV temporal perception is driven by the occipital alpha rhythm (e.g., Bastiaansen et al., 2020; Cooke et al., 2019), we expected a lower simultaneity rate and a narrower AV TBW in the trials of the SJ task following the upper alpha (~12 Hz) condition of sensory stimulation. On the contrary, we expected an enlargement of the TBW in trials following AV rhythmic stimulation delivered at the lower alpha frequency (~8.5 Hz). Furthermore, we expected that sensory entrainment might modulate simultaneity judgments differently as a function of the leading sense (e.g., Powers et al., 2009; Zerr et al., 2019).

Finally, a distinctive aspect of our study is the implementation of this paradigm in an online, web-based modality. In our recent study (Marsicano et al., 2022), we demonstrated that an accurate SJ task can be implemented through a web-based platform. Adapting lab-based experiments to an online context makes it possible to collect data from a heterogeneous sample while optimizing the timing of testing (Bridges et al., 2020; Sauter et al., 2020). Recently, online sensory alpha-band entrainment has been successfully implemented (De Graaf & Duecker, 2022; Kawashima et al., 2022), opening new and broader applications for rhythmic sensory stimulation protocols. Building on this evidence, in the current web-based study we aimed to investigate whether AV alpha-band entrainment can shape AV temporal acuity.

Methods

Participants

A total of 61 volunteer participants were recruited among university students, through advertisement and word of mouth. Participants did not receive compensation or course credits. All had normal or corrected-to-normal vision and hearing. Exclusion criteria were self-reported neurological and attention disorders, and epilepsy/photosensitivity. One participant was excluded from the analyses because he reported to the experimenters that he had not understood the instructions for the SJ task. Furthermore, during data collection, the refresh rate of the monitor/display was recorded for each participant. The use of a 60 Hz monitor during the task, which ensures correct timing of auditory and visual stimulation, was confirmed for 59 participants, based on the log file returned at the end of the experiment. Thus, two participants were excluded because they performed the task on a monitor whose refresh rate differed from the 60 Hz for which stimulus presentation was optimized. The final sample included in the analyses comprised 58 participants (32 females, mean age = 24.6 years, SD = 3.26). All participants were provided with a document detailing the procedure for correctly completing the web-based experimental paradigm. We underlined the importance of following these instructions for an optimal execution of the online task. In particular, we emphasized the critical importance of sitting in a dimly lit and quiet room, using headphones/earbuds at a comfortable volume, and keeping a viewing distance of ~50 cm. We also highlighted that the task could be performed only on a PC and not on mobile devices. The research project was approved by the Ethical Committee of the University of Bologna (Prot. n. 0159726), and all participants gave their informed consent.

Apparatus and stimuli

The task was created with PsychoPy3 (Peirce, 2007) and translated to PsychoJS in order to administer it using Pavlovia (https://pavlovia.org/), a web-based platform for the presentation of psychophysics experiments via common web browsers. This allowed us to collect the data for the experimental task remotely. Audio and visual stimuli were created using Psychtoolbox on MATLAB 2019a (The MathWorks, Inc) and subsequently edited with Wondershare Filmora 9 (Wondershare) in order to create videos of AV stimuli at different SOAs. The AV stimuli used for the entrainment were created in a similar way. All stimuli of the experimental paradigm were optimized for a 60 Hz monitor refresh rate. We collected directly from PsychoPy/Pavlovia information about the type of OS used by the participants (Windows = 37 participants, MacOS = 23 participants). We asked participants to run the experiment using Mozilla Firefox, Google Chrome, or Microsoft Edge as web browsers. Extensive piloting showed that these browsers were the most reliable in terms of stimulus presentation across the different operating systems.

Assuming that participants kept the recommended distance of ~50 cm from the screen, the visual stimulus used for the entrainment was a white square subtending 6° of visual angle, presented at the center of the screen. The auditory stimuli used for the entrainment were sinusoidal 500 Hz sounds presented binaurally through headphones/earbuds at a comfortable volume. The AV target stimuli of the SJ task were a white circle subtending 6° of visual angle, presented at the center of the screen, and a sinusoidal 750 Hz sound presented binaurally through headphones/earbuds at a comfortable volume. All visual stimuli were presented on a black background.

Experimental design

We employed an AV SJ task, preceded trial by trial by rhythmic sensory stimulation (entrainment) at one of two frequencies in the lower (~8.5 Hz) or upper (~12 Hz) alpha range, or by a nonrhythmic control stimulation condition, to investigate the possible modulations of AV temporal acuity (i.e., TBW) following sensory entrainment (see Fig. 1). We decided to use AV rhythmic stimuli (instead of unisensory stimuli) to maximize the effect of neural entrainment (Ronconi et al., 2018; Ronconi & Melcher, 2017). In our experimental paradigm, all trials started with the onset of one of the two types of entrainment (lower or upper alpha) or with the onset of the control prestimulus condition. These conditions lasted for ~2,000 ms, and the duration of each AV stimulus was set to three refresh cycles (49.98 ms). In the ~8.5 Hz condition (lower alpha), the AV stimuli were presented repeatedly for three refresh cycles, separated by four cycles of a blank screen, resulting in an SOA of 116.62 ms. In the ~12 Hz condition (upper alpha), the AV stimuli were again presented repeatedly for three refresh cycles, separated by two cycles of a blank screen, resulting in an SOA of 83.3 ms. In the control condition, the AV stimulus was presented in the first three refresh cycles and in the last three cycles of the 2,000-ms time window preceding the appearance of the AV target of the SJ task. In detail, this nonrhythmic prestimulus condition involved the presentation of only the first and the last entrainer stimuli, to control for a potential increase in temporal expectation induced by prestimulus rhythmic sequences, and at the same time to evaluate the natural individual temporal sampling of each participant. The presentation of these experimental prestimulus conditions was randomized and counterbalanced.
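The relation between refresh cycles, SOA, and entrainment frequency described above can be checked with a short calculation. The following Python sketch is purely illustrative and is not the code used in the study; the function and variable names are hypothetical, and the refresh cycle is taken as 16.66 ms, the rounded value reported in the text.

```python
# Illustrative sketch only: names are hypothetical, not taken from the study's code.
FRAME_MS = 16.66  # one refresh cycle at 60 Hz, rounded as in the text


def entrainment_timing(on_frames, blank_frames, frame_ms=FRAME_MS):
    """Return (stimulus duration, SOA, frequency) for one entrainer cycle."""
    duration_ms = on_frames * frame_ms                 # time the AV stimulus is on
    soa_ms = (on_frames + blank_frames) * frame_ms     # onset-to-onset interval
    frequency_hz = 1000.0 / soa_ms                     # entrainers per second
    return duration_ms, soa_ms, frequency_hz


# Three frames on, followed by four (lower alpha) or two (upper alpha) blank frames.
lower_alpha = entrainment_timing(on_frames=3, blank_frames=4)
upper_alpha = entrainment_timing(on_frames=3, blank_frames=2)

for label, (dur, soa, freq) in (("lower alpha", lower_alpha), ("upper alpha", upper_alpha)):
    print(f"{label}: duration = {dur:.2f} ms, SOA = {soa:.2f} ms, frequency = {freq:.2f} Hz")
```

With these frame counts, the lower alpha sequence has an SOA of 116.62 ms (about 8.57 Hz) and the upper alpha sequence an SOA of 83.3 ms (about 12 Hz), which matches the nominal ~8.5 Hz and ~12 Hz conditions.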

Fig. 1

Schematic representation of the experimental paradigm. A Each trial began with the AV entrainment (lower alpha or upper alpha stimulation) or the control condition (no rhythmic stimulation). After a variable interval (ISI) between 83.3 ms and 233.24 ms (in steps of 16.66 ms), the AV target of the SJ task appeared at the center of the screen. A fixed array of SOAs between the first and second stimulus was used across trials (±400, ±300, ±200, ±100, 0 ms). Trials were counterbalanced between the following two conditions: B Auditory Leading (AL) trials, in which the auditory stimulus was followed by the visual stimulus. C Visual Leading (VL) trials, in which the visual stimulus was followed by the auditory stimulus.

The ISIs between the end of the entrainment and the presentation of the AV target were randomly varied between 83.3 ms and 233.24 ms in steps of one refresh cycle (16.66 ms). To investigate the extent of the AV temporal acuity, the AV target could be presented at the following SOAs: ±400, ±300, ±200, ±100, 0 ms. Negative SOAs corresponded to auditory leading (AL) trials and positive SOAs to visual leading (VL) trials, in line with the previous literature (e.g., Conrey & Pisoni, 2006; Hillock-Dunn & Wallace, 2012).

All AV target stimuli in the SJ task lasted for three frames (49.98 ms). After each trial, participants were required to report the simultaneity of the auditory and visual stimuli on a 3-point scale, providing their responses on a keyboard, with rating 1 indicating the presence of simultaneity (synchronous), rating 2 indicating the absence of simultaneity (asynchronous), and rating 3 indicating a ‘not sure/not seen’ response. The latter was used to ensure that participants reported simultaneity judgments only when confident, to limit the contamination of the data analysis by random responses. Indeed, these responses were later discarded from the analyses as non-informative. All responses were given with no time constraints, and the total time taken by the participants to complete the experiment was ~25 min. In detail, the task itself lasted ~15–20 min, preceded by ~5 min dedicated to reading the instructions and performing practice trials.

Given the limited control and oversight that experimenters can exercise in online testing, participants were urged to focus their attentional resources on the stimulation period, and to selectively judge the temporal synchrony of the AV stimuli presented in the SJ task. Furthermore, the participants were instructed to blink and rest their eyes only during the response to the SJ task, if necessary. To further prevent drops in sustained attention during task execution, participants were informed that they could take short breaks during the presentation of the response screen (since responses were given with no time constraints). The total number of trials administered to each participant was 276, consisting of six practice trials and 270 experimental trials. The 270 experimental trials were presented in a single block, which included 10 repetitions for each combination of SOA and entrainment condition, presented in a random order that differed across participants.
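The structure of the experimental block (9 SOAs × 3 prestimulus conditions × 10 repetitions = 270 trials) can be sketched as follows. This Python example is only an illustration under stated assumptions, not the PsychoPy/PsychoJS implementation used in the study: all names are hypothetical, and drawing the ISI uniformly from 5 to 14 refresh cycles is an assumption consistent with the reported range of 83.3–233.24 ms.

```python
import itertools
import random
from collections import Counter

# Hypothetical names; the values come from the design described in the text.
SOAS_MS = [-400, -300, -200, -100, 0, 100, 200, 300, 400]  # negative = auditory leading
CONDITIONS = ["lower_alpha", "upper_alpha", "control"]
REPETITIONS = 10
FRAME_MS = 16.66            # one refresh cycle at 60 Hz, rounded as in the text
ISI_FRAMES = range(5, 15)   # 5 to 14 frames, i.e., 83.30 to 233.24 ms


def build_trials(seed=None):
    """Build one shuffled block of experimental trials."""
    rng = random.Random(seed)
    trials = [
        {
            "condition": condition,
            "soa_ms": soa,
            # Assumption: the ISI is drawn uniformly at random on every trial.
            "isi_ms": round(rng.choice(ISI_FRAMES) * FRAME_MS, 2),
        }
        for condition, soa in itertools.product(CONDITIONS, SOAS_MS)
        for _ in range(REPETITIONS)
    ]
    rng.shuffle(trials)  # random trial order within the single block
    return trials


trials = build_trials(seed=1)
cell_counts = Counter((t["condition"], t["soa_ms"]) for t in trials)
print(len(trials), "trials;", len(cell_counts), "condition-by-SOA cells")
```

Using a different seed per participant would give each participant a different random order while keeping the number of repetitions per cell fixed at 10.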

Data analysis

Entrainment modulations: Simultaneity rate and Gaussian fitting analysis

This analysis aimed to investigate whether AV temporal acuity is modulated by the different conditions of sensory entrainment. To this end, we investigated the modulations induced by the entrainment on the overall simultaneity rate and on the TBW indexed by the Gaussian fitting (see procedure below) of the responses to the SJ task. After obtaining the psychometric values of each participant, we excluded from the statistical analysis the participants whose data were poorly fitted by the Gaussian function (adjusted R2 < 0.3), following the previous literature (for example, see Bedard & Barnett-Cowan, 2016; Hillock-Dunn et al., 2016). Following this criterion, we excluded four participants, obtaining a final sample of 54 participants on which all the analyses were performed. In the final sample, the mean adjusted R2 across participants was the following for each experimental condition: lower alpha: adjusted R2 = 0.844; upper alpha: adjusted R2 = 0.843; control: adjusted R2 = 0.845. We first performed a repeated-measures analysis of variance (ANOVA) on the rate of synchronous responses to test whether performance was influenced by the stimulation condition (three levels: lower alpha, upper alpha, control condition) and SOA (nine levels: −400 ms, −300 ms, −200 ms, −100 ms, 0 ms, +100 ms, +200 ms, +300 ms, +400 ms), which were used as within-subjects factors. The Greenhouse–Geisser correction was applied when the sphericity assumption was violated.

In a second step, the observed distribution of responses to the SJ task for each stimulation condition was fitted to a Gaussian function using the Curve Fitting Toolbox in MATLAB 2019a (The MathWorks, Inc). In detail, the Gaussian fitting was performed with the nonlinear least squares method using the following formula: y = a × exp(−((x − b)/c)^2). In this formula, x represents the SOA, y the proportion of synchronous responses, a is the height of the curve’s peak (the upper bound was set at 1 and the lower bound at 0), b is the position of the center of the peak, and c is a width parameter proportional to the standard deviation of the curve (c = √2 × SD in this parameterization). In this analysis, the parameter b indicates the point of subjective simultaneity (PSS), and the parameter c indexes the TBW. We chose the Gaussian fitting according to these criteria because it is a gold-standard method in this type of analysis (McGovern et al., 2022; Noel et al., 2017; Simon et al., 2017; Stecker, 2018; Van der Burg et al., 2013; Venskus et al., 2021; Wallace & Stevenson, 2014). We performed two separate repeated-measures ANOVAs, one on TBW and one on PSS, to evaluate differences between the stimulation conditions (within-subjects factor, three levels: lower alpha, upper alpha, control). The Greenhouse–Geisser correction was applied in cases where the sphericity assumption was violated. All post hoc comparisons were two-tailed paired-sample t tests on the measures of interest (i.e., simultaneity rate and TBW), performed separately for each SOA in the case of the simultaneity rate, and corrected for multiple comparisons using the Bonferroni–Holm method.
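As an illustration of the Gaussian fitting step, the following Python sketch fits the same functional form with SciPy's `curve_fit` instead of the MATLAB Curve Fitting Toolbox used in the study. It is a minimal sketch under stated assumptions: the starting values and the bounds on the center and width parameters are hypothetical (only the 0–1 bounds on the peak height come from the text), and the data are synthetic and noise-free, not data from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

SOAS_MS = np.array([-400, -300, -200, -100, 0, 100, 200, 300, 400], dtype=float)


def gaussian(x, a, b, c):
    """Functional form from the text: a * exp(-((x - b) / c) ** 2)."""
    return a * np.exp(-(((x - b) / c) ** 2))


def fit_sj_gaussian(soas_ms, sync_rate):
    """Fit the Gaussian by nonlinear least squares; return parameters and adjusted R2."""
    # Peak height bounded in [0, 1] as in the text; the starting values and the
    # bounds on b and c are assumptions made for this sketch.
    p0 = [0.9, 0.0, 200.0]
    bounds = ([0.0, -400.0, 1.0], [1.0, 400.0, 2000.0])
    params, _ = curve_fit(gaussian, soas_ms, sync_rate, p0=p0, bounds=bounds)

    predicted = gaussian(soas_ms, *params)
    ss_res = np.sum((sync_rate - predicted) ** 2)
    ss_tot = np.sum((sync_rate - np.mean(sync_rate)) ** 2)
    n, k = len(soas_ms), len(params)
    # Adjusted R2 with residual degrees of freedom n - k.
    adj_r2 = 1.0 - (ss_res / (n - k)) / (ss_tot / (n - 1))
    return params, adj_r2


# Synthetic, noise-free proportions of "synchronous" responses (illustration only).
true_a, true_b, true_c = 0.95, 30.0, 250.0
synthetic_rate = gaussian(SOAS_MS, true_a, true_b, true_c)

(a_hat, pss_hat, width_hat), adj_r2 = fit_sj_gaussian(SOAS_MS, synthetic_rate)
print(f"a = {a_hat:.3f}, PSS (b) = {pss_hat:.1f} ms, width (c) = {width_hat:.1f} ms, "
      f"adjusted R2 = {adj_r2:.3f}")
```

In practice, the fit would be run once per participant and stimulation condition, and participants with an adjusted R2 below the chosen criterion (0.3 in the text) would be excluded before the ANOVAs.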

Entrainment leading sense modulations: Logistic fitting analysis

This analysis aims to investigate possible asymmetries in the modulations of the AV temporal acuity induced by the entrainment sequences, which would show separate effects for AL and VL trials. We excluded from the statistical analysis the participants whose results fitted poorly in the logistic fitting (adjusted R2 < 0.3; see procedure below), performing the statistical analysis of the data in a sample of 45 participants. The mean of all subjects adjusted R2 was the following for each experimental condition and for each leading sense: AL lower alpha adjusted R2 = 0.914; AL upper alpha adjusted R2 = 0.925; AL control adjusted R2 = 0.93; VL lower alpha adjusted R2 = 0.869; VL upper alpha adjusted R2 = 0.885; VL control adjusted R2 = 0.888. We performed the fitting of the psychometric logistic curve for each subject, separately for the AL and the VL condition, and for each stimulation condition (lower alpha, upper alpha and control). Thus, we obtained the individual 50% threshold values from the fitting of the psychometric logistic curve. We decided to use the individual 50% threshold values as it more sensitively represents the simultaneity rate distribution obtained in our experiment. In detail, for each participant we used a logistic equation and a non-linear least squares method to fit the proportion of simultaneity rate reported to the SJ task as a function of SOA. The formula used was the following: y = 1/(1 + exp (b × (t - x))). In this equation, x represents the SOA between audio and visual stimuli and y represents the proportion of simultaneity responses to the SJ task. The lower y bound was set at 0 and the higher y bound was set at 1 (y = 0 indicates that AV stimuli were never perceived as synchronous, and y = 1 indicated that they were always perceived as synchronous). The only free parameters of the function were b (the function slope) and t (the 50% threshold), which were restricted to assume positive values above zero. 
Both AL and VL curves were fitted also using the data point corresponding to SOA = 0 ms. In the case of AL trials, we determined the absolute value of the threshold (that is normally negative, as is extracted from the left side of the psychometric curve), while the value of the VL threshold is typically a positive value, extracted from the right side of the psychometric curve. The best fitting parameters were found for each participant separately. We performed two separate repeated measures ANOVAs on 50% threshold, with the aim of testing whether performance was influenced by the stimulation condition (three levels: lower alpha, upper alpha, control condition) and leading sense factor (two levels: AL and VL), which were used as within-subjects factors. The Greenhouse–Geisser correction was applied in cases where the sphericity assumption was violated, and all post hoc comparisons were performed with the Bonferroni–Holm correction test.
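The logistic fitting can be sketched in the same way. The Python example below is a hedged illustration, not the authors' code. It assumes that each side of the psychometric curve is fitted against the absolute SOA, including the 0 ms point, and it writes the exponent as b × (x − t), so that with positive b and t the curve falls from 1 toward 0 as the asynchrony grows and equals 0.5 at x = t. This sign convention, the starting values, the parameter bounds, and the synthetic data are all assumptions of the sketch.

```python
import numpy as np
from scipy.optimize import curve_fit

# Absolute SOAs for one side of the curve, including the 0 ms point.
ABS_SOAS_MS = np.array([0, 100, 200, 300, 400], dtype=float)


def logistic_decreasing(x, b, t):
    """Logistic falling from 1 to 0 as x grows; t is the 50% threshold, b the slope."""
    return 1.0 / (1.0 + np.exp(b * (x - t)))


def fit_threshold(abs_soas_ms, sync_rate):
    """Fit slope and 50% threshold by nonlinear least squares (both kept positive)."""
    p0 = [0.02, 200.0]                      # assumed starting values
    bounds = ([1e-6, 1e-6], [1.0, 1000.0])  # assumed bounds; both parameters > 0
    (b_hat, t_hat), _ = curve_fit(
        logistic_decreasing, abs_soas_ms, sync_rate, p0=p0, bounds=bounds
    )
    return b_hat, t_hat


# Synthetic, noise-free simultaneity rates (illustration only, not study data).
al_rate = logistic_decreasing(ABS_SOAS_MS, 0.03, 180.0)  # auditory leading side
vl_rate = logistic_decreasing(ABS_SOAS_MS, 0.02, 260.0)  # visual leading side

al_slope, al_threshold = fit_threshold(ABS_SOAS_MS, al_rate)
vl_slope, vl_threshold = fit_threshold(ABS_SOAS_MS, vl_rate)
print(f"AL 50% threshold = {al_threshold:.1f} ms, VL 50% threshold = {vl_threshold:.1f} ms")
```

Fitting each leading-sense side separately yields one 50% threshold per participant, stimulation condition, and leading sense, which is the dependent measure entered into the stimulation condition × leading sense ANOVA.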

Results

Simultaneity rate

As expected, the time interval between the auditory and visual stimuli influenced performance, and the different stimulation conditions modulated SJ task performance (Fig. 2A). The ANOVA on simultaneity rate, testing whether performance was influenced by stimulation condition and SOA, revealed a significant main effect of stimulation condition, F(1.87, 99.30) = 7.681, p < .001, ηp2 = 0.127, and of SOA, F(2.49, 132.20) = 130.713, p < .001, ηp2 = 0.712. Furthermore, the analysis revealed a significant interaction between stimulation condition and SOA, F(10.76, 570.51) = 2.152, p = .016, ηp2 = 0.039. The main effect of stimulation condition suggests that entrainment modulates the average simultaneity rate reported by participants in the SJ task (Fig. 2B). In line with our hypotheses, in the upper alpha entrainment condition the participants showed a lower simultaneity rate (M = 0.505, SD = 0.113) compared with the lower alpha entrainment (M = 0.551, SD = 0.133), t(53) = −3.481, p = .006, and with the control condition (M = 0.549, SD = 0.122), t(53) = −3.301, p = .008. On the contrary, no significant difference emerged from the comparison between the lower alpha entrainment and the control condition, t(53) = 0.180, p = .865. Regarding the interaction between stimulation condition and SOA, post hoc paired-sample t tests (Bonferroni–Holm corrected) revealed, for AL SOAs, a lower simultaneity rate in the upper alpha entrainment compared with the control (−300, −200, and −100 ms SOAs) and lower alpha conditions (−300 and −200 ms SOAs; Table 1). Regarding VL SOAs, despite the absence of statistically significant effects, post hoc paired-sample t tests revealed a trend towards significance in the upper alpha stimulation condition, where participants showed a lower simultaneity rate compared with the control and lower alpha conditions only at the +200 ms SOA (Table 1).
No other significant effects emerged when comparing the simultaneity rate among the different entrainment conditions at the different SOAs (all p values > .078).

Fig. 2

A Gaussian curves obtained across participants in the different entrainment conditions. Each individual TBW was derived at a 50% criterion (gray dotted line). Circles, squares, and diamonds show simultaneity rates as a function of SOA in the three experimental conditions, in which different types of pretarget entrainment were employed (circle = Lower Alpha; square = Upper Alpha; diamond = Control condition). The asterisks indicate significant differences in simultaneity rate at specific SOAs between the upper alpha entrainment and the lower alpha and control conditions (see the Results section and Table 1 for a detailed description). The error bars indicate the standard error of the mean (SEM). B Bar plot of the effect of stimulation condition, showing that entrainment modulates the average simultaneity rate reported by participants. The error bars indicate the standard error of the mean (SEM); squares show the individual values. C Bar plot of TBW measures with individual values, showing the modulations induced by the different entrainment conditions, with significantly higher AV temporal acuity in the upper alpha condition compared with the lower alpha and control conditions. The error bars indicate the standard error of the mean (SEM); squares show the individual values. *p < .05. **p < .01. (Color figure online)

Table 1 Descriptive statistics (mean and standard deviation) of the simultaneity rate averaged across participants reported at each SOA (±400, ±300, ±200, ±100) and stimulation condition (~12 Hz, ~8.5 Hz, and control), separately for auditory leading (AL) and visual leading (VL) trials, and the paired-sample t tests (Bonferroni–Holm corrected) showing the differences in simultaneity as a function of the stimulation condition at each SOA

Gaussian and logistic fitting

The ANOVA performed on TBW confirmed a significant main effect of stimulation condition, F(1.75, 93.03) = 3.946, p = .027, ηp2 = 0.069 (Fig. 2C). Post hoc comparisons revealed a narrower TBW (i.e., higher multisensory temporal precision) in the upper alpha entrainment (M = 281.75, SD = 97.18) compared with the lower alpha entrainment (M = 309.91, SD = 95.21), t(53) = −2.194, p = .016, and with the control condition (M = 0.549, SD = 82.77), t(53) = −2.718, p = .004. On the contrary, the ANOVA performed on PSS values did not reveal a significant effect of the stimulation condition, F(1.89, 100.16) = 1.610, p = .206, ηp2 = 0.029. Crucially, the ANOVA performed on the 50% threshold derived from the logistic fitting did not reveal a significant interaction between stimulation condition and leading sense, F(1.99, 87.60) = 2.329, p = .104, ηp2 = 0.050, suggesting that the TBW is modulated in a similar way in both AL and VL trials.

Discussion

The main purpose of the current study was to probe whether AV alpha-band sensory entrainment could be applied for a short period in the prestimulus interval to shape AV temporal acuity. In line with our hypotheses, our results represent the first empirical report on the possibility of adequately administering AV rhythmic stimulation, effectively inducing modulations on the width of the TBW. We found that the upper alpha sensory stimulation (∼12 Hz) improves AV temporal acuity, shrinking the width of the AV TBW, and reducing the perceived simultaneity, with respect to the rhythmic stimulation delivered at the lower boundary of the alpha oscillations (~8.5 Hz) and to the control condition. These results support previous evidence linking alpha oscillations to AV integration and segregation processes, strengthening the idea that alpha rhythm reflects the temporal unit of AV temporal perception (Bastiaansen et al., 2020; Cecere et al., 2015; Cooke et al., 2019; Ronconi et al., 2023; Venskus & Hughes, 2021). We speculate that the entrainment of neural oscillations could have contributed to increase the temporal sampling capacity of AV perception, by synchronizing endogenous oscillatory activity to the sensory stimulation frequency. However, our web-based behavioral study does not allow any definitive conclusion regarding this assumption, as it does not represent clear mechanistic evidence. Hence, future EEG/MEG studies are needed to examine the possible influence of neural entrainment on the current findings. Contrary to our hypothesis, our results did not reveal a broadening of the TBW and an increase in the simultaneity rate following sensory stimulation at lower alpha, when compared with the control condition. 
Using tACS and NIBS protocols, previous evidence has shown that rhythmic stimulation at slower alpha frequencies induces an increase in AV sensory integration (i.e., larger TBW) compared with segregation processes (Cecere et al., 2015; Venskus et al., 2021), as well as a slowdown of endogenous alpha oscillations (Coldea et al., 2022; Di Gregorio et al., 2022). For instance, Venskus et al. (2021) found that applying occipital tACS at either 14 Hz or 8 Hz during a sound-induced double-flash illusion (DFI) task induces a narrowing or widening of the AV TBW, respectively. Similarly, Cecere et al. (2015) applied occipital tACS at IAF or at off-peak alpha frequencies (IAF ± 2 Hz), while participants performed a DFI task, showing a widening or narrowing of the TBW for IAF-2 Hz and IAF+2 Hz occipital tACS, respectively. A potential explanation for our null result following the ~8.5 Hz stimulation condition could arise from the possibility that the actual stimulation frequency was out of the participants’ IAF range and thus incapable of synchronizing neuronal oscillations at the lower boundary of the alpha-band. Indeed, the entrainment mechanism of brain oscillations grounds on the principle of the Arnold Tongue phenomena (Huang et al., 2021; Notbohm et al., 2016; Pikovsky et al., 2003; Regan, 1982), which describes that synchronization between two coupled oscillators is more likely to occur when the external driving rhythmic force (in this case: entrainment) is centered to the intrinsic endogenous frequency (in this case: IAF). The participants tested in the current study were predominantly young adults, whose IAF is typically faster as compared with other periods of the life cycle (Chiang et al., 2011; Scally et al., 2018; Surwillo, 1961). Hence, it is conceivable that the upper alpha stimulation employed in this study was more likely to be included in their IAF range, synchronizing neuronal oscillations to the delivered rhythmic stimulation. 
In contrast, lower alpha stimulation was more likely to have fallen outside their IAF range, and thus to have been ineffective in inducing a slowdown of alpha oscillations. Given that IAF could not be measured, this explanation remains purely speculative, and it is conceivable that other factors also play a role in explaining our findings. Indeed, multisensory integration abilities vary according to task and stimulus features (Stevenson & Wallace, 2013), and the above-mentioned tACS studies employed an illusory AV integration task (i.e., the DFI) rather than the SJ task implemented in the present study, which might also account for the observed differences. Furthermore, it is conceivable that sensory entrainment and tACS produce different physiological and behavioral effects, which might explain the null result in the current study following ~8.5 Hz stimulation. Sensory entrainment protocols modulate endogenous oscillatory activity by targeting cortical and subcortical sensory pathways, whereas tACS modulates the neuronal activity underlying the site of stimulation, presumably with propagation of stimulation to connected areas (Albouy et al., 2022; Hanslmayr et al., 2019; Thut et al., 2011). Moreover, previous tACS studies (Cecere et al., 2015; Venskus et al., 2021) administered alpha-band stimulation continuously as subjects performed the audiovisual integration task, investigating the "online effects" on audiovisual temporal acuity. The sensory entrainment employed in the current study, in contrast, was delivered trial by trial prior to the onset of the target stimulus, allowing us to investigate modulations of the TBW at the offset of the rhythmic stimulation.
Importantly, our results show that rhythmic sensory stimulation influences the TBW for a short period after stimulation offset. This finding could be of considerable importance for interventions targeting AV temporal perception abilities in clinical and sub-clinical populations in which these perceptual processes are anomalous (e.g., ASD, SCZ; Zhou et al., 2018).
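
To make the temporal-sampling account discussed above concrete, the duration of a single oscillatory cycle at each stimulation frequency can be worked out directly. The following is only an illustrative sketch: the function name is hypothetical, and the frequencies are the approximate nominal values reported in the text (~8.5 Hz and ~12 Hz), not measured endogenous rhythms.

```python
def cycle_duration_ms(frequency_hz: float) -> float:
    """Duration of one oscillatory cycle, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_hz


# Nominal stimulation frequencies from the text (approximate values).
for label, freq in (("lower alpha", 8.5), ("upper alpha", 12.0)):
    print(f"{label}: {freq} Hz -> {cycle_duration_ms(freq):.1f} ms per cycle")
```

Under the hypothesis that two inputs are integrated when they fall within the same alpha cycle, the shorter cycle at ~12 Hz (about 83 ms versus about 118 ms at ~8.5 Hz) would correspond to a narrower window for binding, consistent with the direction of the effect reported here.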

Our results are in line with previous literature (Cecere et al., 2016; Powers et al., 2009; Stevenson et al., 2013) showing that temporal acuity is higher for AL stimulus pairs than for VL pairs. Indeed, regardless of the stimulation condition, subjects showed a higher simultaneity rate in the VL than in the AL trials. This asymmetry is presumably grounded in the different neurocognitive mechanisms that drive the cross-modal phase reset of brain oscillations (Cecere et al., 2016, 2017; Lakatos et al., 2009; Thorne & Debener, 2014). This neurocomputational difference between the AL and VL conditions has been proposed to have different consequences for the potential malleability of the AV TBW (Cecere et al., 2016; Powers et al., 2009). For example, several studies have shown that the TBW is less modifiable in the AL condition and more flexible in the VL condition, as demonstrated by perceptual learning training aimed at improving AV temporal acuity (Cecere et al., 2016; Powers et al., 2009; Stevenson et al., 2013). More recently, however, a training-dependent symmetrical modulation of the width of the AV TBW, improving AV temporal acuity for both leading senses, has also been demonstrated (McGovern et al., 2022).

Our results seem more in line with these latter findings (McGovern et al., 2022), as we showed significantly improved AV temporal acuity in AL trials (SOAs: −300 ms, −200 ms, −100 ms) and a trend toward significance in VL trials (SOA: +200 ms) when upper alpha stimulation preceded the AV stimuli. Furthermore, the logistic analysis did not reveal asymmetries in the modulations of AL and VL thresholds, suggesting that AV temporal acuity was modulated in a similar way in both AL and VL conditions. Thus, our results suggest that the AV rhythmic sensory stimulation employed was effective in maximizing the effect of cross-modal phase reset between auditory and visual cortex, potentially through the entrainment of neural oscillations, with a subsequent modulation of both visual-to-auditory and auditory-to-visual temporal acuity. Intriguingly, we found that the effects induced by sensory stimulation modulate a wider range of SOAs in AL trials than in VL trials. A plausible explanation for these differences between the AL and VL conditions may lie in the different temporal resolution of the visual and auditory systems (van Wassenhove, 2013). Since the auditory system has a higher temporal resolution than the visual system, when auditory stimuli drive AV interactions the higher temporal sampling capacity could facilitate the modulatory effects induced by the entrainment, extending the effects on temporal binding of AV information to different time latencies.

Finally, a noteworthy aspect of our study concerns the implementation of the entire experimental procedure through a web-based platform. First, the results of the SJ task confirm that a psychophysical task can be administered online (Marsicano et al., 2022), yielding results in a less controlled setting that closely mimic findings from classic lab-based studies (e.g., Fenner et al., 2020; Stevenson et al., 2012; Zampini et al., 2005). Furthermore, similarly to recent findings (De Graaf & Duecker, 2022; Kawashima et al., 2022), we demonstrated that alpha-band sensory stimulation can be implemented online, obtaining modulations of AV temporal acuity in the same direction as lab-based studies that used tACS as a neuromodulation technique (Cecere et al., 2015; Venskus et al., 2021).

Although online experiments allow data to be collected from a large and heterogeneous sample of individuals with a lower workload (Buhrmester et al., 2016; Mason & Suri, 2012; Reips, 2002; Sauter et al., 2020), there are methodological limitations that could have affected our findings. Indeed, a major concern regarding online testing is the potentially low level of sustained attention during task execution, which cannot be controlled as closely as in a lab-based context. In the current study we implemented different procedures, in agreement with recent guidelines (Newman et al., 2021; Sauter et al., 2020), to prevent such potential issues. In particular, we rigorously piloted our experiment in the laboratory, kept the experiment short, and employed simple instructions. A further major challenge of online experiments is controlling the timing of stimulus presentation with high precision, a relevant point in our study considering the fast rhythmic stimulation administered. Recently, Bridges et al. (2020) found that studies administered online using PsychoPy (the software used in the current study) showed high precision in the timing of stimulus presentation, supporting the consistency of our findings. Accordingly, recent studies have successfully implemented sensory stimulation in the alpha band through web-based experiments (De Graaf & Duecker, 2022; Kawashima et al., 2022). Moreover, Marsicano et al. (2022) compared simultaneity judgements of audiovisual events obtained through online testing with similar lab-based results, showing a high degree of overlap. However, despite the measures adopted, we cannot entirely exclude that suboptimal stimulus presentation timing could have influenced our results.
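
One concrete source of the timing imprecision mentioned above is that visual stimuli can only change on display refresh boundaries, so a flicker frequency is realizable exactly only if the refresh rate is an integer multiple of it. The sketch below illustrates this constraint. It assumes a 60 Hz display, which is a common default but is not reported in this section, and the function name is hypothetical; it does not describe how the stimulation in the present study was actually scheduled.

```python
def nearest_flicker(target_hz: float, refresh_hz: float = 60.0):
    """Return (frames_per_cycle, achieved_hz) for the realizable
    flicker rate closest to target_hz on a display refreshing at
    refresh_hz, assuming one stimulus onset every whole number of frames."""
    if target_hz <= 0 or refresh_hz <= 0:
        raise ValueError("frequencies must be positive")
    frames = max(1, round(refresh_hz / target_hz))
    return frames, refresh_hz / frames


# Assumed 60 Hz display (not stated in the source text).
for target in (8.5, 12.0):
    frames, achieved = nearest_flicker(target)
    print(f"target {target} Hz -> {frames} frames/cycle, {achieved:.2f} Hz")
```

On such a display, 12 Hz maps onto exactly 5 frames per cycle, whereas 8.5 Hz can only be approximated (7 frames per cycle, about 8.57 Hz). Participants' screens with other refresh rates would yield different achievable frequencies, which is one way in which uncontrolled hardware could add variability in web-based settings.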

Another potential limitation of the current study is that the analysis performed with logistic fitting of individual data might have introduced a selection bias toward subjects who did not show the expected pattern of simultaneity rates. Among other factors (e.g., the small number of trials per experimental condition), poor psychometric fitting could be due to the lack of sensitivity of this analysis in capturing different aspects that drive AV perception. As suggested by recent studies (Buergers & Noppeney, 2022; Di Gregorio et al., 2022), measures derived from signal detection theory (SDT; i.e., d′ and criterion) may be more sensitive to some aspects of visual and AV temporal resolution (i.e., subjective confidence, prior expectations, top-down biases). Hence, future studies adopting SDT-derived measures will be required to increase analytic sensitivity and to differentiate how sensory entrainment and other neuromodulatory approaches influence different aspects of AV temporal perception.
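
For readers unfamiliar with the SDT measures mentioned above, the following minimal sketch shows how d′ and criterion are conventionally computed from response counts under the standard equal-variance Gaussian model. The mapping onto an SJ task is an assumption made here for illustration only: asynchronous trials are treated as "signal", so an "asynchronous" response to an asynchronous pair counts as a hit and to a synchronous pair as a false alarm. The function name and the log-linear correction are likewise illustrative choices, not the procedure of the cited studies.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    A log-linear correction (add 0.5 to each count) keeps the hit and
    false-alarm rates away from 0 and 1, where the z-transform diverges.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Hypothetical counts, for illustration only.
d_prime, criterion = sdt_measures(hits=40, misses=10,
                                  false_alarms=10, correct_rejections=40)
print(f"d' = {d_prime:.2f}, criterion = {criterion:.2f}")
```

Separating sensitivity (d′) from response bias (criterion) in this way is what would allow a change in temporal acuity to be distinguished from a shift in the tendency to report simultaneity.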

Overall, our findings might open a new scenario for the online and lab-based use of rhythmic sensory stimulation protocols aimed at improving AV temporal acuity.