

Auditory rhyme processing in expert freestyle rap lyricists and novices: An ERP study

2019, Neuropsychologia

Accepted Manuscript

Auditory Rhyme Processing in Expert Freestyle Rap Lyricists and Novices: An ERP Study

Keith Cross 1, Takako Fujioka 2,3

1 College of Education, University of Hawai`i at Mānoa, Honolulu, HI
2 Center for Computer Research in Music and Acoustics, Department of Music, Stanford University, Stanford, CA, USA
3 Stanford Neurosciences Institute, Stanford University, Stanford, CA, USA

PII: S0028-3932(19)30077-6
DOI: https://doi.org/10.1016/j.neuropsychologia.2019.03.022
To appear in: Neuropsychologia
Received: 9 June 2018; Revised: 2 February 2019; Accepted: 28 March 2019

Corresponding author: Keith Cross, e-mail: kcross2@hawaii.edu, phone: 310-740-1845
Figures: 5; Tables: 3
Abbreviated title: Auditory Rhyme Processing in Freestyle Rap
Keywords: improvisation; Hip-Hop; plasticity; language; music; syntax

Abstract

Music and language processing share and sometimes compete for brain resources.
An extreme case of such shared processing occurs in improvised rap music, in which performers, or 'lyricists', combine rhyming, rhythmic, and semantic structures of language with musical rhythm, harmony, and phrasing to create integrally meaningful musical expressions. We used event-related potentials (ERPs) to investigate how auditory rhyme sequence processing differed between expert lyricists and non-lyricists. Participants listened to rhythmically presented pseudo-word triplets, each of which terminated in a full-rhyme (e.g., STEEK, PREEK; FLEEK), half-rhyme (e.g., STEEK, PREEK; FREET), or non-rhyme (e.g., STEEK, PREEK; YAME), then judged each sequence in its aesthetic (Do you 'like' the rhyme?) or technical (Is the rhyme 'perfect'?) aspect. The phonological N450 showed rhyming effects between conditions (i.e., non vs. full; half vs. full; non vs. half) similarly across groups at the parietal electrodes. However, concurrent activity at frontocentral electrodes showed left-laterality in non-lyricists, but not lyricists. Furthermore, non-lyricists' responses to the three conditions were distinct in morphology and amplitude at left-hemisphere electrodes with no condition difference at right-hemisphere electrodes, while lyricists' responses to half-rhymes they deemed unsatisfactory were similar to full-rhymes at left-hemisphere electrodes, and similar to non-rhymes at right-hemisphere electrodes. The CNV response observed while waiting for the second and third pseudo-word in the sequence was more enhanced for aesthetic rhyme judgment tasks than for technical rhyme judgment tasks in non-lyricists, suggesting their investment of greater effort in aesthetic rhyme judgments. No task effects were observed in lyricists, suggesting that aesthetic and technical rhyme judgments may engage the same processes for experts.
Overall, our findings suggest that extensive practice of improvised lyricism may uniquely encourage the neuroplasticity of integrated linguistic and musical feature processing in the brain.

1 Introduction

While neuropsychological evidence has demonstrated that music and language are neuroanatomically separable abilities (Peretz and Coltheart, 2003), there is also ample evidence that certain brain areas such as Broca's area and its right-hemispheric homotope, as well as bilateral auditory cortices, are involved in the processing of both music and language (Patel et al., 1998; Steinbeis and Koelsch, 2008), and that music and language can compete for cognitive resources (Koelsch et al., 2005). Studies of the relationship between the processing of pitch in music and language reveal transfer effects between both cognitive domains. For example, enhanced perception of pitch manipulations in speech has been observed in both adult musicians (Alexander, Wong, & Bradlow, 2005) and child musicians (Magne et al., 2006), compared to their non-musician counterparts. Conversely, the perceptual demands of tonal languages give tonal language speakers an advantage in musical pitch perception as compared to speakers of non-tonal languages (Bidelman et al., 2013; Bradley, 2012). Several studies have investigated singing to determine the degree of overlap (if any) between music and language processing in the brain (Besson et al., 1998; Gordon et al., 2010; Poulin-Charronnat et al., 2005). A number of these studies have focused on the relationship between the melodic and lexical (i.e., lyric) aspects of song. For example, Schön et al. (2010) provided evidence that lexical/phonological and melodic processing occur within a common network of brain regions, and that these regions are engaged to different degrees depending on whether the attention task involves sung words, spoken words, or non-linguistic melodic vocalizations.
Studies focused on the aesthetic processing of lyrics have demonstrated that the presence of lyrics can affect the perception of sad music but not of happy music (e.g., Ali & Peynircioğlu, 2006; Brattico et al., 2011). Other studies have focused on the relationship of music and speech with respect to rhythm (Cason and Schön, 2012; Schön and François, 2011; Stahl et al., 2013, 2011). For example, Cason & Schön (2012) found that phonological processing is enhanced when the rhythmic and metric expectations set by a preceding stimulus are met. Collectively, these studies suggest that insights regarding the relationship between music and language processing in the brain can differ depending on the song feature investigated (e.g., pitch, melody, rhythm, emotion).

Another important linguistic feature in songs, rhyme, has been relatively underexplored with respect to its relation to music. Rhyme is common in song lyrics, wherein the pairing and placement of words with phonologically congruent endings entails the integrated processing of speech and music (especially rhythm). An extreme case of such integration occurs in improvisational rap performance, otherwise known as 'freestyle lyricism', where expert performers, 'lyricists', spontaneously generate coherent expressions that adhere to rhyming, rhythmic, and semantic language constraints and are coordinated with musical rhythmic, harmonic, and phrasing structures in the creation of highly sophisticated musical expressions in real time. Neurocognitive research on improvisational rap musicianship is virtually non-existent, except for one neuroimaging study (Liu et al., 2012), which used fMRI to investigate creative behavior by examining the brain activity of expert freestyle lyricists during their performance of improvised lyrics, and comparing this to their brain activity during their performance of pre-rehearsed lyrics.
The results showed that improvised lyrical performance is characterized by dissociated activity in medial and dorsolateral prefrontal cortices, suggesting that improvisational creativity is associated with the absence of conscious monitoring and volitional control, in line with similar findings regarding improvisation in jazz pianists (Limb and Braun, 2008). However, the study does not answer a fundamental question as to what cognitive functions are involved in lyric practice. Theoretically, freestyle lyricists must exercise acute awareness of both musical and speech structures at many different levels, combined with rapid lexical search processing to meet the rhythmic constraints at a given moment. The key to successful and meaningful improvisation relies mainly on three formulaic parallelisms (Kiparsky, 1973), fulfilled between corresponding rap phrases: 1) similarity in the number of syllables assigned to a beat; 2) similarity in prosody (i.e., distribution of strong and weak syllables); and 3) identity or similarity in phonological material, in the case of rhyme. Amongst these three, however, rhyme is the central feature of freestyle lyricism, such that other linguistic and rhythmic considerations are subordinated to its required fulfillment at salient beats: rap is typically performed in a 4/4 time signature, wherein the final beat in even-numbered measures must almost always coincide with a completed rhyme (Alim, 2003; Bradley, 2017). Figures 1A and 1B illustrate the decision-making processes involved in negotiating the constraints of rhyme in real time. In Figure 1, each syllable is assigned to a 16th-note for ease of illustration. During improvisation, a freestyle lyricist can start the first phrase, but might not know in that instant to which part of the phrase she will create a corresponding rhyme later in the second phrase. This can be the
This can be the AC C word in the final beat of the first phrase, but could potentially be words from any part of that phrase. In case A, the corresponding rhyme is drawn from the final part of the first phrase and made at the same place in the second phrase, while in case B, the rhyme is drawn from the first downbeat in the first phrase and made at the last part of the second phrase. This type of operation requires the lyricists to retain previously uttered phonological sequences in working memory, construct matching or approximate sequences from lexical items in long-term memory, and make quick determinations of which of the eligible sequences creates the most technically and/or aesthetically acceptable rhyme while mapping these onto the concurrent rhythmic context. Given their regular participation in such complex tasks, expert freestyle lyricists may be unique in their auditory processing of rhyme sequences, and in their neural preparation to make rhyme judgments. 4 ACCEPTED MANUSCRIPT Cross & Fujioka, Auditory rhyme processing in freestyle rap [Figure 1] When expert lyricists make a split-second decision on choices of rhyme patterns, their goal is to make the music meaningful and artistically sensible as a whole. Obviously, this goal cannot be achieved by making any sort of rhyme in RI PT random places. What considerations inform such decision making? One strong possibility is that rhyme becomes an important feature that determines musical structure in lyrical practice. In music theory, prolongation refers to the process whereby relative stability of pitch events in local and global contexts is encoded (Jackendoff, 2009; Schenker, 1979). Lerdahl (2001) proposes that when considered within an “analysis of the hierarchical patterns of recurrence of sounds” (p. M AN US C 349), poetry is viewed and processed as music. Stress hierarchies in poetry are the counterpart to pitch hierarchies in music. 
Lerdahl elaborates that prosodic prolongation (i.e., recurrence) in poetry can be viewed as a kind of atonal musical prolongation, in which "the elements being prolonged are not pitch structures but degrees of timbral similarity mediated by stress within the prosodic hierarchy" (p. 350). Rhyme, according to Lerdahl, is the strongest class of prosodic prolongation in poetry as it is the strongest form of prosodic repetition, followed by weaker forms of repetition like alliteration and assonance, followed by non-repetition (i.e., of vowels or consonants). Thus, rhyme sequences such as those in freestyle rap may help us understand how lyricists play on words to create temporally and hierarchically organized sound patterns to imply complex musical structure.

Another factor to consider is that putting decent rhyme sequences together must require aesthetic processing in lyricists. Aesthetic appraisal is a hallmark of artistic experience in general, regardless of its modality (e.g., visual, literary, musical, etc.). Recent empirical studies have revealed that aesthetic evaluation of artistic objects engages a different set of psychological and neurological processes from those for cognitive evaluation, for both audience and performers, while expert performers develop a strong sense of aesthetics facilitated by neuroplastic changes through long-term experience across sensory cortices and reward- and memory-related limbic systems (Brattico and Pearce, 2013; Chatterjee and Vartanian, 2014). In terms of musical structure, listeners acquire knowledge via statistical learning and build expectations for the next event given the preceding context (for a review, see Ettlinger, Margulis, & Wong, 2011; Rohrmeier & Rebuschat, 2012). Thus, expert lyricists may have also developed specific tastes for expectation of rhyme-related musical structure, in parallel to their technical sensitivity to rhyme.
Neural correlates of auditory rhyme processing and preparation for rhyme judgment have been examined in previous event-related potential (ERP) studies using the phonological N450, an ERP component sensitive to violation of phonological expectations (Rugg, 1984), and the contingent negative variation (CNV), an ERP component related to anticipation of critical events (Walter et al., 1964). The N450 is considered to be a variant of the N400, an ERP component sensitive to the violation of expectations induced by semantic context (Dumay et al., 2001; Kutas and Federmeier, 2011), although the N450 itself appears solely sensitive to phonological manipulations (Coch et al., 2008; Wagensveld et al., 2012). In rhyme judgment tasks using word pairs, the N450 response after the onset of the second word stimulus is greater to non-rhyming words than to rhyming words. This difference is referred to as the N450 rhyming effect (RE). In parietal scalp regions, REs manifest as a more negative voltage to non-rhymes as compared to rhymes, with the polarities reversed in frontal regions. Despite its name, the RE has also been observed more broadly in response to alliteration tasks, manifesting when the initial sound of a target word does not match that of the prime (Praamstra et al., 1994), and when sentences end with words with phonologically improbable onsets (Perrin and García-Larrea, 2003). REs have been observed in relation to real and pseudo-words (Rugg, 1984), and in response to both visual and auditory stimuli (Coch and Grossi, 2002; Grossi et al., 2001; Praamstra and Stegeman, 1993).
In the visual modality, REs begin 250–300 msec after target onset, peak at 400–450 msec, and are distributed maximally across midline and right temporo-parietal sites (Coch et al., 2005; Rugg, 1984); in the auditory modality, REs may have an earlier onset and a more symmetrical distribution (Coch et al., 2005). By obtaining REs even when primes and targets are spoken by different voices, Praamstra and Stegeman (1993) found that REs are determined by phonological factors, and not dependent upon physical-acoustic match. By demonstrating that REs could be obliterated when subjects attended to melodies simultaneously presented with rhyming pairs, Yoncheva et al. (2013) showed that REs depend, to some extent, on attention to phonology. Wagensveld et al. (2012) found that rhyme judgment performance was impaired in relation to the degree of phonological overlap shared by word pairs that do not rhyme (e.g., bell/ball), although this finding was only manifest behaviorally (i.e., in response times), whereas the N450 response was similar for non-rhyme pairs with and without phonological overlap. Of particular importance to the present study are the findings of Coch et al. (2005), who sought to investigate the developmental course of phonological processing abilities. They found that the onset of REs was earlier in adults and in children whose phonological awareness scores were higher than those of their age-level counterparts. Thus, REs may also be correlated with experience with rhyme.

The CNV is a slow negative wave that occurs during the time interval between a warning stimulus and a subsequent critical stimulus, associated with anticipation of the latter. The CNV appears most prominently at the vertex and is laterally symmetrical over the two hemispheres (Tecce, 1972), although this can vary with context (e.g., human developmental stage, Grossi et al., 2001).
Several studies link the CNV to processes for estimation of time (Macar and Besson, 1985; Macar and Franck, 2004; van Rijn et al., 2011). The CNV can be elicited either with a requirement of a motor response (Gaillard, 1977; Walter et al., 1964), or without one when the stimuli are conceptually linked to each other (Ruchkin et al., 1986; van Boxtel and Brunia, 1994). When the warning stimulus provides information about the nature of an upcoming task, the CNV is modulated according to task type, likely indexing different neural states for preparation (Falkenstein et al., 2003). Relevant to the present study, Brattico and colleagues used alternating task cues and asked non-musician participants to listen to musical chord sequences varying in their level of harmonic congruence, and to respond 'yes' or 'no' to whether they liked the last chord or not (an aesthetic judgment task), or alternately whether they thought the chord sounded correct or incorrect (a descriptive judgment task) (Brattico et al., 2010). While the ERP responses after the onset of the last chord reflected their answers, the response during the presentation of the first chords showed differences between the tasks, suggesting that the listening mode matters before hearing the critical stimulus. Further, using the same paradigm, Müller et al. (2010) investigated how aesthetic judgment and expertise interact with the CNV response, as observed after the task cue before the onset of the chord sequence, and during the sequence towards the critical chord. During the cue interval, expert musicians exhibited an enhanced CNV for the aesthetic task compared to the correctness task, suggesting more effort in preparation for aesthetic processes, whereas such a task effect was absent in non-musicians.
In contrast, the experts showed no task-related difference once the chord sequence started, whereas non-musicians showed a larger CNV for the correctness task. Their results highlight an interaction between expertise and neural preparation for different types of judgment tasks.

In the present study, we compared expert freestyle lyricists and non-lyricists in their behavioral and electrophysiological responses to rhyme judgment tasks, to better understand the relationship between music and language processing, and how this relationship might be modulated by expertise in freestyle lyricism. Importantly, the demands of different performance styles and musical learning strategies significantly affect auditory processing, as shown in the enhanced preattentive auditory evoked responses in instrumental jazz musicians, compared to those who play rock and classical music styles (Vuust et al., 2012). The results likely reflect highly sophisticated musicianship in jazz musicians, who deal with the huge demands of jazz improvisation techniques, such as incorporating complex chord and rhythmic changes, as well as rich harmonies, on the fly. Similarly, amongst numerous styles of rap musicianship, rappers who perform by improvisation may achieve unique expertise in processing rhyme. We hypothesized that expert freestyle lyricists, because of their habitual use of rhyme in real time, would differ in their rhyme judgment preparation and processing as reflected in the N450 and CNV. Specifically, considering that rhyme sequences might be processed (according to their structural rules) like musical syntax, we anticipated that expert lyricists, like expert musicians, may show enhanced N450 rhyme effects to stimuli that violated established rhyme contexts. Furthermore, because of lyricists' frequent use of half-rhymes with phonetically similar endings (e.g.,
snack/hat), we expected that lyricists' N450 responses to half-rhyme might be more similar to those to full-rhyme (e.g., snack/pack), as compared to non-lyricists. Finally, we expected the two groups to differ in their neural preparation to make aesthetic vs. technical rhyme judgments as reflected in the CNV, given that lyricists must determine what will be aesthetically pleasing and/or technically acceptable as a rhyme. To test the above hypotheses, we designed a set of pseudo-words and presented them auditorily as a sequence of triplets in which the first and second stimuli always rhymed perfectly, but the third stimulus varied in the degree of rhyming (see Methods). Importantly, this triplet design required participants to evaluate the level of rhyme cohesion in a temporally organized sequence rather than make a simple judgment of rhyme or non-rhyme between a pair of words, and thus enabled us to compare novice and expert judgment under conditions ecologically similar to those in which rhyme is encountered in rap practice. We recorded electroencephalography (EEG) from a group of expert freestyle lyricists and demographically similar non-lyricists while they listened to these stimuli and performed aesthetic and technical rhyme judgment tasks, which were cued visually at the beginning of each triplet presentation. Pseudo-words were used to avoid any confounding semantic processing that may accompany real words. Pseudo-words were designed such that most of the representative sounds in American Standard English were present in the stimulus set. The same stimulus was used as part of the prime (the first and second stimuli within a triplet) or the target (the third stimulus) across different trials. We examined the RE after the third target stimulus, and the CNV before the second and third stimuli.

2 Material and Methods

2.1 Participants
Seventeen freestyle lyricists and 17 non-lyricists were recruited from Stanford University and the surrounding communities. Inclusion criteria for both groups were: right-handed, native speaker of American English, standardized phonological test performance comparable to age-related normative data, no substantial experience in instrumental music playing, and no history of neurological, psychological, or hearing problems. Non-lyricists had no freestyle lyric performance experience. For the lyricist group, a minimum of 7 years of experience with demonstration of highly accomplished freestyle lyric performance was required. Potential participants in the lyricist category were pre-screened by an audition with a short improvised lyrical performance in front of the first author, who had 23 years of public performance experience in freestyle lyricism at the time of the study. Performances were evaluated by the first author and scored across four dimensions (rhythmic performance, cohesiveness of content, flow and continuity, avoidance of crutch phrases), each of which had a 1–4 scale, a score of 4 being the highest. Dimension scores were averaged, and only lyricists with more than 3 points were invited to participate in the study in the lyricist category. The four dimensions are an adaptation of those used by Liu et al. (2012). Our final samples consisted of 11 lyricists (average age 30.0, SD = 3.2 years) and 10 non-lyricists (average age 30.8, SD = 5.6), after discarding datasets due to excessive noise or non-compliance during the EEG testing, as well as failure to fulfill the above requirements. Lyricists had on average 14.5 years of improvisational experience (SD = 4.7), 8.1 weekly practice hours (SD = 6.5), and no substantial instrumental music experience (mean 0.2 years, SD = 0.6).
Non-lyricists had no lyrical experience and no substantial instrumental music experience (mean 0.5 years, SD = 1.1). Procedures were explained to participants before the test session and any questions were addressed. All participants signed consent forms after they understood the nature of the tasks in the study. All methods and procedures were approved by the Stanford University Institutional Review Board.

2.2 Phonological Processing Test

All participants were administered the core subtests of the Comprehensive Test of Phonological Processing (CTOPP; Wagner et al., 1999), with the exception of the Phonological Awareness core, which was replaced by the CTOPP Alternate Phonological Awareness subtests: Blending Nonwords (i.e., combining individual phonemes to construct a pseudo-word) and Segmenting Nonwords (i.e., deconstructing pseudo-words into individual phonemes). These subtests, which assess phonological awareness exclusively with pseudo-words, were chosen because of their compatibility with our task in the EEG recording. Composite scores of CTOPP subtests reflect an examinee's ability relative to the constructs incorporated into the CTOPP: Phonological Awareness, Phonological Memory, and Rapid Symbolic Naming. Percentile rank scores for the phonological memory, alternate phonological awareness, and rapid symbolic naming composites from the CTOPP were analyzed to ensure that only participants who possessed normal phonological processing ability were included in the final samples.

2.3 Stimuli

A master list of 60 full-rhyming triplets was constructed from 180 unique pseudo-words. Pseudo-words were designed such that all possible American English vowels and word-final consonants were represented in the stimulus, with the exception of the consonant /ʒ/ (e.g., beige), which was excluded due to its infrequency of occurrence in English. Vowel-consonant (VC) combinations were not exhaustive.
All pseudo-words were monosyllabic, containing only one vowel and terminating in a single consonant. Pseudo-words containing the r-colored vowel /ɝ/ followed by a consonant are the only exception (e.g., gɝv, pronounced "gerv"). Unlike other r-colored English sounds in which vowels are followed by /ɹ/ with some r-coloring at the end of the vowel (e.g., beard), /ɝ/ has r-coloring throughout, which is why it is included in the stimulus as its own vowel (Ladefoged and Johnson, 2011), and permitted to precede a final consonant. Additionally, the onsets and final consonants of individual pseudo-words, and the onsets of targets and their immediate predecessors, were precluded from containing identical phonological material (e.g., skak would not be acceptable, nor would the sequence yeek, yame). To ensure that no pseudo-word in the master list could be misidentified with an actual English word (e.g., 'chope' perceived as 'choke'), each of the 180 pseudo-words was screened by 10 native American-English speakers prior to the EEG experiment. For each pseudo-word, screeners listened to an audio recording and were asked to respond "yes" or "no" to whether the pseudo-word sounded as if it could be identified with any English word with which they were familiar, in any dialect. Of the 180 pseudo-word stimuli, 117 (65%) had an inter-rater reliability of 90% or higher, 147 (82%) had an inter-rater reliability of 80% or higher, and 158 (88%) had an inter-rater reliability of 70% or higher. All 180 pseudo-words were included in the final stimulus sequence, such that the target position was occupied by words with the highest inter-rater reliability (> 70%) for negative responses to possible misidentification with actual English words.
The rest of the pseudo-words, with less than 70% inter-rater reliability, were used in position one (19 pseudo-words, 31%) and in position two (3 pseudo-words, 5%). Given these precautions, and that we informed participants that the stimulus contained only pseudo-words, it is highly unlikely that any pseudo-words in the stimulus were mistaken for actual words.

[Figure 2]

From the master list of 60 full-rhyming triplets, an additional two lists of 60 triplets were created, one representing a non-rhyme condition, and the other, a half-rhyme condition. In total, the stimulus contained 180 triplets, with 60 triplets in each of three conditions: full-rhyme, half-rhyme, and non-rhyme (see Figure 2). Each prime (a rhyming couplet) and each target occurred only once in the full-rhyme, once in the non-rhyme, and once in the half-rhyme condition, such that 60 unique primes were paired with 60 unique targets in each condition. In the full-rhyme condition, pseudo-word primes and targets had identical vowels and final consonants (e.g., speet, freet). The confusability matrix from Cutler et al. (2004) was consulted in the creation of the half-rhyme and non-rhyme conditions. Confusability has proven a reliable index of perceptual similarity between two items, and confusion matrices, which organize speech sounds in terms of the percentages with which they were confused with other speech sounds (e.g., /k/ mistaken for /t/) experimentally in noise, are a useful way of measuring and analyzing perceptual similarity (Johnsen, 2011). In our stimulus, half-rhyme consonants were paired according to the highest
rate of confusability in the Cutler et al. (2004) matrix, while non-rhyme vowels and consonants were paired according to the lowest rate of confusability. In the half-rhyme condition, primes and targets had identical vowels but differed in final consonant. Additionally, the final consonant of the target was perceptually similar to that of the prime (e.g., yeek, freet) in 30 triplets, and differed in voicing from that of the prime (e.g., chibb, yith) in 30 triplets. In the non-rhyme condition, pseudo-word primes differed from targets in both vowel and final consonant, and the final consonant was perceptually dissimilar to that of the prime (e.g., zug, freet).

Triplets were grouped into three primary blocks, each block containing 60 triplets (20 from the full-rhyme, 20 from the half-rhyme, and 20 from the non-rhyme condition). Each triplet in these three blocks was assigned to each of two judgment tasks (described in Procedure). Thus, six blocks in total were created, representing the three stimulus conditions by the two judgment tasks. In each block, pseudo-word triplets were presented in random order (with the order of pseudo-word triplets in each block fixed across participants). A second randomized version of each of the six blocks was created to produce 12 blocks in total. The order of task presentation was counterbalanced across participants.

Stimuli were spoken by a female native speaker of American English who was born in California and had lived as a child and adolescent in two different states. Pseudo-words were digitally recorded (44.1 kHz, 16-bit resolution) using an AKG C460B microphone with an omni CK-62 capsule, connected to an IBM-compatible computer running a sound editing program (Audacity, v. 2.0.5). Each pseudo-word triplet was stored in a separate file and was carefully edited for precise time of onset to permit synchronization with EEG digitization. During the experiment, pseudo-words were presented by STIM2 software (Compumedics Neuroscan Inc., El Paso, TX, USA), and delivered binaurally using insert earphones (ER-1, Etymotic Research, Elk Grove Village, IL) at a comfortable listening level of 85 dB SPL on average.
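The confusability-based pairing described above amounts to a simple selection over a confusion matrix: the most confusable final consonant yields a half-rhyme partner, and the least confusable yields a non-rhyme partner. The sketch below illustrates only that selection logic; the matrix values are invented placeholders, not data from Cutler et al. (2004).

```python
# Toy sketch of consonant pairing via a confusion matrix.
# The percentages are INVENTED for illustration and are NOT values
# from Cutler et al. (2004); only the max/min selection logic
# reflects the procedure described in the text.
TOY_CONFUSABILITY = {
    # final consonant -> {candidate consonant: % confused in noise}
    "k": {"t": 38.0, "p": 21.0, "g": 6.0},
    "g": {"d": 33.0, "b": 17.0, "s": 2.0},
}

def half_rhyme_final(consonant):
    """Half-rhyme targets take the MOST confusable final consonant."""
    row = TOY_CONFUSABILITY[consonant]
    return max(row, key=row.get)

def non_rhyme_final(consonant):
    """Non-rhyme targets take the LEAST confusable final consonant."""
    row = TOY_CONFUSABILITY[consonant]
    return min(row, key=row.get)
```

Under this toy matrix, a prime ending in /k/ (as in yeek) would receive a half-rhyme target ending in /t/ (as in freet), mirroring the paper's example pair, though the actual selections were made from the published matrix.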
[Figure 3]

The timeline of stimulus presentation in a single trial is illustrated in Figure 3. During the EEG recording, a fixation cross appeared in the middle of a white screen in front of the participant; 800 ms later, the fixation cross disappeared and the task cue ('like?', 'perfect?') appeared and remained on the screen for 4900 ms. A pseudo-word triplet was played through stereo headphones, beginning 67 ms after the appearance of the cue and lasting for 3200 ms. Stimulus onset asynchrony (SOA) between consecutive pseudo-words in a triplet was fixed at 1200 ms. After the last pseudo-word stimulus, participants provided an answer. Approximately 1700 ms after the completion of the last pseudo-word, the cue disappeared from the screen. The interval between the disappearance of the cue and the start of the following trial was varied and uniformly distributed between 2000 and 3000 ms (see Figure 3). Behavioral responses and latencies were recorded starting at target onset.

2.4 Task and Procedure

Participants were instructed not to blink and to "keep their eyes on the cue" while the sounds were playing and the cue was on the screen, and to respond "yes" or "no" to each task. Responses were made using a keypad on which two different buttons were assigned to the "yes" and "no" responses. The buttons were pressed by the thumbs of the left and right hands, respectively. The assignment remained the same throughout the testing session within each participant, but was counterbalanced across all participants. If the 'perfect?' cue was presented, participants were instructed to respond to whether they thought the third pseudo-word created a perfect rhyme with the previous two pseudo-words. If the 'like?' cue was presented, participants were instructed to respond to whether they liked the third pseudo-word as a rhyme with the previous two pseudo-words.
They were asked to respond as swiftly and as accurately as possible. No additional instructions were given.

2.5 Behavioral Data Analysis

Individual mean response times and response frequencies were computed separately for Condition (full, half, non), Task (perfect, like), and Response (yes, no). In calculating individual response times, instances were excluded if a response occurred earlier than 200 ms from the onset of the target vowel or later than the next trial start (about 4.4 to 5.4 s), or if a response fell outside of 2 standard deviations from the individual mean response time across all conditions. Response times as well as the ratio of yes/no responses were analyzed by a mixed-design analysis of variance (ANOVA) with one between-subject factor, Group, and three within-subject factors, Condition, Task, and Response. Following the ANOVAs, additional post hoc t-tests with Bonferroni correction were conducted.

2.6 EEG Recording Procedure and Data Analysis

EEG recordings were conducted with participants seated in an electrically and acoustically shielded room, while the experimenter monitored their compliance through a window of the shielded room from the next room. The EEG was continuously recorded using a Neuroscan SynAmps RT amplifier and Curry 7 acquisition software with a whole-head 64-channel Quikcap (Compumedics Neuroscan Inc., El Paso, TX, USA), with a DC to 200 Hz bandwidth at a sampling rate of 500 Hz. All scalp electrodes were referenced during the recording to an electrode located between the CPz and Cz electrodes, but were re-referenced offline to the common average reference. Electrodes were also placed above and below the left eye, and at the left and right temples, to record the vertical and horizontal electrooculogram (EOG).
All electrode impedances were maintained below 10 kΩ, as the SynAmps RT amplifier is tolerant of higher impedances while still obtaining a good signal-to-noise ratio. To remove eye artifacts, the signal-space projection technique in the Brainstorm toolbox (Tadel et al., 2011) was used to construct a set of projectors per participant, based on 400-ms epochs centered around detected stereotypical eye artifacts (blinks and movements) in the horizontal and vertical EOGs. These projectors were applied to all continuous data blocks when single epochs for the CNV and N450 responses were extracted. Bad channels were rejected if they exceeded a peak-to-peak threshold of ±70 µV.

To observe the CNV, ERPs were extracted using a time window of −100 to 4000 ms, with a 100-ms prestimulus baseline before the onset of the triplet, averaged separately for the 'perfect' and 'like' tasks. Note that this analysis was concerned with the brain response preceding the second and third stimuli, and required no separate averaging for the full-, half-, or non-rhyming conditions or for the different types of behavioral response made after the third stimulus. To increase signal quality, responses at six scalp areas were further averaged in the following electrode clusters: frontocentral left (fcl: F7, F5, F3, FT7, FC5, FC3, T7, C5, C3), frontocentral midline (fcm: F1, Fz, F2, FC1, FCz, FC2, C1, Cz, C2), frontocentral right (fcr: F4, F6, F8, FC4, FC6, FT8, C4, C6, T8), parietal left (pl: TP7, CP5, CP3, P7, P5, P3, PO7, PO5), parietal midline (pm: CP1, CPz, CP2, P1, Pz, P2, PO3, POz, PO4), and parietal right (pr: CP4, CP6, TP8, P4, P6, P8, PO6, PO8). The presence of the CNV was assessed using a 400-ms time window between the first and second stimuli, as well as a 400-ms window between the second and third stimuli, at the fcm and pm electrode clusters, the only two sites where a CNV negativity was observed.
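The cluster averaging and windowed mean-amplitude computation just described can be sketched as follows. This is a minimal illustration in Python/NumPy, not the authors' actual pipeline; the array shapes, channel-name handling, and synthetic data are our assumptions, while the sampling rate, epoch bounds, and the fcm/pm channel lists follow the text.

```python
import numpy as np

# Minimal sketch (not the authors' pipeline): average single-channel ERPs into
# scalp clusters, then take the mean amplitude within fixed CNV windows.
# Sampling rate and epoch bounds follow the text (500 Hz; -100 to 4000 ms
# relative to triplet onset); the data below are synthetic.

FS = 500                    # sampling rate (Hz)
T0 = -0.100                 # epoch start relative to triplet onset (s)

CLUSTERS = {
    "fcm": ["F1", "Fz", "F2", "FC1", "FCz", "FC2", "C1", "Cz", "C2"],
    "pm":  ["CP1", "CPz", "CP2", "P1", "Pz", "P2", "PO3", "POz", "PO4"],
}

def cluster_erp(erp, ch_names, cluster):
    """Average an (n_channels, n_samples) ERP over one electrode cluster."""
    idx = [ch_names.index(ch) for ch in cluster]
    return erp[idx].mean(axis=0)

def window_mean(trace, t_start, t_end):
    """Mean amplitude of a 1-D trace between two latencies (seconds)."""
    i0 = int(round((t_start - T0) * FS))
    i1 = int(round((t_end - T0) * FS))
    return trace[i0:i1].mean()

# Synthetic example: 64 channels; the first nine stand in for the fcm cluster.
ch_names = [f"ch{i}" for i in range(64)]
ch_names[:9] = CLUSTERS["fcm"]
erp = np.zeros((64, round(4.1 * FS)))
erp[:9, :] = -2.0           # a flat -2 microvolt "CNV" on the fcm channels

fcm_trace = cluster_erp(erp, ch_names, CLUSTERS["fcm"])
cnv_pre2 = window_mean(fcm_trace, 0.800, 1.200)  # 400-ms window before second word
cnv_pre3 = window_mean(fcm_trace, 2.000, 2.400)  # 400-ms window before the target
```

Averaging over a cluster before extracting windowed amplitudes is a common way to improve the signal-to-noise ratio of slow potentials such as the CNV, at the cost of spatial resolution.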
The mean amplitude of the ERPs in each window was then separately submitted to a mixed-design ANOVA, with the between-subjects factor Group (lyricists, non-lyricists) and the within-subject factors Task (perfect, like) and Electrode cluster (fcm, pm).

The RE in the N450 was analyzed only in the perfect task, in which participant responses were concerned with the "correctness" of a given rhyme sequence. The RE was assessed as the difference between the N450 responses to the non-rhyme versus full-rhyme conditions, the half-rhyme versus full-rhyme conditions, and the non-rhyme versus half-rhyme conditions. Note that the acoustic information relevant for determining rhyme congruence begins at the vowel onset, after the initial consonant cluster. Thus, the RE in the N450 specifically phase-locked to that time point was examined (abbreviated hereafter as RE-v and N450-v, respectively). The PRAAT software (Boersma, 2002) was used to identify the approximate vowel-onset time in all individual pseudo-words, which ranged between 27 ms and 229 ms from the onset of the stimulus itself. This vowel onset was set as time zero for the third stimulus, and the ERPs for N450-v were obtained separately for the full-rhyme, half-rhyme, and non-rhyme conditions and visualized in a time window of −50 to 1500 ms. Following the recommendation of Keil et al. (2014) for finding a region of interest without selection bias, the ERPs across all three conditions from both groups were averaged into one grand average to extract the peak latency of N450-v at the pm electrode site, where the N450-v was expected to appear.
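The vowel-onset realignment and bias-free peak-latency extraction described above can be sketched as follows. This is an illustrative Python example under our own assumptions (the function names and the synthetic trace are ours, not from the study); the sampling rate, the −50 to 1500 ms window, and the pooling-before-peak-picking strategy follow the text.

```python
import numpy as np

# Illustrative sketch (assumed, not the authors' code): re-align each target
# epoch to its measured vowel onset, then locate the N450-v peak from a single
# grand average pooled over conditions and groups (cf. Keil et al., 2014).

FS = 500  # sampling rate (Hz)

def realign_to_vowel(epoch, vowel_onset, t0, win=(-0.050, 1.500)):
    """Cut a (-50, 1500) ms window around the vowel onset.

    epoch       : 1-D array sampled at FS, starting at time t0 (s) relative
                  to the onset of the pseudo-word itself
    vowel_onset : vowel-onset time (s) for this pseudo-word (e.g., from PRAAT)
    """
    i0 = int(round((vowel_onset + win[0] - t0) * FS))
    i1 = int(round((vowel_onset + win[1] - t0) * FS))
    return epoch[i0:i1]

def pooled_peak_latency(erps, win_start=-0.050):
    """Latency (ms re: vowel onset) of the most negative point of the
    grand average pooled over all conditions and participants."""
    grand = np.mean(erps, axis=0)
    t = np.arange(grand.size) / FS + win_start
    return 1000.0 * t[np.argmin(grand)]

# Synthetic example: one trace with a negative trough 488 ms after vowel onset.
t = np.arange(round(1.55 * FS)) / FS - 0.050
trace = -np.exp(-((t - 0.488) ** 2) / (2 * 0.05 ** 2))
peak_ms = pooled_peak_latency([trace])
```

Pooling all conditions and groups into one grand average before picking the peak keeps the choice of measurement window independent of the very condition differences being tested.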
Then, a 100-ms time window centered around the peak at 488 ms was used to compute the amplitude in each condition at each electrode site, which was subsequently submitted to a mixed-design ANOVA with the between-subjects factor Group (lyricists, non-lyricists) and the within-subject factors Condition (full, half, non) and Electrode cluster (fcl, fcm, fcr, pl, pm, pr). Note that for visualization purposes a 40-Hz low-pass filter was applied, but the statistical examination was always based on the unfiltered data. All p-values were adjusted with the Greenhouse-Geisser epsilon correction for nonsphericity when necessary.

3 Results

3.1 CTOPP scores

Percentile rank scores for the phonological memory, alternate phonological awareness, and rapid symbolic naming composites from the CTOPP are summarized in Table 1 for lyricists and non-lyricists. Two-sample t-tests revealed no significant difference between lyricists and non-lyricists in phonological processing ability in any of the three CTOPP subtests: phonological memory [t(19) = 1.66, p = 0.113], alternate phonological awareness [t(19) = 0.86, p = 0.402], and rapid symbolic naming [t(19) = 0.61, p = 0.546].

[Table 1]

3.2 Behavioral Performance in Rhyme Judgment Tasks

The response times in the two types of rhyme judgment task during EEG recording are reported in Table 2. A mixed-design ANOVA with a between-subject factor, Group (lyricists, non-lyricists), and three within-subject factors, Task (like, perfect), Condition (full, half, non), and Response (yes, no), revealed a main effect of Condition [F(2,38) = 11.05, p < 0.001, ηp2 = 0.368]. This main effect reflects that response times in the non-rhyme condition were significantly faster than response times in both the full-rhyme condition [t(20) = 2.66, p < 0.05] and the half-rhyme condition [t(20) = 4.72, p < 0.001].
A significant Task × Condition interaction [F(2,38) = 4.76, p = 0.014, ηp2 = 0.200] reflects that, in the non-rhyme condition, response times during the "perfect" task were marginally faster than those during the "like" task [t(20) = -1.934, p = 0.087]. There was a Condition × Response interaction [F(2,38) = 28.72, p < 0.001, ηp2 = 0.602], reflecting that "no" responses were significantly slower than "yes" responses in the full-rhyme condition [t(20) = 4.68, p < 0.001], "yes" responses were significantly slower than "no" responses in the non-rhyme condition [t(20) = 4.88, p < 0.001], and "yes" responses were slower than "no" responses in the half-rhyme condition, approaching significance [t(20) = 1.86, p = 0.078]. There was no Task × Response interaction [F(1,19) = 2.887, p = 0.1062, ηp2 = 0.132]. A significant Task × Response × Group interaction [F(1,19) = 5.30, p = 0.033, ηp2 = 0.218] reflects that "yes" responses were faster than "no" responses for non-lyricists only [t(9) = 2.41, p < 0.05] in the "like" task.

Rates of responding "yes" in each task across the three conditions are reported in Table 3. A mixed-design ANOVA with a between-subject factor, Group (lyricists, non-lyricists), and two within-subject factors, Task (like, perfect) and Condition (full, half, non), revealed significant main effects of Task [F(1,19) = 32.4, p < 0.001, ηp2 = 0.630] and Condition [F(1.29,24.5) = 280.16, p < 0.001, ηp2 = 0.936]. The main effect of Task reflects a significantly higher yes-rate in the like task compared to the perfect task [t(20) = 5.82, p < 0.0001]. The main effect of Condition was observed because the rate of responding "yes" increased from the non-rhyme, to the half-rhyme, to the full-rhyme condition.
Post-hoc tests revealed that the rate of responding "yes" in the full-rhyme condition was significantly higher than in the half-rhyme condition [t(20) = 11.49, p < 0.0001], and the same was true when comparing full-rhyme or half-rhyme to non-rhyme [non vs. full: t(20) = 44.67, p < 0.0001; non vs. half: t(20) = 7.53, p < 0.0001]. There was a significant interaction of Group × Condition [F(1.29,24.6) = 3.96, p = 0.048, ηp2 = 0.173]. This interaction mainly reflected a much higher rate of responding "yes" for lyricists in the half-rhyme condition compared to non-lyricists, approaching significance [t(19) = 2.05, p = 0.0540], while the rates of responding "yes" in the full-rhyme and non-rhyme conditions were comparable between the groups. There was also a significant interaction of Task × Condition [F(2,38) = 19.06, p < 0.001, ηp2 = 0.501]. Post-hoc tests revealed that this interaction reflects a significant increase in the rate of "yes" responses in the "like" task compared to the "perfect" task in the half-rhyme condition [t(20) = 6.88, p < 0.0001], while such an increase was only marginally significant in the non-rhyme condition [t(20) = -1.74, p = 0.0775]. No task difference was observed in the full-rhyme condition [t(20) = 0.37, p = 0.717].

[Table 2]

[Table 3]

3.3 Task effects in the CNV

We examined whether CNV-like slow potentials existed during the waiting periods for the second and third pseudo-word stimuli, and how they differed between tasks and groups. Figure 4 shows the grand-average ERP waveforms at the frontocentral midline (fcm) and parietal midline (pm) electrode clusters for the "perfect" and "like" tasks, demonstrating the existence of such slow waves. Two separate ANOVAs were used for the amplitudes during the periods 800-1200 ms between the first and second primes, and 2000-2400 ms between the second prime and the target, using the factors Group (lyricist, non-lyricist), Task (like, perfect), and Electrode cluster (fcm, pm).

For the time window 800-1200 ms, no main effect of Group and no Group × Task interaction were found. However, there was a significant Group × Task × Electrode cluster interaction [F(1,19) = 6.286, p = 0.021, ηp2 = 0.249]. Post-hoc tests showed that at the fcm electrode cluster, a more negative CNV was found for lyricists in "like" tasks as compared to "perfect" tasks, but this did not reach significance [t(10) = 1.741, p = 0.1123], while no such difference was found for non-lyricists. At the pm electrode cluster, in contrast, a more negative CNV for non-lyricists in "like" tasks as compared to "perfect" tasks was significant [t(9) = 2.888, p = 0.0179], while no such difference was found for lyricists.

For the time window of 2000-2400 ms between the second prime and the target, the ANOVA again revealed a marginally significant Group × Task × Electrode cluster interaction [F(1,19) = 3.531, p = 0.076, ηp2 = 0.157], reflecting that, similar to the previous time window, non-lyricists continued to show a significantly more negative CNV during "like" tasks than "perfect" tasks at pm sites [t(9) = 3.889, p = 0.0037], while no such difference was found in lyricists. As seen in Figure 4, the CNV differed between the tasks in both time windows and in both groups, partly due to the carryover of the CNV response preceding the second stimulus onto that preceding the third stimulus.

3.4 Rhyming Effects in the Vowel-Onset Adjusted N450 (RE-v)

RE-v were analyzed by observing the vowel-onset adjusted N450 response when participants answered "yes" to the full-rhyme, and "no" to either the half-rhyme or non-rhyme condition, in the perfect task.
Given that our hypothesis about group differences concerns conscious decision-making about rhyme acceptability (i.e., rather than automatic responses to rhyme), we focused our analysis only on "yes" responses to full-rhymes and "no" responses to non-rhymes, because these can reasonably be assumed to reflect decision making for stimuli that either satisfactorily or unsatisfactorily completed a rhyme sequence. Because the degree to which a half-rhyme satisfies the criteria for completing a rhyme sequence is more subjective, we intended to analyze both "yes" and "no" responses to half-rhymes. However, "yes" responses to half-rhymes could not be included in further analysis because non-lyricists did not respond "yes" in enough trials to obtain ERPs for group comparison. Thus, our half-rhyme analysis refers only to half-rhymes that were deemed unsatisfactory. Additionally, one lyricist was excluded from this analysis, since he responded "yes" to nearly all half-rhyme trials. This analysis therefore included 10 lyricists and 10 non-lyricists.

The amplitude of N450-v in a time window of 438-538 ms was examined using a mixed-design ANOVA of Group (lyricist, non-lyricist), Condition (full, half, non), and Electrode cluster (fcl, fcm, fcr, pl, pm, pr). Figure 5 displays grand-averaged ERP waveforms to full-rhyming, half-rhyming, and non-rhyming targets at the six electrode clusters for lyricists (Figure 5A) and non-lyricists (Figure 5B). While ERP waveforms at parietal sites were similar between groups across all conditions, the groups clearly differed in their responses across conditions at frontocentral sites.
Although there was no main effect of Group [F(1,18) = 0.196, p = 0.663, ηp2 = 0.011], a main effect of Condition [F(1.45, 26.13) = 4.892, p = 0.024, ηp2 = 0.214] and a main effect of Electrode cluster [F(3.10, 55.86) = 16.644, p < 0.001, ηp2 = 0.480] were significant. A significant interaction of Condition × Electrode cluster [F(3.97, 71.56) = 5.847, p < 0.001, ηp2 = 0.245] reflects that the N450-v response is more negative to full-rhyme than to non-rhyme at fcl [t(18) = -3.67, p < 0.01], more negative to full-rhyme than to both half-rhyme [t(18) = -4.31, p < 0.001] and non-rhyme [t(18) = -2.70, p < 0.05] at fcr, and more positive to full-rhyme than to both half-rhyme [t(18) = 6.89, p < 0.001] and non-rhyme [t(18) = 4.58, p < 0.001] at pm. A significant three-way interaction of Group × Condition × Electrode cluster [F(3.97, 71.56) = 2.727, p = 0.036, ηp2 = 0.132] reflects that lyricists have a more positive response than non-lyricists in the non-rhyme condition, approaching significance at both fcr [t(18) = -1.782, p = 0.0917] and pr [t(18) = -1.798, p = 0.0898].

Because topographic maps (see Figures 5C and 5D) indicate that the group contrast in hemispheric asymmetry occurred mainly at frontocentral electrodes, the visible difference between groups in their responses across conditions at fcl and fcr was further analyzed using a mixed-design ANOVA with factors of Group, Condition Difference (non – full; non – half), and Hemisphere (fcl, fcr). There was a main effect of Condition Difference [F(1, 18) = 58.106, p < 0.001, ηp2 = 0.763], because of an overall larger positive contrast for non vs. full compared to non vs. half [t(19) = 7.255, p < 0.001]. There was a marginally significant interaction of Group × Hemisphere [F(1, 18) = 4.390, p = 0.051, ηp2 = 0.196], reflecting an overall larger contrast in the left hemisphere than in the right hemisphere in non-lyricists [t(9) = 3.128, p = 0.012], whereas no hemispheric difference was found in lyricists.
In fact, when comparing the two groups in the left hemisphere only, there was a significantly larger contrast for non-lyricists compared to lyricists [t(18) = -2.484, p = 0.0231]. Additionally, a significant three-way interaction of Group × Condition Difference × Hemisphere [F(1, 18) = 4.814, p = 0.042, ηp2 = 0.211] was found. This interaction reflects that the non-lyricists' strong left laterality was selective to the non – full Condition Difference [t(18) = -2.484, p = 0.0231], while their left laterality was only marginally significant for the non – half Condition Difference [t(9) = 2.059, p = 0.0695]. As described above, no significant hemispheric difference was found in lyricists across contrasts.

4 Discussion

We investigated how behavioral and neural correlates of rhyme judgments differed in expert lyricists and non-lyricists, based on their responses to rhythmically organized rhyme sequences that varied in their degree of rhyme congruence. The main findings are: (1) behaviorally, lyricists and non-lyricists differed greatly in the rates at which they judged half-rhymes acceptable, despite no group difference in standardized phonological test performance; (2) during presentation of target stimuli, prior to target onset, non-lyricists exhibited a more enhanced CNV response to aesthetic rhyme judgment tasks than to technical rhyme judgment tasks at parietal electrodes, while this task-related difference was only weakly present at frontocentral sites in lyricists and did not reach significance; (3) lyricists and non-lyricists differed in their N450 rhyming effects at the frontocentral area, both in how they responded to the three stimulus conditions relative to one another and in how these responses varied by hemisphere in each group.
Lyricists and non-lyricists demonstrated comparable phonological processing ability, according to the constructs incorporated into the CTOPP. The CTOPP is a fairly general measure for screening out deficits in phonological processing, and the scores of our two groups are quite comparable to the age-group normative data for American monolingual English speakers (Wagner et al., 1999). This ensures that any difference observed between the groups in our experimental task was not caused by a pre-existing group difference, and suggests that acquiring lyrical expertise is not necessarily related to exceptional phonological processing ability.

Group response times were statistically similar across tasks and conditions, showing a consistent pattern of solid and reliable judgment for full- and non-rhymes, and increased reaction times in the half-rhyme condition, which requires more detailed phonological processing to distinguish final consonants. No rhyme priming effects (e.g., faster response times) were observed in this study in either the full- or half-rhyme condition relative to the non-rhyme condition, because the presence of half-rhymes in our paradigm required participants to wait through full- and half-rhyming trials until the onset of the final consonant of the target before making a decision, whereas non-rhymes were determined much earlier, at the vowel onset.

More interestingly, for both groups the overall rate of responding "yes" was higher in the 'like' task than in the 'perfect' task, indicating a more liberal judgment of rhyme acceptability by both groups in the 'like' task. Specifically, it is remarkable that lyricists were more than twice as likely as non-lyricists to classify a half-rhyme as 'perfect' and approximately 50% more likely to 'like' a half-rhyme, although the group difference only approached the significance
level. Behaviorally, non-lyricists appear to approach rhyme judgment with identity of phonological material (i.e., full-rhyme) as the focus for binary decision making (e.g., all or nothing), whereas lyricists appear to approach rhyme judgment with degree of similarity as the focus. This group difference may be related to lyrical pragmatism; half-rhymes typically serve as necessary devices for lyricists to satisfy the semantic and rhythmic constraints and creative concerns of freestyle lyric performance, given the limitations of language for expressing all ideas using only full-rhymes. Thus, lyricists may have developed criteria for classifying phonological materials by attending to similarities in the features of phonological segments differently than non-lyricists. In using a binary (yes/no) task, our design is limited in its ability to evaluate group differences with respect to gradients of phonological similarity. Use of a Likert scale might have been more illuminating; however, we were most interested in understanding rhyme judgment processes under conditions ecologically similar to the real-time decision making required in improvisational rap performance, which ultimately requires a binary decision. While the conscious (or possibly pre-conscious) evaluation of phonological similarity may involve fine-grained degree differences, these should quickly culminate in binary decisions to either include or exclude a given word as a rhyme option, depending on whether a certain threshold of phonological similarity has been reached. Group differences with respect to gradients of phonological similarity will be investigated further in the future by examining closely which consonant pairs in our stimuli were processed more differently between groups than other pairs.

The CNV demonstrated differences in task-related preparation within and between groups, interacting with the nature of the upcoming stimuli.
Because the CNV has been characterized as a pre-activation of brain areas required for completing a specific task (Wild-Wall et al., 2007), our evidence suggests that lyricists and non-lyricists prepare for and execute rhyme judgments using brain areas that differ to some extent. In particular, during stimulus presentation, in the moments preceding the critical stimulus, non-lyricists exhibited a CNV to the "like" task that was more negative than that to the "perfect" task at parietal midline electrodes, while no significant task-related difference was observed in lyricists. This result is in line with that of Müller et al. (2010), who also observed task effects in novices but not in experts during stimulus presentation, in the moments preceding the critical stimulus. However, our studies diverge in that our observation of this task difference in laypersons occurred at parietal electrodes rather than frontal ones. Furthermore, whereas Müller et al. (2010) reported a more negative CNV for laypersons during descriptive tasks than during aesthetic tasks, we observed the opposite. Müller et al. (2010) interpreted the more negative CNV as an indicator of laypersons' investment of more mental effort in preparing for descriptive tasks than for aesthetic tasks. Our results may conversely indicate laypersons' investment of more mental effort in aesthetic rhyme judgment tasks. The discrepancies between the findings of Müller et al. (2010) and those of the present study may have to do with differences in the nature of the task and stimulus between their experiment and ours. For example, the notes of a single musical chord (if not an arpeggio) are heard simultaneously, whereas the phonemes of a single pseudo-word are processed serially, which is likely related to more dynamically configured anticipatory processing.
Furthermore, the ways in which tonal relationships determine an appropriate chord sequence might differ from the ways in which phonological relationships determine an appropriate rhyme sequence. In addition to such qualitative distinctions between the experimental paradigms, it is important to note that, as shown in a number of behavioral studies, non-musicians are able to detect violations of musical syntactic rules without knowledge of music theory, but at the conscious level of processing they tend to label the violations with emotional/aesthetic terms such as 'unpleasant' or 'not settling' (Koelsch, 2011; Tillmann and Bigand, 2004). In other words, the two tasks used in Müller et al. (2010) might have been almost identical for non-musicians, likely because their engagement with music has consisted only of listening and has not included producing music or explicitly learning music theory. Rhyme judgment, by contrast, is fairly simple for anyone who has age-appropriate facility with a language. Apart from the novelty of pseudo-words, non-lyricists, being accomplished speakers and literate, have extensive experience both perceiving and producing language, spoken and written. Given non-lyricists' depth of linguistic experience, and even rhyme experience, the observed group differences are likely related to lyricists' consistent engagement with rhyme under various linguistic and musical constraints, and to resources for explicit rhyme-related decision-making that have likely developed over time.

Despite behavioral evidence showing that both non-lyricists and lyricists were more liberal in their aesthetic judgments of rhyme compared to their descriptive judgments, the CNV task effect in lyricists was not clearly present.
In EP our study, we purposely chose the word “perfect” as opposed to correct, allowing for an open-ended interpretation, which in the case of half-rhyme, resulted in lyricists approving of half-rhyme sequences twice as often as non-lyricists. Interestingly, AC C non-lyricists appear to have interpreted “perfect” to mean that rhymes should end in identical phonological material, whereas lyricists appear to have applied different criteria. The lyricists’ behavior is consistent with the use of similar consonants for half-rhyme in a variety of languages and cultures (Kawahara, 2007; Steriade, 2003). Aesthetically speaking, a perfect rhyme might be considered by a lyricist to be more satisfying than a correct rhyme, as the act crafting a suitable half-rhyme might be considered a reflection of creativity and expertise in the artistic employ of speech sound similarity. In this sense, for lyricists, both the “perfect” and “like” tasks may have been aesthetic in nature (albeit with a less stringent threshold in the latter case), potentially engaging the same neurological processes. 20/31 ACCEPTED MANUSCRIPT Cross & Fujioka, Auditory rhyme processing in freestyle rap The two groups showed a clear difference between full-rhyme, half-rhyme (unsatisfactory half-rhyme only) and non-rhyme conditions around 500 ms after the target stimulus, in line with the phonological N450 rhyming effect (RE) in the literature (Rugg, 1984). A significant Group x Condition x Hemisphere interaction suggests that rhyme conditions are RI PT processed by groups differently in different brain regions. Notably, the two groups were similar in the parietal scalp site but differed largely in their frontocentral laterality when contrasting rhyme incongruence to congruence (e.g. non – full, non – half). We observed left-greater-than-right hemispheric asymmetry for rhyme effects in non-lyricists, especially for non – full contrasts. This is actually in line with the findings by Coch et al. 
(2005), who reported left-greater-than-right hemispheric asymmetry for rhyming effects in non – full contrasts. This asymmetry involved anterior sites during both the N240 and N400 time windows and did not vary with age group. Because lyrical expertise must be acquired through massive practice rather than implicit and automatic learning, our non-lyricists should be considered representative of a typically developed normal population, i.e., the samples documented in most studies in the literature. Interestingly, non-lyricists' data show sensitivity to the degree of rhyme congruence only at left-hemisphere electrodes. Specifically, at left frontal electrodes during the N450 time window, non-lyricists' responses to full-rhyme were most negative, followed by responses to half-rhyme, followed by responses to non-rhyme, suggesting that they perceive and process different degrees of rhyme congruence. By contrast, at right frontal electrodes, their responses exhibited no difference across conditions.

Lyricists differed from non-lyricists in at least two respects. First, we observed a significant reduction of the left-hemispheric response for rhyming effects, suggesting that lyricists process rhyme differently in this region. Second, this was accompanied by an absence of laterality, as well as an absence of significant rhyming effects in either hemisphere. However, because lyricists clearly distinguished between non-, half-, and full-rhymes in our behavioral task, and given the nature of their virtuosity, the lack of hemispheric characteristics for condition contrasts should not be taken as an indication of impaired rhyme processing.
Rather, we speculate that because lyricists are accustomed to making decisions about rhyme congruence in real time, their brains must have developed neural populations dedicated to quickly processing phonological distinctions and similarities, and that these neuronal populations are highly capable of executing the computations without much overhead.

Eagan and Chein (2012) conducted a study of working memory relevant to this speculation. They found that the degree to which irrelevant speech (i.e., speech that should be ignored) shared phonetic characteristics (i.e., consonant sounds and articulations) with relevant speech (i.e., speech that should be remembered) predicted the degree to which participants could remember the relevant speech. They also found that when there was a high level of similarity in phonetic material between relevant and irrelevant speech items, participants were more likely to report incorrect items containing phonological material from the irrelevant speech items. The researchers concluded that overlap in phonetic features between relevant and irrelevant speech impairs working memory. Thus, improvised rap practice, which regularly engages working memory in the process of prioritizing phonological information both despite and for the sake of phonological overlap, may have encouraged an adaptation that makes working memory processes much more energy efficient.

In line with this proposed adaptation, our ERP and behavioral data suggest the possibility that lyricists deal with degrees of rhyme using the same neural population, while non-lyricists appear to engage different populations for each rhyme condition. In the behavioral data, lyricists had a much higher rate of responding "yes" in the half-rhyme condition compared to non-lyricists, suggesting different heuristics and/or underlying processes for determining rhyme acceptability.
This may apply to all rhyme determinations, including classifying full- and non-rhymes. Furthermore, their ERPs show a different pattern of sensitivity and laterality to rhyme degree compared to non-lyricists. Specifically, at left-hemisphere electrodes, lyricists' responses to half-rhyme and full-rhyme were similar in morphology and amplitude, with non-rhyme responses distinct; at right-hemisphere electrodes, lyricists' responses to half-rhyme and non-rhyme were similar in morphology, with full-rhyme responses distinct (see Figure 5A). This may mean that their left hemisphere is still partly processing phonological information like non-lyricists', while their right hemisphere is playing a role in distinguishing between full-rhymes and half-rhymes on the basis of some feature or set of features on which the two conditions differ. We speculate that such features are musical in nature and related to rhyme phrase structure in rap music, especially rhyme as embedded in a temporally organized sequence. In lyrical improvisation, experts have to learn how to find a word that satisfies all the constraints (e.g., rhyme, rhythm, meaning) on the fly. Though rhymes can be placed anywhere in a musical phrase, typically the final word(s) in a musical phrase should rhyme with some preceding word(s) (for examples, see Alim, 2003), and it is generally considered ill-formed to terminate a phrase without a rhyme. Thus, in triplet rhyme schemes like our stimulus design, when a musical phrase A and a musical phrase B are the constituents (such that A phrases rhyme with A phrases, B phrases rhyme with B phrases, etc.), the more acceptable triplet combinations would be of the form AAA, ABA, or ABB. An AAB form would be unacceptable, because the last word in phrase B has no rhyme correspondent in the preceding context, leaving the phrase unresolved.
This formula resembles one in the harmonic rules of Western tonal music, wherein a perfect cadence should come at the end of a musical phrase, where it is most critical for the overall structure; this is considered one of the most important pillars of Western music syntax (Koelsch, 2011; Tillmann and Bigand, 2004). Violations of this harmonic expectation have been shown to elicit an early right anterior negativity (ERAN) in expert musicians, while being less expressed in laypersons (Brattico et al., 2013; James et al., 2008; Koelsch et al., 1999). Similar to these violations, for lyricists, non-rhymes and unsatisfactory half-rhymes are alike in signaling not only the violation of phonological rules but also the violation of musical syntactic rules (i.e., prolongation) that are related to rhyme phrase structure in lyrical practice. It is also important to note that our triplet sequence was specifically designed to encourage concurrent rhythmic processing, which should in turn induce the proposed phrase-structure processing. Because we did not test another condition in which triplets were presented at temporally random intervals, or one using word pairs instead of triplets to discourage rhythm processing, the current data do not offer an isolated account of rhyme processing per se. Future studies should explore how such concurrent temporal processing would interact with rhyme processing, and how both are related to lyrical expertise. While our N450 responses occur much later than ERAN responses in latency, the P600 component, similarly associated with syntactic violations in both language and music (Patel et al., 1998), might be relevant here in terms of its similarity in topography and latency to the responses observed in our participants, as seen in Figure 5.
In the literature, the P600 typically occurs in central parietal regions, but it can also be elicited frontally, depending on the demands of the stimulus, and it is considered an "index of detection for any anomaly in rule-governed sequences" (Núñez-Peña and Honrubia-Serrano, 2004, p. 130). For example, in the question "What will you said?", use of the verb said as opposed to say represents a failed convergence with the future-tense expectation established by the word will. Similarly, in our rhyme sequences, a failed resolution of the established rhyme context occurs upon encountering a non-rhyme or an unsatisfactory half-rhyme. Regarding the P600 topographic distribution, Kaan and Swaab (2003) suggested that a posteriorly distributed P600 is related to the difficulty of syntactic processing, including internal repair and revision, while a frontally distributed P600 is related to ambiguity resolution and/or increases in discourse-level complexity. Thus, the presence of the centroparietal positivity in both groups could be explained as related to engaging in the detection of targets that participants perceive as anomalous to the established rhyme context. However, because of their habitual practice creating rhymes in real time, lyricists' unique P600-like pattern may reflect that they have neurological networks that are automatically engaged in resolving, or compensating for, the failed resolution of a rhyme context upon hearing it. In other words, while non-lyricists' brains experience the rhyme sequences somewhat passively, fulfilling only the required task, lyricists' brains may be unconsciously or consciously engaged in an active search for (or imagining of) phonetic/acoustic features that would satisfy the current phrase or a hypothetical next phrase.
A related possibility is that, because rhyme should typically coincide with and/or be embedded within a fairly rhythmic phrase structure in rap practice, lyricists' responses to non-rhymes and unsatisfactory half-rhymes reflect a unique sensitivity to the failed convergence of rhyme and rhythm that occurs in these cases. In line with this interpretation, a similar convergence effect between the rhyme and rhythmic aspects of poetry has recently been demonstrated by Obermeier et al. (2016), who found that, in the reception of poetry, metered and rhyming stanzas elicited smaller N400/P600 responses than their nonmetered, nonrhyming, or nonmetered and nonrhyming counterparts. As stated earlier, the rhythmic structure of our stimulus confounds our ability to conclude anything further concerning this interpretation; future work should therefore examine this rhyme-versus-rhythm paradigm, isolating the effects of rhythm specifically through use of an arrhythmic stimulus.

Collectively, the ERP and behavioral data presented here suggest the presence of a dedicated bilateral neural network in lyricists for processing rhyme sequences, jointly solving linguistic and musical task demands. We propose that in the brains of lyricists, musical and linguistic rules may exist not as separate entities but as a hybrid "musico-linguistic" syntax, on the basis of which the system rigorously and actively seeks the resolution of both music and language elements, forms expectations accordingly, detects violations, reconciles conflicts, and moves forward. We propose that this syntax works, in part, in the following way with respect to rhyme. First, in the left hemisphere, words are categorized as either rhyme or non-rhyme, based on whether they have identical vowels.
Full-rhymes and half-rhymes (regardless of the level of phonological/acoustic similarity between their final consonants) are treated as the same, hence ERPs to these conditions are similar. Final consonants for full-rhymes and half-rhymes are treated as a separate grammatical marker that determines whether the word/sequence is musically appropriate. Finally, musical appropriateness is processed in the right hemisphere, where non-rhymes and half-rhymes whose final consonants have marked them as inappropriate endings to the preceding rhyme context are treated as the same, hence ERPs to these conditions are similar. Essentially, this syntax is evidenced by "groupings" of stimuli: rhyming vs. non-rhyming in the left hemisphere, and musically appropriate vs. inappropriate in the right hemisphere. The responses of non-lyricists help illuminate the mechanisms of this proposed syntax. None of the full-rhymes, half-rhymes, or non-rhymes are phonologically identical to the primes that precede them: full-rhymes differ from primes in their initial consonants, half-rhymes differ in their initial and final consonants, and non-rhymes differ completely. Distinct ERPs to each condition in non-lyricists at left-hemisphere electrodes reflect their perception of this phonological reality. By comparison, the "grouping" response of lyricists reflects not an ignorance of this phonological reality, but its incorporation into a grammar to which non-lyricists are unaccustomed. Likewise, non-lyricists' response at right-hemisphere electrodes, wherein ERPs to all conditions are near-identical, reflects the absence of a grammar for grouping non-rhymes and unsatisfactory half-rhymes together as musically inappropriate, and for processing them as distinct from full-rhyme.
These observations of how lyricists neurologically adapt to the musical and linguistic constraints of their discipline begin to help us understand the neurocognitive underpinnings of musical prolongation as it occurs in poetry, and further our understanding of the relationship between music and language in the brain. Future research should seek to understand how such a system takes shape in lyricists through development, learning, and artistic sophistication.

5. Conclusion

This study sought to understand musical and linguistic integration by investigating auditory rhyme processing in relation to expertise in rap improvisation. ERPs to rhyme judgment tasks demonstrated differences between expert rap improvisers ("lyricists") and novices both in their preparation to make rhyme judgments and in their responses to different degrees of rhyme congruence. Unique ERP responses occurring in lyricists to varying degrees of rhyme congruence and to violations of established rhyme contexts suggest that lyricism may encourage combining linguistic and musical feature processing in the brain, enabling lyricists to process the non-musical phonological aspects of a rhyme sequence in the left hemisphere, and the integrated phonological and musical phrasing aspects of a rhyme sequence in the right hemisphere. It is noteworthy, however, that in rap practice, rhymes do not occur in isolation as they do in this study. Rhymes, as lexical units embedded within larger semantically meaningful phrase units, serve as anchors about which several creative linguistic phenomena related to "meaning" are organized. Thus, it is of future interest to examine how expertise in lyricism is related to semantic processing, as it has been shown that different types of meaning such as simile, metaphor and analogy activate different brain regions (Riddell, 2016; Shibata et al., 2012).
Importantly, as rap is a musical tradition of African American and Latino American origins that is now practiced in diverse parts of the world by people of various languages and backgrounds, investigation of rap expertise in relation to rhyme and other linguistic and musical features has the potential to inform our understanding of not only integrated musical and linguistic processing in the brain, but also oral cultural tradition and its implications, widely relevant to human cognition and development.

Acknowledgements

This research was supported by the Center for Computer Research in Music and Acoustics and the Department of Music, Stanford University, and the Stanford University Interdisciplinary Graduate Fellowship.

References

Alexander, J.A., Wong, P.C.M., Bradlow, A.R., 2005. Lexical Tone Perception in Musicians and Non-musicians, 397–400.
Ali, S.O., Peynircioğlu, Z.F., 2006. Songs and emotions: are lyrics and melodies equal partners? Psychol. Music 34, 511–534.
Alim, H.S., 2003. On Some Serious Next Millennium Rap Ishhh: Pharoahe Monch, Hip Hop Poetics, and the Internal Rhymes of Internal Affairs. J. English Linguist. 31, 60–84. https://doi.org/10.1177/0075424202250619
Besson, M., Faita, F., Peretz, I., Bonnel, A.-M., Requin, J., 1998. Singing in the Brain: Independence of Lyrics and Tunes. Psychol. Sci. 9, 494–498. https://doi.org/10.1111/1467-9280.00091
Bidelman, G.M., Hutka, S., Moreno, S., 2013. Tone Language Speakers and Musicians Share Enhanced Perceptual and Cognitive Abilities for Musical Pitch: Evidence for Bidirectionality between the Domains of Language and Music. PLoS One 8. https://doi.org/10.1371/journal.pone.0060676
Boersma, P., 2002. Praat, a system for doing phonetics by computer.
Glot Int. 5, 341–345.
Bradley, A., 2017. Book of rhymes: The poetics of hip hop. Civitas Books.
Bradley, E.D., 2012. Tone language experience enhances sensitivity to melodic contour. LSA Annu. Meet.
Brattico, E., Alluri, V., Bogert, B., Jacobsen, T., Vartiainen, N., Nieminen, S., Tervaniemi, M., 2011. A functional MRI study of happy and sad emotions in music with and without lyrics. Front. Psychol. https://doi.org/10.3389/fpsyg.2011.00308
Brattico, E., Jacobsen, T., De Baene, W., Glerean, E., Tervaniemi, M., 2010. Cognitive vs. affective listening modes and judgments of music – an ERP study. Biol. Psychol. https://doi.org/10.1016/j.biopsycho.2010.08.014
Brattico, E., Pearce, M., 2013. The neuroaesthetics of music. Psychol. Aesthetics, Creat. Arts 7, 48–61. https://doi.org/10.1037/a0031624
Brattico, E., Tupala, T., Glerean, E., Tervaniemi, M., 2013. Modulated neural processing of Western harmony in folk musicians. Psychophysiology 50, 653–663. https://doi.org/10.1111/psyp.12049
Cason, N., Schön, D., 2012. Rhythmic priming enhances the phonological processing of speech. Neuropsychologia 50, 2652–2658. https://doi.org/10.1016/j.neuropsychologia.2012.07.018
Chatterjee, A., Vartanian, O., 2014. Neuroaesthetics. Trends Cogn. Sci. https://doi.org/10.1016/j.tics.2014.03.003
Coch, D., Grossi, G., 2002. A developmental investigation of ERP auditory rhyming effects. Dev. … 4, 467–489.
Coch, D., Grossi, G., Skendzel, W., Neville, H., 2005. ERP nonword rhyming effects in children and adults. J. Cogn. Neurosci. 17, 168–182. https://doi.org/10.1162/0898929052880020
Coch, D., Hart, T., Mitra, P., 2008. Three kinds of rhymes: An ERP study. Brain Lang. 104, 230–243. https://doi.org/10.1016/j.bandl.2007.06.003
Cutler, A., Weber, A., Smits, R., Cooper, N., 2004. Patterns of English phoneme confusions by native and non-native listeners. J. Acoust. Soc. Am. 116, 3668–3678.
Dumay, N., Benraïss, A., Barriol, B., Colin, C., Radeau, M., Besson, M., 2001.
Behavioral and electrophysiological study of phonological priming between bisyllabic spoken words. J. Cogn. Neurosci. 13, 121–143. https://doi.org/10.1162/089892901564117
Eagan, D.E., Chein, J.M., 2012. Overlap of phonetic features as a determinant of the between-stream phonological similarity effect. J. Exp. Psychol. Learn. Mem. Cogn. 38, 473–481. https://doi.org/10.1037/a0025368
Ettlinger, M., Margulis, E.H., Wong, P.C.M., 2011. Implicit memory in music and language. Front. Psychol. 2, 1–10. https://doi.org/10.3389/fpsyg.2011.00211
Falkenstein, M., Hoormann, J., Hohnsbein, J., Kleinsorge, T., 2003. Short-term mobilization of processing resources is revealed in the event-related potential. Psychophysiology 40, 914–923. https://doi.org/10.1111/1469-8986.00109
Gaillard, A.W.K., 1977. The Late CNV Wave: Preparation Versus Expectancy. Psychophysiology 14, 563–568. https://doi.org/10.1111/j.1469-8986.1977.tb01200.x
Gordon, R.L., Schön, D., Magne, C., Astésano, C., Besson, M., 2010. Words and melody are intertwined in perception of sung words: EEG and behavioral evidence. PLoS One 5. https://doi.org/10.1371/journal.pone.0009889
Grossi, G., Coch, D., Coffey-Corina, S., Holcomb, P.J., Neville, H.J., 2001. Phonological processing in visual rhyming: a developmental ERP study. J. Cogn. Neurosci. 13, 610–625. https://doi.org/10.1162/089892901750363190
Jackendoff, R., 2009. Parallels and Nonparallels between Language and Music. Music Percept. An Interdiscip. J. https://doi.org/10.1525/mp.2009.26.3.195
James, C.E., Britz, J., Vuilleumier, P., Hauert, C.A., Michel, C.M., 2008. Early neuronal responses in right limbic structures mediate harmony incongruity processing in musical experts. Neuroimage 42, 1597–1608. https://doi.org/10.1016/j.neuroimage.2008.06.025
Johnsen, S.S., 2011. Rhyme acceptability determined by perceived similarity.
Kaan, E., Swaab, T.Y., 2003.
Repair, revision, and complexity in syntactic analysis: An electrophysiological differentiation. J. Cogn. Neurosci. 15, 98–110. https://doi.org/10.1162/089892903321107855
Kawahara, S., 2007. Half rhymes in Japanese rap lyrics and knowledge of similarity. J. East Asian Ling. 16, 113–144. https://doi.org/10.1007/s10831-007-9009-1
Keil, A., Debener, S., Gratton, G., Junghöfer, M., Kappenman, E.S., Luck, S.J., Luu, P., Miller, G.A., Yee, C.M., 2014. Committee report: publication guidelines and recommendations for studies using electroencephalography and magnetoencephalography. Psychophysiology 51, 1–21.
Kiparsky, P., 1973. The role of linguistics in a theory of poetry. Daedalus 231–244.
Koelsch, S., 2011. Toward a neural basis of music perception – a review and updated model. Front. Psychol. 2, 110. https://doi.org/10.3389/fpsyg.2011.00110
Koelsch, S., Gunter, T.C., Wittfoth, M., Sammler, D., 2005. Interaction between syntax processing in language and in music: an ERP study. J. Cogn. Neurosci. 17, 1565–1577. https://doi.org/10.1162/089892905774597290
Koelsch, S., Schröger, E., Tervaniemi, M., 1999. Superior pre-attentive auditory processing in musicians. Neuroreport 10, 1309–1313.
Kutas, M., Federmeier, K.D., 2011. Thirty years and counting: finding meaning in the N400 component of the event-related brain potential (ERP). Annu. Rev. Psychol. 62, 621–647.
Ladefoged, P., Johnson, K., 2011. A course in phonetics, 6th ed. CengageBrain.com.
Lerdahl, F., 2001. The sounds of poetry viewed as music. Ann. N. Y. Acad. Sci. 930, 337–354.
Limb, C.J., Braun, A.R., 2008. Neural substrates of spontaneous musical performance: an fMRI study of jazz improvisation. PLoS One 3, e1679. https://doi.org/10.1371/journal.pone.0001679
Liu, S., Chow, H.M., Xu, Y., Erkkinen, M.G., Swett, K.E., Eagle, M.W., Rizik-Baer, D.A., Braun, A.R., 2012. Neural correlates of lyrical improvisation: an fMRI study of freestyle rap. Sci. Rep. 2, 834. https://doi.org/10.1038/srep00834
Macar, F., Besson, M., 1985.
Contingent negative variation in processes of expectancy, motor preparation and time estimation. Biol. Psychol. 21, 293–307. https://doi.org/10.1016/0301-0511(85)90184-X
Macar, F., Franck, V., 2004. Event-Related Potentials as Indices of Time Processing: A Review. J. Psychophysiol. 18, 130–139. https://doi.org/10.1027/0269-8803.18.2
Magne, C., Schön, D., Besson, M., 2006. Musician children detect pitch violations in both music and language better than nonmusician children: behavioral and electrophysiological approaches. J. Cogn. Neurosci. 18, 199–211. https://doi.org/10.1162/jocn.2006.18.2.199
Müller, M., Höfel, L., Brattico, E., Jacobsen, T., 2010. Aesthetic judgments of music in experts and laypersons – an ERP study. Int. J. Psychophysiol. 76, 40–51. https://doi.org/10.1016/j.ijpsycho.2010.02.002
Núñez-Peña, M.I., Honrubia-Serrano, M.L., 2004. P600 related to rule violation in an arithmetic task. Cogn. Brain Res. 18, 130–141. https://doi.org/10.1016/j.cogbrainres.2003.09.010
Obermeier, C., Kotz, S.A., Jessen, S., Raettig, T., von Koppenfels, M., Menninghaus, W., 2016. Aesthetic appreciation of poetry correlates with ease of processing in event-related potentials. Cogn. Affect. Behav. Neurosci. 16, 362–373. https://doi.org/10.3758/s13415-015-0396-x
Patel, A.D., Gibson, E., Ratner, J., Besson, M., Holcomb, P.J., 1998. Processing Syntactic Relations in Language and Music: An Event-Related Potential Study. J. Cogn. Neurosci. 10, 717–733. https://doi.org/10.1162/089892998563121
Peretz, I., Coltheart, M., 2003. Modularity of music processing. Nat. Neurosci. 6, 688–691. https://doi.org/10.1038/nn1083
Perrin, F., García-Larrea, L., 2003. Modulation of the N400 potential during auditory phonological/semantic interaction. Brain Res. Cogn. Brain Res. 17, 36–47.
Poulin-Charronnat, B., Bigand, E., Madurell, F., Peereman, R., 2005. Musical structure modulates semantic priming in vocal music. Cognition 94.
https://doi.org/10.1016/j.cognition.2004.05.003
Praamstra, P., Meyer, A.S., Levelt, W.J.M., 1994. Neurophysiological Manifestations of Phonological Processing: Latency Variation of a Negative ERP Component Timelocked to Phonological Mismatch. J. Cogn. Neurosci. 6, 204–219. https://doi.org/10.1162/jocn.1994.6.3.204
Praamstra, P., Stegeman, D.F., 1993. Phonological effects on the auditory N400 event-related brain potential. Cogn. Brain Res. 1, 73–86. https://doi.org/10.1016/0926-6410(93)90013-U
Riddell, P., 2016. Metaphor, simile, analogy and the brain. Chang. English 23, 363–374.
Rohrmeier, M., Rebuschat, P., 2012. Implicit Learning and Acquisition of Music. Top. Cogn. Sci. https://doi.org/10.1111/j.1756-8765.2012.01223.x
Ruchkin, D.S., Sutton, S., Mahaffey, D., Glaser, J., 1986. Terminal CNV in the absence of motor response. Electroencephalogr. Clin. Neurophysiol. 63, 445–463. https://doi.org/10.1016/0013-4694(86)90127-6
Rugg, M.D., 1984. Event-related potentials and the phonological processing of words and non-words. Neuropsychologia 22, 435–443. https://doi.org/10.1016/0028-3932(84)90038-1
Schenker, H., 1979. Free Composition: Vol 3 of New Musical Theories and Fantasies.
Schön, D., François, C., 2011. Musical Expertise and Statistical Learning of Musical and Linguistic Structures. Front. Psychol. 2, 1–9. https://doi.org/10.3389/fpsyg.2011.00167
Shibata, M., Toyomura, A., Motoyama, H., Itoh, H., Kawabata, Y., Abe, J., 2012. Does simile comprehension differ from metaphor comprehension? A functional MRI study. Brain Lang. 121, 254–260. https://doi.org/10.1016/j.bandl.2012.03.006
Stahl, B., Henseler, I., Turner, R., Geyer, S., Kotz, S.A., 2013. How to engage the right brain hemisphere in aphasics without even singing: evidence for two paths of speech recovery. Front. Hum. Neurosci. 7, 35. https://doi.org/10.3389/fnhum.2013.00035
Stahl, B., Kotz, S.A., Henseler, I., Turner, R., Geyer, S., 2011. Rhythm in disguise: why singing may not hold the key to recovery from aphasia. Brain 134, 3083–3093. https://doi.org/10.1093/brain/awr240
Steinbeis, N., Koelsch, S., 2008. Shared neural resources between music and language indicate semantic processing of musical tension-resolution patterns. Cereb. Cortex 18, 1169–1178. https://doi.org/10.1093/cercor/bhm149
Steriade, D., 2003. Knowledge of perceptual similarity and its phonological uses: evidence from half-rhymes, 363–366.
Tadel, F., Baillet, S., Mosher, J.C., Pantazis, D., Leahy, R.M., 2011. Brainstorm: a user-friendly application for MEG/EEG analysis. Comput. Intell. Neurosci. 2011, 8.
Tecce, J., 1972. Contingent Negative Variation (CNV) and Psychological Processes in Man. Psychol. Bull. 77.
Tillmann, B., Bigand, E., 2004. The relative importance of local and global structures in music perception. J. Aesthet. Art Crit. 62, 211–222. https://doi.org/10.1111/j.1540-594X.2004.00153.x
van Boxtel, G.J.M., Brunia, C.H.M., 1994. Motor and non-motor aspects of slow brain potentials. Biol. Psychol. 38, 37–51. https://doi.org/10.1016/0301-0511(94)90048-5
van Rijn, H., Kononowicz, T.W., Meck, W.H., Ng, K.K., Penney, T.B., 2011. Contingent negative variation and its relation to time estimation: a theoretical evaluation. Front. Integr. Neurosci. 5, 1–5. https://doi.org/10.3389/fnint.2011.00091
Vuust, P., Brattico, E., Seppänen, M., Näätänen, R., Tervaniemi, M., 2012. The sound of music: Differentiating musicians using a fast, musical multi-feature mismatch negativity paradigm. Neuropsychologia 50, 1432–1443. https://doi.org/10.1016/j.neuropsychologia.2012.02.028
Wagensveld, B., Segers, E., Van Alphen, V.A., Hagoort, P., Verhoeven, L., 2012. A neurocognitive perspective on rhyme awareness: The N450 rhyme effect. Brain Res. 1483, 63–70. https://doi.org/10.1016/j.brainres.2012.09.018
Wagner, R.K., Torgesen, J.K., Rashotte, C.A., 1999. Comprehensive test of phonological processing: CTOPP.
ASHA.
Walter, W., Cooper, R., Aldridge, V.J., McCallum, W.C., Winter, A.L., 1964. Contingent negative variation: an electric sign of sensori-motor association and expectancy in the human brain. Nature 203, 380–384.
Wild-Wall, N., Hohnsbein, J., Falkenstein, M., 2007. Effects of ageing on cognitive task preparation as reflected by event-related potentials. Clin. Neurophysiol. 118, 558–569. https://doi.org/10.1016/j.clinph.2006.09.005
Yoncheva, Y.N., Maurer, U., Zevin, J.D., McCandliss, B.D., 2013. Effects of rhyme and spelling patterns on auditory word ERPs depend on selective attention to phonology. Brain Lang. 124, 238–243. https://doi.org/10.1016/j.bandl.2012.12.013

Figure captions

Figure 1. Illustration of rhyme-creating options available to freestyle lyricists during improvisation. In this example, the freestyle lyricist has uttered Phrase 1 (a one-quarter-beat pick-up, followed by a measure containing four quarter beats, identical in 1A and 1B), and the rhyme correspondence created in Phrase 2 (another measure of four quarter beats) is optionally satisfied in 2A or 2B.

Figure 2. Stimulus design. Three conditions representing three levels of rhyme congruence.

Figure 3. Trial structure. Visual stimuli are viewed by participants at the start and end of each trial. A cross marks the beginning of the trial, a blank screen the end. The cue appears after the cross and instructs participants how to respond to the auditory stimulus. If 'perfect?' or 'like?' is cued, participants respond, respectively, to whether the auditory stimulus is a 'perfect' rhyme or a rhyme that they 'like'. Prime-1, Prime-2 and Target correspond to three pseudo-words contained in a given stimulus. Prime-1 and Prime-2 always rhyme. Upon hearing the Target, which may or may not rhyme with the Primes, participants respond to the task indicated in the cue. After a randomized 2000–3000 ms period following the
After a randomized 2000-3000ms period following the D appearance of the blank screen, the next trial begins. TE Figure 4. Grand averaged ERP waveforms comparing response to “like” vs. “perfect” tasks for Lyricists (A), and Nonlyricists (B) at frontocentral midline (fcm) and parietal midline (pm) electrode clusters. Time windows of interest are EP highlighted in grey: 800-1200ms, between first and second pseudo-words, which always rhymed, and 2000-2400ms, between the second and third pseudo-words, after which a rhyme judgment is required. Grand averaged topographic maps AC C of response to “perfect” task, “like” task, “like” – “perfect” task difference for Lyricists (C) and Non-Lyricists (D). Figure 5. Grand averaged ERP waveforms to full-, half- and non-rhyming targets for Lyricists (A) and Non-lyricists (B). Grand averaged topographic maps for responses to each stimulus condition and rhyming effects for condition contrasts during the time-period 438-538ms (100ms around grand-average peak of 488ms) for Lyricists (C) and Non-lyricists (D). 29/31 ACCEPTED MANUSCRIPT Cross & Fujioka, Auditory rhyme processing in freestyle rap Tables Phonological processing test results for Lyricists and Non-lyricists CTOPP Alt. 
Phonological Awareness 116 8.8 111 15.0 Lyricists Mean SD Non-lyricists Mean SD M AN US C CTOPP Phonological Memory RI PT Table 1 124 11.1 CTOPP Rapid Symbolic Naming 116 15.4 Table 2 104 25.0 109 10.2 Response Times (ms) across Tasks, Conditions, and Responses No 1303 351 1022 207 1226 301 1159 237 994 277 Non-lyricists 1200 253 1018 192 1170 392 Full-rhyme “Like” Task Half-rhyme Non-Rhyme No Yes No Yes No Yes No Yes 952 239 1084 302 1132 225 1021 205 1139 179 1212 301 971 242 1160 267 830 180 1090 250 1167 245 987 220 1046 207 1167 291 886 164 1143 228 AC C Mean SD Yes Non-Rhyme D Lyricists Mean SD Yes EP No “Perfect” Task Half-rhyme TE Full-rhyme Table 3 Rate of “yes” response (%) across Tasks and Conditions "Perfect" Task Lyricists Mean SD Non-Lyricists Mean 30/31 "Like" Task full-rhyme half-rhyme non-rhyme full-rhyme half-rhyme non-rhyme 95.3 6.2 41.2 29.7 1.0 1.4 97.0 2.7 59.3 25.7 1.9 2.1 94.6 20.3 0.4 90.9 40.3 3.6 ACCEPTED MANUSCRIPT Cross & Fujioka, Auditory rhyme processing in freestyle rap 3.1 15.0 0.6 15.8 19.2 AC C EP TE D M AN US C RI PT SD 31/31 7.0 AC C EP TE D M AN U SC RI PT ACCEPTED MANUSCRIPT AC C EP TE D M AN U SC RI PT ACCEPTED MANUSCRIPT AC C EP TE D M AN U SC RI PT ACCEPTED MANUSCRIPT AC C EP TE D M AN U SC RI PT ACCEPTED MANUSCRIPT AC C EP TE D M AN U SC RI PT ACCEPTED MANUSCRIPT ACCEPTED MANUSCRIPT Auditory Rhyme Processing in Expert Freestyle Rap Lyricists and Novices: An ERP Study Keith Cross, Takako Fujioka Highlights EP TE D M AN US C RI PT Expert rappers and laypersons differ in ERP brain response to rhyme judgment tasks Aesthetic vs descriptive rhyme judgment task effects in laypersons, but not experts Groups differ in ERP sensitivity and laterality to degrees of rhyme congruence Evidence of integrated linguistic and musical feature processing in rappers’ brains AC C • • • • ACCEPTED MANUSCRIPT CRediT author statement Keith Cross: Conceptualization, Methodology, Data Curation, Writing- Original draft preparation, Visualization, 
Investigation, Writing - Reviewing and Editing.

Takako Fujioka: Conceptualization, Methodology, Data Curation, Software, Visualization, Investigation, Writing - Reviewing and Editing.