The acoustic-phonetic characteristics of speech sounds are influenced by their linguistic position in an utterance. Because of acoustic-phonetic differences between different speech sounds, sounds vary in the amount of speaker information they contain. However, do spectral and durational differences between various realizations of the same sound that were sampled from different linguistic positions also impact speaker information? We investigated speaker discrimination in [-focus] versus [+focus] word realizations. Twenty-one Dutch listeners participated in a same-different speaker discrimination task, using stimuli varying in focus, vowel ([a.], [u]), and word context ([._k], [v_t]), spoken by 11 different speakers. Results show that an effect of focus on speaker-dependent information was present, but limited to words containing [u]. Moreover, performance on [u] words was influenced by (interactions of) word context and trial type (same- vs. different-speaker). Context-dependent changes in a speech sound’s acoustics may affect its speaker-dependent information, albeit under specific conditions only.
Filled pauses are widely considered as a relatively consistent feature of an individual’s speech. However, acoustic consistency has only been observed within single-session recordings. By comparing filled pauses in two recordings made >2.5 years apart, this study investigates within-speaker consistency of the vowels in the filled pauses uh and um, in both first language (L1) Dutch and second language (L2) English, produced by student speakers who are known to converge in other speech features. Results show that despite minor within-speaker differences between languages, the spectral characteristics of filled pauses in L1 and L2 remained stable over time.
In Moroccan Dutch, /s/ has been claimed to be pronounced as retracted [s] (towards /ʃ/) in certain consonant clusters. Recently, retracted s-pronunciation has also been attested in endogenous Dutch. We tested empirically whether Moroccan Dutch [s] is indeed more retracted than endogenous Dutch [s] in relevant clusters. Additionally, we tested whether the inter-speaker variation of /s/ is smaller between Moroccan Dutch speakers than between endogenous Dutch speakers, as would be expected if retraction of /s/ were used as an identity marker in in-group conversations in Moroccan Dutch. The [s] realizations of 21 young, male Moroccan Dutch and 21 endogenous Dutch speakers were analyzed. Analyses of the spectral centre of gravity (CoG) show that both groups of speakers had more retracted pronunciations of [s] in typically retracting contexts than in typically non-retracting contexts. However, Moroccan Dutch speakers had higher CoG in both contexts than endogenous Dutch speakers, refuting the stronger retraction expected in Moroccan Dutch speakers. The inter-speaker variation was larger between Moroccan Dutch speakers than between endogenous Dutch speakers, refuting the expected usage of /s/ retraction as a group identity marker.
In forensic speech science, nasals are often reported to be particularly useful in characterizing speakers because of their low within-speaker and high between-speaker variability. However, empirical acoustic data from nasal consonants indicate that there is a somewhat larger role for the oral cavity on nasal consonant acoustics than is generally predicted by acoustic models. For example, in read speech, nasal consonant acoustics show lingual coarticulation that differs by nasal consonant, and syllabic position also seems to affect realizations of nasal consonants within speakers. In the current exploratory study, the within- and between-speaker variation in the most frequent nasals in Standard Dutch, /n/ and /m/, was investigated. Using 3,695 [n] and 3,291 [m] tokens sampled from 54 speakers’ spontaneous telephone utterances, linear mixed-effects modelling of acoustic-phonetic features showed effects of phonetic context that differed by nasal consonant and by syllabic position. A subsequent speaker-classification test using multinomial logistic regression on the acoustic-phonetic features seems to indicate that nasals displaying larger effects of phonetic context also perform slightly better in speaker classification, although differences were minor. This might be caused by between-speaker variation in the degree and timing of lingual coarticulatory gestures.
Fluency in terms of speed of speech and (lack of) hesitations such as silent and filled pauses (‘uhm’s) is part of oral proficiency. Language assessment rubrics therefore include aspects of fluency. Measuring fluency, however, is highly time-consuming because of the manual labour involved. The current paper aims to automatically measure aspects of L2 fluency, including filled pauses, in both Dutch and English. A revised existing script and a new script for filled pauses are tested on accuracy. We also gauged whether the outcomes of the new script could be used for language assessment purposes by relating the outcomes to human judgements. Without further investigations, the current script should not (yet) be used for the purpose of assessing fluency automatically in (high-stakes) oral proficiency assessment. However, the performance of the scripts for measuring aspects of fluency globally and quickly is promising, especially given their stability in accuracy on new corpora.
It has been claimed that filled pauses are transferred from the first (L1) into the second language (L2), suggesting that they are not directly learned by L2 speakers. This would make them usable for cross-linguistic forensic speaker comparisons. However, under the alternative hypothesis that vowels in the L2 are learnable, L2 speakers adapt their pronunciation. This study investigated whether individuals remain consistent in their filled pause realization across languages, by comparing filled pauses (uh, um) in L1 Dutch and L2 English by 58 females. Next to the effect of language, effects of the filled pauses' position in the utterance were considered, as these are expected to affect acoustics and also relate to fluency. Mixed-effects models showed that, whereas duration and fundamental frequency remained similar across languages, vowel realization was language-dependent. Speakers used um relatively more often in English than Dutch, whereas previous research described speakers to be consistent in their um:uh ratio across languages. Results furthermore showed that filled-pause acoustics in the L1 and L2 depend on the position in the utterance. Because filled pause realization is partially adapted to the L2, their use as a feature for cross-linguistic forensic speaker comparisons may be restricted.
Linguistic structure co-determines how a speech sound is produced. This study therefore investigated whether the speaker-dependent information in the vowel [aː] varies when uttered in different word classes. From two spontaneous speech corpora, [aː] tokens were sampled and annotated for word class (content, function word). This was done for 50 male adult speakers of Standard Dutch in face-to-face speech (N = 3,128 tokens), and another 50 male adult speakers in telephone speech (N = 3,136 tokens). First, the effect of word class on various acoustic variables in spontaneous speech was tested. Results showed that [aː]s were shorter and more centralized in function than content words. Next, tokens were used to assess their speaker-dependent information as a function of word class, by using acoustic-phonetic variables to (a) build speaker classification models, and (b) compute the strength-of-evidence, a technique from forensic phonetics. Speaker-classification performance was somewhat better for content than function words, whereas forensic strength-of-evidence was comparable between the word classes. This seems explained by how these methods weigh between- and within-speaker variation. Because these two sources of variation co-varied in size with word class, acoustic word-class variation is not expected to affect the sampling of tokens in forensic speaker comparisons.
The relative contributions of static and dynamic formant representations to speaker-specificity were investigated in conversational speech and in two vowels varying in inherent spectral change. Using polynomial fits, the contribution of dynamic formant coefficients to speaker-specificity relative to that of the formant intercept was investigated in the diphthongal vowel [ei] taken from English and Dutch conversational speech. The [ei] tokens were sampled from various linguistic contexts and analysed in an LR approach. Results show that formant dynamics contain speaker-specific information in conversational speech even though the high contextual variation seems to reduce its effect relative to that reported by earlier work. Vowels differ in inherent dynamicity and therefore, the added value of dynamic formant information to speaker-specificity was also compared between vowels differing in inherent spectral change. Using Dutch data, the contribution of formant dynamics to speaker-specificity was compared between [ei] and [aː] tokens produced by the same speakers. Formant dynamics in conversational speech only contributed to speaker-specificity in the diphthong [ei], not in the monophthong [aː].
Although previous work has shown that some speech sounds are more speaker-specific than others, not much is known about the speaker information of the same segment in different linguistic contexts. The present study, therefore, investigated whether Dutch fricatives /s/ and /x/ from telephone dialogues contain differential speaker information as a function of syllabic position and labial co-articulation. These linguistic effects, established in earlier work on read broadband speech, were first investigated. Using a corpus of Dutch telephone speech, results showed that the telephone bandwidth captures the expected effects of perseverative and anticipatory labialization for dorsal fricative /x/, for which spectral peaks fall within the telephone band, but not for coronal fricative /s/, for which the spectral peak falls outside the telephone band. Multinomial logistic regression shows that /s/ contains slightly more speaker information than /x/ in telephone speech and that speaker information is distributed across the speech signal in a systematic way; even though differences in classification accuracy were small, codas and tokens with labial neighbors yielded higher scores than onsets and tokens with non-labial neighbors for both /s/ and /x/. These findings indicate that speaker information in the same speech sound is not the same across linguistic contexts.
The phoneme /h/ is absent in French and its acquisition has been described as being difficult for second language learners of Dutch, a language with /h/ in its phoneme inventory. In this study, several factors were examined that may affect the production of /h/ by Belgian-French learners of Dutch. Specifically, the factors included in this exploratory study were (1) L1-to-L2 transfer, (2) semantic contrastiveness, (3) the monitoring of one’s speech, and (4) educational grade. L1-to-L2 transfer was operationalized as the effect of liaison/elision contexts on /h/-production. The expectation was that liaison contexts might transfer and would therefore hinder /h/-production. Semantic contrasts in minimal pairs including an h-initial word were expected to elicit more /h/-productions if that word was contrasted with an empty onset (oor-hoor) than with an onset filled by some other consonant (hand-tand). If a speaker pays more attention to his/her speech in an increased-monitoring task, the speaker is expected to produce /h/ more often, and finally it was expected that increased exposure to Dutch would result in more correct productions. In a cross-sectional study, students from the first, third and sixth grades of secondary education (60 in total, aged between 12 years and 19 years old) took part in two reading-aloud tasks, which were assumed to differ in the degree of speech monitoring they require. The first task was a text, with which L1-to-L2 transfer was assessed, and the second a list of minimal pairs containing h-onsets contrasting with either empty or filled onsets.
Monitoring was assessed by comparing results between reading tasks. Results showed that increased monitoring positively influenced the numbers of [h]s produced, but that L1-to-L2 transfer of liaison/elision contexts did not occur. A small difference between conditions was found, but in the opposite direction. There was large between-learner variability and no performance increase with amount of exposure from first to sixth grade. Overall, performance left much room for improvement relative to native Dutch speakers and to the learners’ teacher. Further research is needed to better understand the development of French-speaking learners’ production of Dutch /h/.
We introduce a targeted language game approach using the visual world, eye-movement paradigm to assess when and how certain intonational contours affect the interpretation of utterances. We created a computer-based card game in which elliptical utterances such as "Got a candy" occurred with a nuclear contour most consistent with a yes-no question (H* H-H%) or a statement (L* L-L%). In Experiment 1 we explored how such contours are integrated online. In Experiment 2 we studied the expectations listeners have for how intonational contours signal intentions: do these reflect linguistic categories or rapid adaptation to the paradigm? Prosody had an immediate effect on interpretation, as indexed by the pattern and timing of fixations. Moreover, the association between different contours and intentions was quite robust in the absence of clear syntactic cues to sentence type, and was not due to rapid adaptation. Prosody had immediate effects on interpretation even though there was a construction-based bias to interpret "got a" as a question. Taken together, we believe this paradigm will provide further insights into how intonational contours and their phonetic realization interact with other cues to sentence type in online comprehension.
Heeren, W.F.L.; Vaerenberg, B.; Coene, M.; Daemers, K.; Govaerts, P.; Avram, A.; ... ; Volpato, F. 2010