This chapter presents an experimental study of consecutive interpreting which investigates whether judged fluency can be predicted from computer-based quantitative prosodic measures, including temporal and melodic measures. Ten raters judged six criteria of accuracy and fluency in two consecutive interpretations of the same recorded source speech, from Chinese ‘A’ into English ‘B’, by twelve trainee interpreters (seven undergraduates, five postgraduates). The recorded interpretations were examined with the speech analysis tool Praat. From a computerized count of the pauses thus detected, together with disfluencies identified by raters, twelve temporal measures of fluency were calculated. In addition, two melodic measures, i.e., pitch level and pitch range, were automatically generated. These two measures are often considered to be associated with speaking confidence and competence. Statistical analysis shows: (a) strong correlations between judged fluency and temporal variables of fluency; (b) no correlation between pitch range and judged fluency, but a moderate (negative) correlation between pitch level and judged fluency; and (c) the usefulness of effective speech rate (number of syllables, excluding disfluencies, divided by the total duration of speech production and pauses) as a predictor of judged fluency. Other important determinants of judged fluency were the number of filled pauses, articulation rate, and mean length of pause. The potential for developing automatic fluency assessment in consecutive interpreting is discussed, as are implications for informing the design of rubrics of fluency assessment and facilitating formative assessment in interpreting education.
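The definition of effective speech rate given above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the authors' procedure: the function names, the parameter names and the example figures are invented, and the syllable counts and durations are assumed to have been obtained beforehand (for instance from a Praat analysis).

```python
def effective_speech_rate(n_syllables, n_disfluent_syllables,
                          speech_time_s, pause_time_s):
    """Syllables excluding disfluencies, divided by the total duration
    of speech production and pauses (in syllables per second)."""
    total_time_s = speech_time_s + pause_time_s
    if total_time_s <= 0:
        raise ValueError("total duration must be positive")
    return (n_syllables - n_disfluent_syllables) / total_time_s


def mean_length_of_pause(pause_durations_s):
    """Average duration of the detected pauses (in seconds)."""
    if not pause_durations_s:
        raise ValueError("no pauses given")
    return sum(pause_durations_s) / len(pause_durations_s)


# Invented example: 300 syllables, 20 of them in disfluencies,
# 90 s of speech production and 30 s of pauses.
rate = effective_speech_rate(300, 20, 90.0, 30.0)
print(f"{rate:.2f} syllables per second")
```

Because disfluent syllables are removed from the numerator while pause time stays in the denominator, both hesitations and long pauses lower the measure.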
In this paper we argue that a comparison of vowel systems of L1 and L2 should not be limited to measuring formants and vowel duration in speech production but should also include a contrastive study of the perceptual representations of the vowel systems entertained by native and non-native users of the target language. An incorrect perceptual representation of the target sounds often lies at the heart of pronunciation difficulties of L2 speakers. To facilitate such perceptual research, the present paper offers a universal vowel space in which 43 artificial sounds are sampled at perceptually equidistant steps along the dimensions of vowel height (7 steps) and backness/lip rounding (9 steps). Duration can be added as an additional variable in as many steps as required by the researcher. The facility was provisionally tested in a study of the perceptual representation of the monophthongs of American English by American native listeners and by Persian learners of English. Several ways of analyzing the results of such a study are presented. The results show that native listeners distinguish tense and lax members of vowel pairs in English primarily by differences in vowel quality, while the Persian L2 listeners use vowel duration as the primary cue and largely ignore the quality cue.
This study investigates the extent to which word stress facilitates word disambiguation in Papuan Malay. Although there is consistent acoustic support for word stress patterns in this language, the function of word stress in Indonesian languages, including Papuan Malay, has been disputed in several studies. Based on a word list of phonetically transcribed Papuan Malay words, an analysis of word embeddings was carried out. The number of words that are embedded in other words was shown to explain the role of word stress in the word recognition processes cross-linguistically. The results of the lexical analysis indicate that Papuan Malay is somewhat similar to English, a language where word stress differences are mainly signalled by vowel quality and to a lesser extent by suprasegmental cues. The results are discussed within the context of cross-linguistic cues to word stress and shed new light on the controversy concerning word stress in Indonesian languages.
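Counting words that are embedded in other words can be illustrated with a short Python sketch. It is a simplified stand-in rather than the study's method: it assumes that an embedding is any shorter lexicon entry occurring as a contiguous substring at any position in a longer one, and it uses a toy list of English spellings in place of the phonetically transcribed Papuan Malay word list. The function name `count_embeddings` is hypothetical.

```python
def count_embeddings(lexicon):
    """For each word, count how many other lexicon entries occur
    inside it as a contiguous substring (assumed definition)."""
    words = set(lexicon)
    return {
        carrier: sum(1 for w in words if w != carrier and w in carrier)
        for carrier in words
    }


# Toy lexicon (orthographic placeholders, not Papuan Malay data).
toy_lexicon = ["star", "start", "tar", "art", "car"]
counts = count_embeddings(toy_lexicon)
for word in sorted(counts):
    print(word, counts[word])
```

A real analysis would operate on segment sequences rather than spellings, and might restrict embeddings to particular positions or take stress marking into account; those choices are left open here.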
Identifying speakers by their spoken output is a specialist task for forensic investigators. In the present study we focused on cross-linguistic speaker (Chinese, English, Dutch) identification based on (components of) the English stops /p, b, t, d, k, g/ and the fricatives /f, v, θ, ð, s, z, ʃ, ʒ/. The contribution of English noise bursts to native language identification is presented, and the tokens which contribute the most are analyzed.
Sloos, M.; Dijkstra, J.; Heuven, V.J.J.P. van 2019
West-Frisian has a highly frequent suffix -/ən/ in which the schwa is usually deleted. This results in a single nasal which is analysed as ‘syllabic’, at least after obstruents. However, it is unclear what happens if schwa deletion occurs after a stem-final nasal as in hûn-en ‘dog.PL’. We consider several options, including nasal deletion, nasal contraction, and gemination. We compare the duration of an underlyingly single nasal in stem-final position with that of the nasal after schwa deletion in -/nən/ as in hûn ~ hûnen. The results reveal that the nasal in hûnen after schwa deletion is more than twice as long as in hûn and also longer than after schwa deletion in -/tən/. This suggests that the nasal is geminated. We discuss the status of this nasal in light of the fact that gemination has not been reported elsewhere in the phonology of West-Frisian.
Automatic identification of a speaker’s native language background may have forensic applications. This paper explores the feasibility of automatic identification of the native language background of a foreign speaker of English, using phonetically interpretable measurements. The production of the ten monophthongs of (American) English by Dutch, Mandarin Chinese and American speakers was used as a test case. Vowel formants F1 (corresponding to articulatory vowel height), F2 (capturing vowel backness and lip rounding) and vowel duration were extracted. Clearly different duration and patterning of the vowels in the vowel space were seen. Automatic classification of the speaker’s native language was 90 percent correct when all acoustic parameters were used as predictors. Language identification was slightly poorer when only formant data were used (85% correct) and substantially poorer – but much better than chance – when only vowel duration was used (60% correct). We conclude that vowel duration provides a weaker cue to foreign-accent identification in English than the spectral properties but that the combination of both information sources yields the best results.
The present study applied functional partition to investigate disyllabic lexical tonal pattern categories in an underresourced Chinese dialect, Jinan Mandarin. A two-stage partitioning procedure was introduced to process a multi-speaker corpus that contains irregular lexical variants in a semi-automatic way. In the first stage, a program provides suggestions for the phonetician to decide the lexical tonal variants for the recordings of each word, based on the result of a functional k-means partitioning algorithm and tonal information from an available pronunciation dictionary of a related Chinese dialect, i.e. Standard Chinese. The second stage iterates a functional version of k-means partitioning with silhouette-based criteria to abstract an optimal number of tonal patterns from the whole corpus, which also allows the phoneticians to adjust the results of the automatic procedure in a controlled way and so redo partitioning for a subset of clusters. The procedure yielded eleven disyllabic tonal patterns for Jinan Mandarin, representing the tonal system used by contemporary Jinan Mandarin speakers from a wide range of age groups. The procedure used in this paper is different from previous linguistic descriptions, which were based on more elderly speakers' pronunciations. This method incorporates phoneticians' linguistic knowledge and preliminary linguistic resources into the procedure of partitioning. It can improve the efficiency and objectivity in the investigation of lexical tonal pattern categories when building pronunciation dictionaries for underresourced languages.
Restrictive and appositive relative clauses differ in their meaning and structure. The former restrict the class to which the antecedent refers, whereas the latter denote additional information on the antecedent. In terms of structure, this difference concerns the relation between antecedent and relative clause, which is either narrow (restrictives) or loose (appositives). How these relations are encoded in prosody is the topic of investigation. Although there is considerable agreement on what prosodic cues distinguish restrictives and appositives across languages, claims mainly come from prescriptive literature. The current study investigates the structure-prosody interface experimentally by means of perception tests for Dutch and German. Results indicate that these languages differ in how prosody signals structural cohesion or breaking.
We measured F1, F2 and duration of ten English monophthongs produced by American native speakers and by Danish, Norwegian, Swedish, Dutch, Hungarian and Chinese L2 speakers. We hypothesized that (i) L2 speakers would approximate the English vowels more closely as the phonological distance between the L2 and English is smaller, and (ii) English vowels of L2 speaker groups would be more similar as the L2s are closer to one another. Comparison of acoustic vowel diagrams and Linear Discriminant Analyses (LDA) confirm the hypotheses, with one exception: Dutch speakers deviate more from L1 English than the Scandinavian groups. The Interlanguage Speech Intelligibility Benefit was convincingly simulated by the LDA.
There is increasing evidence that non-native speech is more readily understood by listeners who share the native-language background with the speakers. Mandarin-accented English can be expected to be better understood by Mandarin listeners than by American native listeners. The most likely reason for the effect would be that the non-native listeners fruitfully use their (intuitive) knowledge of the interfering source language (Mandarin) to classify the sounds as intended by the speaker (Cutler 2012). This phenomenon has been called the Interlanguage Speech Intelligibility Benefit (or ISIB) in its weak version (Bent & Bradlow 2003). There is also a strong version of the ISIB hypothesis which states that any non-native speaker of a language will be more intelligible to any non-native listener, simply because foreigners tend to speak more carefully and slowly than native speakers of the target language. I will draw on several published intelligibility studies, in which speakers and listeners from a wide variety of native-language backgrounds (including L1 English speakers and listeners) communicate with one another in English (Smith & Rafiqzad 1979, Bent & Bradlow 2003, Wang 2007, Van Heuven & Wang 2007, Wang & Van Heuven 2014), to assess the validity of the ISIB claim. I will show that the ISIB effect is found only occasionally and inconsistently when it is quantified in an absolute way. Generally, native listeners of the target language outperform any L2 listener, even when the L2 listener has the same mother tongue as the L2 speaker.
However, if we quantify the ISIB in a relative manner, where R-ISIB is defined as the discrepancy between the actual intelligibility and the score predicted from linear addition of main effects of speaker and listener language background, the notion of interlanguage benefit begins to make more sense. It then appears that the combination of a speaker and listener who do not share the same native language suffers from a negative R-ISIB (even if one interlocutor is a native speaker of the vehicle of communication), but that any combination of speakers and listeners sharing the same mother tongue (whether L1 or L2 speakers of the vehicle of communication) shows a consistently positive R-ISIB.
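The relative measure can be made concrete with a small Python sketch. It assumes a complete speaker-by-listener table of mean intelligibility scores and takes "linear addition of main effects" to mean the usual additive prediction (speaker mean plus listener mean minus grand mean); the function name `relative_isib` and the four scores are invented for illustration and are not taken from any of the cited studies.

```python
from statistics import mean


def relative_isib(scores):
    """Observed score minus the additive prediction for every
    (speaker group, listener group) cell of a complete table."""
    speakers = sorted({s for s, _ in scores})
    listeners = sorted({l for _, l in scores})
    grand = mean(scores.values())
    speaker_mean = {s: mean(scores[(s, l)] for l in listeners) for s in speakers}
    listener_mean = {l: mean(scores[(s, l)] for s in speakers) for l in listeners}
    return {
        (s, l): scores[(s, l)] - (speaker_mean[s] + listener_mean[l] - grand)
        for s in speakers
        for l in listeners
    }


# Invented percent-correct scores: (speaker L1, listener L1).
scores = {
    ("Mandarin", "Mandarin"): 55,
    ("Mandarin", "American"): 60,
    ("American", "Mandarin"): 70,
    ("American", "American"): 95,
}
for pair, value in relative_isib(scores).items():
    print(pair, value)
```

In this made-up table the American listeners outscore the Mandarin listeners on Mandarin-accented speech (60 against 55), so there is no benefit in absolute terms, yet both matched pairs come out with a positive residual and both mismatched pairs with a negative one.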
Michaux, M.; Caspers, J.; Heuven, V.J.J.P. van; Hiligsmann, P. 2015
This tutorial-like presentation provides a survey of acoustical correlates of word and sentence stress, with emphasis on Germanic languages such as Dutch and English. It also reviews what is known about the perceptual cue value of the acoustic correlates of stress, and shows that highly reliable correlates are not necessarily strong perceptual cues, and conversely that the strongest perceptual cue (pitch change) is an unreliable correlate.