Are clang and bank as easy to say for English speakers
with speech production problems as German speakers saying Klang
and Bank?
N. Lollini,
N. Miller, D. Howard,
Background:
This study
investigates factors which influence output accuracy in people with speech
output problems after stroke, in particular in apraxia of speech (AoS), and
phonemic paraphasia (PhPa). It is difficult to
differentiate clearly between AoS and PhPa because the two disorders
share similar symptomatology (e.g.
one-feature substitutions, difficulty initiating speech). The nature of breakdown
in AoS continues to be a subject of debate with regard to the precise location
of impairment within models of speech production. One inroad into
establishing where disruption lies concerns which output variables influence
performance. Our study exploits the fact that English and German share a large
number of (near) homophones in order to determine whether production in
speakers with impaired output is influenced by language-specific or ‘universal’
factors (e.g. factors relating to motor execution). We compared the influence
of the language-specific variables (word frequency, phonotactic predictability,
lexicality and phonological neighbourhood density)
and language-independent factors (number of phonemes, syllables and clusters) on
accuracy of word repetition in German and English speakers.
Numerous factors
have been shown to influence word comprehension/production, including imageability, frequency, age of acquisition and word class.
Two further factors are phonological neighbourhood density (ND) and
probabilistic phonotactics (PROB). These are the
focus of this study. ND is a measure of the extent to which a sequence of
sounds is similar to known real words. Thus the ND of a sequence of sounds
equals the number of real words (neighbours) the target sequence is similar to
in the target’s phonological neighbourhood (cat → mat, hat, pat). Targets
with many neighbours have high ND (e.g. cat); targets with few or no neighbours
have low ND (e.g. elf → elk). PROB deals with the frequency with which a
particular sequence of phonemes occurs in a language. For instance /sp-/ in
word-initial position in English has a high probability, whereas /-sp/ in final
position has a low probability. As a sequence of phonemes, “spade”
is much more likely to occur than “wasp”.
Numerous studies
have found effects of ND and PROB on speech perception (Luce & Pisoni, 1998; Vitevitch &
Luce, 1998, 1999). The aim of this study was to see:
Do ND and PROB
influence speech output in English and German native speakers with speech
programming disorders after stroke?
If so, is the
effect facilitatory or inhibitory, and is the effect in the same direction for
all speakers?
This study exploits the
fact that English and German share a large number of (near) homophones. If
speech output complexity in AoS is determined entirely by motor factors, then
we might expect clang and Klang, bank and Bank to
pose similar problems for English and German
speakers. If production varies significantly across languages then we might
rather suspect that language-specific variables exercise a strong influence on
speech output in this disorder.
We compared the
influence of the language-specific variables word frequency, phonotactic
predictability, lexicality and phonological neighbourhood
density on accuracy of word repetition in German and English speakers.
Methods:
Seven German and seven English
speakers with post-stroke speech output impairment, matched on the English and
German versions of the Aachen Aphasia Test (EAAT; Miller, Willmes,
& de Bleser, 2000), each repeated a list of
stimuli including real and nonsense words. Each list contained 563 real words
that are (near) homophones across German and English (e.g. fiel-feel;
Bank-bank; Wende-vendor). The stimulus list for the
English speakers entailed another 59 nonsense words. Twenty of those nonsense
words were monosyllabic stimuli taken from Bailey and Hahn (2001) whereas the
remaining 39 items were mono- (N=20) and bisyllabic
(N=19) nonwords from Vitevitch
and Luce (1999). The German speakers’ list included 62 nonwords
of which 32 were real words in English and 30 were nonsense words in both
English and German. We examined speakers' accuracy in repeating these items and
compared differences in accuracy to differences in the language-specific
properties derived from the CELEX database (Baayen, Piepenbrock, & Gulikers,
1995).
Recorded
responses were transcribed phonetically. For the purpose of the work reported
here they were coded as right (no perceptually detected errors) or wrong
(perceived error). Errors were noted if there was a perceived addition,
omission, substitution, distortion, distorted substitution, transposition of
sounds, or if a word was preceded by trial and error struggle or intraword or intrasyllable
pauses. Ten per cent of productions were transcribed by a second listener and
re-transcribed by the first transcriber. Inter-rater reliability was good (0.79 – 0.82).
Data for the
number of phonemes, syllables, clusters, phonological neighbours, word
frequency and phonotactic probability were derived from the CELEX database of
British English and German. Probability was defined as the sum of the log
transformed conditional probabilities of the next phoneme given the previous
phoneme; phoneme position within the onset, nucleus and coda of a syllable was
taken into account in this calculation. Neighbourhood density was computed
based on the single-edit distance definition. A neighbour can be obtained by
substituting, deleting, or adding exactly one phoneme of the target.
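The two lexical measures described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the toy lexicon, the phoneme symbols, the word-boundary marker "#" and all function names are assumptions made for illustration, the bigram counts are not weighted by word frequency, and the position of a phoneme within onset, nucleus and coda is ignored here although the study took it into account.

```python
import math
from collections import Counter

# Toy lexicon (an assumption for illustration): words as tuples of phonemes.
CAT, MAT, HAT, PAT = ("k", "æ", "t"), ("m", "æ", "t"), ("h", "æ", "t"), ("p", "æ", "t")
AT, CATS = ("æ", "t"), ("k", "æ", "t", "s")
ELF, ELK = ("e", "l", "f"), ("e", "l", "k")
LEXICON = [CAT, MAT, HAT, PAT, AT, CATS, ELF, ELK]


def is_neighbour(a, b):
    """True if b results from substituting, deleting or adding one phoneme of a."""
    a, b = tuple(a), tuple(b)
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Substitution: exactly one position differs.
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Addition/deletion: removing one phoneme from the longer form gives the shorter.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighbours(target, lexicon):
    """Lexicon entries at a single-phoneme edit distance from the target."""
    return [word for word in lexicon if is_neighbour(target, word)]


def bigram_model(lexicon):
    """Count phoneme pairs and their left contexts; '#' marks the word start."""
    pair_counts, context_counts = Counter(), Counter()
    for word in lexicon:
        padded = ("#",) + tuple(word)
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts


def log_phonotactic_probability(word, model):
    """Sum of log P(next phoneme | previous phoneme) over the word."""
    pair_counts, context_counts = model
    padded = ("#",) + tuple(word)
    total = 0.0
    for prev, nxt in zip(padded, padded[1:]):
        if pair_counts[(prev, nxt)] == 0:
            return float("-inf")  # sequence unattested in the lexicon
        total += math.log(pair_counts[(prev, nxt)] / context_counts[prev])
    return total


MODEL = bigram_model(LEXICON)
ND_CAT = len(neighbours(CAT, LEXICON))
ND_ELF = len(neighbours(ELF, LEXICON))
PROB_CAT = log_phonotactic_probability(CAT, MODEL)
```

In this toy lexicon cat has five neighbours (mat, hat, pat, at, cats) and elf has one (elk), mirroring the high-ND and low-ND examples given in the Background.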
Logistic
regression was used to examine the effects of ND and PROB on word
repetition accuracy for each subject, first when used as the sole predictor,
and secondly when log transformed word frequency, and the number of syllables,
phonemes and clusters had been entered into the regression. This second
analysis allows us to test for effects of ND and PROB on word repetition
accuracy once the effects of other variables already known to affect production
have been taken into account.
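The per-subject analysis can be sketched in plain Python as follows. This is an illustration only: the abstract does not say which software or fitting procedure was used, so the gradient-ascent fit, the function names and the toy data (invented here, not taken from the study) are all assumptions. Each row of `X` holds the predictors for one item and `y` codes the response as 1 (right) or 0 (wrong); further columns (e.g. log frequency, syllables, clusters) can be added to `X` for the second analysis.

```python
import math


def fit_logistic(X, y, learning_rate=0.1, n_iter=5000):
    """Fit logistic regression weights (intercept first) by batch gradient ascent."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        gradient = [0.0] * len(weights)
        for row, outcome in zip(X, y):
            z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = outcome - predicted
            gradient[0] += error
            for j, x in enumerate(row):
                gradient[j + 1] += error * x
        # Step along the mean gradient of the log-likelihood.
        weights = [w + learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights


def predict(weights, row):
    """Predicted probability of a correct response for one item."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return 1.0 / (1.0 + math.exp(-z))


# Invented toy data: one predictor (number of phonemes) per item.
X_TOY = [[2], [3], [3], [4], [4], [5], [5], [6]]
Y_TOY = [1, 1, 0, 1, 0, 1, 0, 0]
WEIGHTS = fit_logistic(X_TOY, Y_TOY)
```

A negative weight for a predictor indicates that accuracy falls as the predictor increases; a significance test on the weights, as reported in the Results, would need a statistics package and is not shown here.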
Results:
Only a moderate correlation
existed between accuracy on near-homophones for English and German subjects
(r=0.28, p<.001). There was only slight similarity between patients:
chance-corrected correspondence was on average greater within a language
than between languages (9.2% vs. 5.2%; p<.00001).
Language-specific
variables were investigated by correlating differences in accuracy in English
and German with differences in word frequency, phonotactic predictability, lexicality and phonological neighbourhood density. In
simple correlations there was a small but significant effect of lexicality (r=0.114, p=.014, two tailed). Restricting the
analysis to the real word items in both languages, differences in accuracy
between the languages were significantly related to both log-transformed word
frequency (r=0.215, p<.001) and length-corrected phonotactic predictability
(r=0.099, p=.030, two tailed), but not phonological neighbourhood density
(r=0.040, p=0.38). When all three variables were entered into simultaneous multiple
regression, the effects of both differences in word frequency (t (478) =4.84,
p<.001) and phonotactic predictability (t (478) =2.35, p=.019) were
independently significant; the effect of the number of neighbours remained
non-significant (p=0.32).
We investigated the
effects of variables common to both German and English by correlating mean
accuracy for speakers combined across both languages with the number of
phonemes, syllables and consonant clusters in the target word. Simple
correlations showed accuracy in production was related to all three variables
(syllables, r=-0.27, p<.001; phonemes, r=-0.47, p<.001 and clusters,
r=-0.32, p<.001). When these variables were used as predictors in multiple
regression, there remained significant independent effects of the number of
phonemes (t (476) =4.63, p<.001) and the number of clusters (t (476) =3.83,
p<.001) but no effect of the number of syllables (t (476) =0.56, p=0.57).
Discussion:
It appears that, of ND and PROB, only phonotactic probability has a significant effect on
production accuracy in these speakers with output impairment after stroke.
Phonotactic probability exerts a facilitatory effect: words with higher
phonotactic probability are produced more
accurately than words with less probable sequences. In contrast,
phonological neighbourhood density does not have a significant effect on the
accuracy of productions in these German or English speakers.
Furthermore, both German and English speakers were asked to produce words
that were near-homophones in both languages. However, it appears that the two
speaker groups encountered difficulty of production with different words.
Consequently, the significant effect of phonotactic probability on the
production of auditorily presented stimuli probably
does not lie (solely) at the motor execution level. Had it done so, the results
would have shown a strong cross-language relationship, but only a moderate
correlation was observed.
Considering the speech production models of Levelt
et al. (1999) and Dell et al. (1997), possible locations of ND and PROB can be
contemplated. The effect of ND could lie at three different levels in Levelt et al.’s (1999) model. These levels include phonetic
encoding, articulation, and the lemma level. Substitution errors which are
phonologically similar to the target could be caused by an impaired monitoring
system or an impaired lemma retriever; those phonologically similar
substitution errors include phonological neighbours of a target (e.g. map
→ mat, tap, nap). With regard to the potential
locus of PROB, Levelt et al. (1999) are not specific.
However, it appears plausible to describe possible loci of PROB at either the
level of phonological or of phonetic encoding. During the phonological process
phonemes are placed into metrical frames. The syllabary, which
holds the ‘plans’ for frequently occurring syllables in a language,
is accessed during phonetic encoding; this could explain why
frequently occurring sequences are easier to produce.
Like Levelt et al.’s model, that of Dell et al.
(1997) would also place the location of the ND effect at the lemma level.
Phonological neighbours in addition to the target lemma are activated during
the process of lemma access due to interactive activation. If lemma access is
impaired the lemma of a phonological neighbour instead of the target lemma
could be selected (Dell et al., 1997). In addition, the variable of ND could
exert its effect at the level of phonological access in Dell et al.’s (1997)
model. In this case, the correct lemma is accessed but the target phonemes are
substituted by other phonemes through the activation of phonological neighbours
of the target. A possible location of PROB in Dell et al.’s (1997) model could
be the phoneme level at which phonemes are placed into phonological frames
(i.e. structure of a word). The model does not, however, explain why more
frequent sequences should be easier to produce.
Explaining why we did not observe a significant effect of ND on the
speech production accuracy of the individuals with output impairment after
stroke despite strong indications of this being a significant factor in other
studies is clearly complex. One aspect might be that we included impaired
speakers, whereas Vitevitch and Luce (1998,
1999), whose studies saw significant effects of ND, recruited healthy speakers.
Consequently, scoring the participants’ responses as
either right or wrong might not have been sensitive enough to detect more
subtle signs of an effect of ND. Measuring reaction times, or
using an acoustic rather than a perceptual analysis to determine
whether a response was correct, might have helped to
detect a significant effect of ND.
Despite attempting to recruit speakers with minimal aphasia, the impaired
speakers nevertheless displayed aphasia-like symptoms alongside their output
impairment. This co-existence of aphasia could have masked the effect
of ND on the production accuracy of the participants. In addition, we employed
a repetition task and therefore did not tax lexical access. The level of
lexical access might be the potential location of the effect of ND.
Furthermore, the type of measurement used to obtain the ND for the stimuli
might have had an impact on the outcome. The single-phoneme edit distance
measurement was utilised in this investigation. However, a different measure
could have been employed. Such a measure could for instance consider the
frequency of the phonological neighbours or calculate neighbourhood according
to single feature distance rather than whole phoneme (hence mat would no longer be a neighbour of cat, but cad would remain so). The potential effects of these alternatives
on results await further analyses.
References
Baayen,
R. H., Piepenbrock, R. and Gulikers,
L. (1995). The CELEX lexical database
(CD-ROM), LDC,
Bauer, L. (2001)
Morphological Productivity.
Bailey, T. M. and Hahn, U.
(2001). Determinants of Wordlikeness:
Phonotactics or Lexical Neighbourhoods.
Journal of Memory and Language, 44,
568-591.
Luce, P.A. and Pisoni, D.B. (1998). Recognizing
spoken words: The neighborhood activation model. Ear & Hearing,
19,
1-36.
Miller, N., Willmes, K. and De
Bleser, R. (2000). The psychometric
properties of the English language version of the Aachen Aphasia Test (EAAT). Aphasiology,
14, 683-722.
Vitevitch, M.S. and Luce, P.A.
(1998). When words compete: Levels of processing in perception of spoken words.
Psychological Science, 9, 325-329.
Vitevitch, M.S. and Luce, P.A.
(1999). Probabilistic phonotactics and
neighborhood activation in spoken word recognition. Journal of Memory
& Language, 40, 374-408.