Everyone knows what it sounds like to hear native speakers of languages we don’t know prattling away incomprehensibly: their voices become a wall of sound, all texture and cadence, as if they were making some sort of music. This effect tends to persist even with languages we are studying: first we can’t recognize a thing; then we start to snag the odd word here and there; and then whole phrases come into focus. But it can be quite a while before it becomes possible to actually follow what a native speaker is saying. I remember the first time I found myself actually eavesdropping, picking up snatches of other people’s conversations: I was a 19-year-old exchange student, sitting on a bus in Münster, Germany. It was a triumphant moment.
And what about the fact that speakers of certain languages (e.g. Spanish and Japanese) seem to talk more quickly than speakers of others? A recent article published in the journal Language (put out by the Linguistic Society of America) and written up by Jeffrey Kluger in TIME Magazine describes a study in which researchers measured the syllables per second spoken by native speakers of several different languages and correlated these data with the average “information density” of each language. “Denser” languages are those that carry, on average, more information per syllable. Syllables that constitute entire words (“truck,” “love,” “hate”) are high in information, while some syllables in longer words contain almost no information at all (think about the middle “i” in “intelligence”). So the more low- or no-information syllables a language uses, the more quickly it is likely to be spoken. An interesting corollary is that pronunciation trumps etymology in this calculation. The French word “facile” contains only two syllables (fa-cile), while its Italian counterpart “facile” (fa-ci-le) has three. So one would expect Italian to have a lower average information density than French and therefore to be spoken more quickly, which it is. So does this mean we communicate roughly the same amount of information in the same amount of time no matter what language we are speaking? According to this study, that may well be the case.
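For readers who like to see the arithmetic, here is a minimal sketch of the trade-off as I understand it: the information conveyed per second is the speech rate multiplied by the information carried per syllable. The function name and the two imaginary languages below are my own inventions, and the numbers are made up purely for illustration; they are not figures from the study.

```python
def information_rate(syllables_per_second, info_per_syllable):
    """Information conveyed per second = speech rate x information density."""
    return syllables_per_second * info_per_syllable


# Hypothetical values, invented for illustration only (not data from the study).
# Each entry is (syllables per second, information per syllable in arbitrary units).
languages = {
    "dense_slow": (5.0, 1.0),     # fewer syllables, each carrying more
    "sparse_fast": (8.0, 0.625),  # more syllables, each carrying less
}

rates = {
    name: information_rate(rate, density)
    for name, (rate, density) in languages.items()
}

for name, value in rates.items():
    print(f"{name}: {value} units of information per second")
```

With these invented numbers the two imaginary languages come out even, which is the pattern the study suggests real languages approximate: a faster syllable rate compensating for a lower density.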
I wonder whether anyone’s done a study on the comparative lengths of translated texts. Is there a way to explain how and why languages expand and shrink in translation? Translating from German to English, I’ve noticed that my English translations are almost always shorter than the originals in terms of overall length (i.e. number of characters) but at the same time longer in terms of word count. German tends to use fewer and longer words than English. But this doesn’t affect the two languages’ relative information density, since the longer German words tend to be compounds of shorter words. When you’re counting syllables, it doesn’t matter whether the individual words are short or long. And Latinate words tend to have more syllables than those with Germanic roots. What about words that derive from ancient Greek? What effect does their preponderance or scarcity in a language have on average word length and information density? If anyone knows a good account of this, I’d love to hear about it.
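In the meantime, the comparison I’ve been making by eye is easy to sketch in code. The helper below is hypothetical (the name `length_profile` is mine), and it makes a simplifying assumption: a “word” is anything separated by whitespace, which ignores punctuation and hyphenation. The sample pair is a single compound noun and its usual English rendering, not a real translated passage.

```python
def length_profile(text):
    """Return character count, word count, and average word length for a text.

    Assumption: words are whatever str.split() separates on whitespace.
    """
    words = text.split()
    average = sum(len(word) for word in words) / len(words) if words else 0.0
    return {
        "characters": len(text),
        "words": len(words),
        "avg_word_length": average,
    }


# Illustrative pair only: one German compound and its English equivalent.
german = length_profile("Geschwindigkeitsbegrenzung")
english = length_profile("speed limit")

print("German: ", german)
print("English:", english)
```

Here the English version has more words but fewer characters, the same pattern I notice in my own translations. Running the function over a full original and its translation would show whether the pattern holds across a whole text.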