Category Archives: Spanish

Using only orthographic features to identify a language

As I talked about in an earlier post, the Latin alphabet is now used to represent more languages than any other script. But the Latin alphabet doesn’t have a character for every sound in the world; it was created to represent Latin, and although it has been adapted somewhat since then, there are still plenty of natural language sounds that have no corresponding Latin character. In order to represent these sounds in the Latin alphabet without creating a new letter (an expensive proposition), we’ve relied on diacritical marks, as seen in ù, ú, û, ũ, ū, ŭ, ü, ủ, ů, ű, ǔ, ȕ, ȗ, ư, ụ, ṳ, ų, ṷ, ṵ, ṹ, ṻ, ǖ, ǜ, ǘ, ǚ, ừ, ứ, ữ, ử, ự, and ʉ, as well as combinations of letters, such as ch.

But sometimes diacritics and multigraphs aren’t used to represent new sounds. Sometimes they’re used for ideological reasons, to create the impression that a language is more distinctive than it may actually be. For example, when planning the Basque orthography, the language moguls decided to use tx to represent the same sound that ch represents in Spanish. Why create a new digraph when an existing one would serve perfectly well? One reason, surely, is that the invented digraph emphasizes that Basque is not Spanish, and that the Basques are not Spanish!

There’s an interesting result of all this: Because many languages have their own characteristic characters and character combinations, it’s fairly easy to tell what language something is written in, even if you don’t know that language. If you see an ü and a sch in some text, it’s probably German, and a ß would give it away for sure.
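
Here is a minimal sketch of that guessing game in Python. The marker table, the function name guess_language, and the simple counting score are all assumptions made up for illustration; this is not a real language-identification method, and a handful of markers will misfire on plenty of real text.

```python
# Hypothetical marker table: characters or letter combinations treated as
# clues for a language. The entries are illustrative, not exhaustive.
MARKERS = {
    "German": ["ü", "sch", "ß"],
    "Basque": ["tx"],
    "Spanish": ["ñ", "¿", "¡"],
}


def guess_language(text):
    """Return the language whose markers occur most often in text, or None."""
    lowered = text.lower()
    # Score each language by the total number of marker occurrences.
    scores = {
        language: sum(lowered.count(marker) for marker in markers)
        for language, markers in MARKERS.items()
    }
    best = max(scores, key=scores.get)
    # If no marker was found at all, refuse to guess.
    return best if scores[best] > 0 else None
```

Calling guess_language("Die Straße ist schön") counts one sch and one ß and so picks German, while a text with no markers at all returns None rather than a wild guess.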

I hypothesized that this would even work with made-up words.

This idea was the basis for an art project of mine, in which I considered the most defining characteristics of several languages: average number of letters per word, most common beginning and ending letters, most frequent letters, and unique or stereotypical multigraphs, characters or punctuation marks. As a result, I’ve created, among others, the most English fake English word.
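
The countable features in that list (average word length, common first and last letters, most frequent letters) can be tallied from any sample text. The sketch below is one way to do it in Python; the function name orthographic_profile, its top parameter, and the choice to split words on whitespace are my assumptions here, not a record of how the art project was actually carried out.

```python
from collections import Counter


def orthographic_profile(text, top=3):
    """Summarize a text sample with a few simple orthographic features."""
    # Split on whitespace and keep only the alphabetic characters of each token.
    words = [
        "".join(ch for ch in token if ch.isalpha())
        for token in text.lower().split()
    ]
    words = [word for word in words if word]
    if not words:
        raise ValueError("text contains no words")

    first_letters = Counter(word[0] for word in words)
    last_letters = Counter(word[-1] for word in words)
    all_letters = Counter("".join(words))

    return {
        "average_word_length": sum(len(word) for word in words) / len(words),
        "common_first_letters": [c for c, _ in first_letters.most_common(top)],
        "common_last_letters": [c for c, _ in last_letters.most_common(top)],
        "frequent_letters": [c for c, _ in all_letters.most_common(top)],
    }
```

Running this over a large sample of one language gives a profile that a made-up word can then be built to match. Multigraphs and punctuation marks are left out of the sketch and would need their own counts.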

The resulting words are interesting enough, but instead of simply printing them here like any other text, I traced their typeset forms and filled them in with watercolor, arousing meditations on mechanical reproduction, handwriting and creation.

See if you can tell which languages the rest of these made-up words are written in.

Why some spelling reforms fail and others succeed

This week on the BBC Radio show Fry’s English Delight, the topic was spelling. (You can listen to the episode until next Monday. Also, thanks to the Virtual Linguist for the heads-up on her blog.)

We like to think of English spelling as absurd and unruly. But it wasn’t always this way: When it was first written down, English enjoyed an almost one-to-one letter-to-sound correspondence. But, as Fry outlines, English spelling received several layers of outside influence throughout history. The Norman French wanted English spelling to be a little more Frenchy (hence mice instead of mys), publishers thought the spellings of certain words should remind readers of their Holy Latin Origins (hence debt instead of det), and the Flemish typesetters were apparently homesick and thought English words should be spelled like Flemish words (hence ghost instead of gost). Some more fun examples can be found in this Mental Floss article.

Today, Fry says, “Our alphabet isn’t exactly fit for representing our language in writing.” He points out inefficiencies, such as our 11 different ways of spelling the /e/ sound: hey, gauge, weigh, pay, staid, lei…

Inefficiencies notwithstanding, our alphabet isn’t so bad. After all, we seem to get along just fine. Some proponents of reform say that English’s wacky spelling slows learning, but it’s not the worst, by far—Japanese, anyone? As I’ve blogged about before, a complicated writing system might even be making us smarter.

Why do spelling reforms work sometimes but fail other times? I think there are two primary reasons: convention and identification.

John Hart, a 16th-century spelling reformer, recognized that even though his proposed system was objectively better and easier to learn than current English spelling, it would seem more difficult to people who were already accustomed to English spelling. Why should they have to relearn everything? Moreover, people need to be able to read things that were written before the reform, so many people would have to learn both forms anyway.

The other main issue with spelling reform is that reformers propose that English spelling should correspond to pronunciation. But, as David Crystal says in the broadcast, “Any spelling reform system which tries to reflect pronunciation… which pronunciation do you use?” Crystal suggests that this very issue may point to the strength of current English spelling: that it works for the many different pronunciations that English has around the world. In other words, if a speaker doesn’t identify with the proposed reformed spelling system, they will reject it. An acceptable reform to the system of English spelling must appeal to all English speakers.

Spelling reform is certainly possible, and it’s happening right now. It’s just that a system-wide, overnight reform is unlikely. Instead, it goes nice and slow, championed by the democracy of English writers rather than any reforming body in particular. For example, the alternate spelling “nite” is popping up more and more—in my opinion, it’s only a matter of time before it’s accepted in more formal arenas.

For an overview of spelling reform in other languages, check out the Spelling reform Wikipedia article. The article on Simplified Chinese characters, an example of a government effort to increase literacy through spelling reform, may be of particular interest.

For a systematized examination of the sense behind English spelling, check out English Isn’t Crazy, by Diana Hanbury King.

Does ALL CAPS always mean shouting?

We don’t think about capital letters too much. We stick them on automatically at the start of sentences and proper names, and that’s about it. We see them on signs and newspaper headlines and generally think nothing of it. But there’s one place where they do call our attention: on the Internet.

ONLINE, IF WE SEE SOMETHING WRITTEN IN ALL CAPS, ESPECIALLY IF WE KNOW IT WAS WRITTEN BY A PERSON (NOT A MACHINE), we tend to think of it as shouting. On some level, this seems intuitive. It’s a visual metaphor: If shouting is talking with a big voice, then shouting in text would be writing with big letters. (Anyone figure out how to whisper?)

But is that the end of the story?

I investigated this question in my master’s thesis, and I found another important motivation for writing something in all caps: semantic highlighting. For example, if I write “I just wanted to say thanks for the PRESENT you got me,” it’s unlikely that I’m shouting the word “present.” Instead, I wanted to highlight that word’s meaning—and often in text boxes the only way to do so is by using capital letters.

Furthermore, I found that there is yet another use of all caps that has nothing to do with shouting: It’s to be colloquial. This usage is most often seen among the elderly, some of whom apparently enable caps lock so that they can see the letters more clearly—after all, they’re bigger.

But in Spain, for example, where I carried out my master’s research, this behavior is much more deeply rooted: Written Spanish uses accent marks to distinguish certain sounds, but for a long time these accent marks were considered optional above capital letters. Therefore, people began to write in capital letters when they wanted to type faster and more casually, leaving off the accent marks. Hence all caps for colloquial writing.
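
The difference between a single highlighted word and a message written entirely in capitals is easy to check mechanically. The sketch below is a rough illustration in Python, not the method used in the thesis; the function name caps_pattern and its three labels are assumptions of mine. Note that it can only describe the pattern: it cannot tell shouting apart from colloquial all caps, since both capitalize the whole message, and it will mistake acronyms for highlighted words.

```python
def letter_count(word):
    """Count the alphabetic characters in a word."""
    return sum(ch.isalpha() for ch in word)


def caps_pattern(message):
    """Label a message by how its all-caps words are distributed.

    Returns "whole message" when every word of two or more letters is in
    capitals, "highlighted words" when only some are, and "none" otherwise.
    """
    # One-letter words such as "I" and "A" are skipped: they are often
    # capitalized anyway and say nothing about the writer's intent.
    words = [w for w in message.split() if letter_count(w) > 1]
    caps = [w for w in words if w.isupper()]
    if not caps:
        return "none"
    if len(caps) == len(words):
        return "whole message"
    return "highlighted words"
```

For the example sentence about the PRESENT, this returns "highlighted words", while the same sentence typed with caps lock on returns "whole message".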

This all goes to show that there can be multiple explanations for even the things we most take for granted—and that things, especially visual conventions, change across cultures.