What You See is What They Wrote? Thoughts on Latin Spelling

Wolfgang de Melo

Contrary to popular belief, we have a very good idea of what Classical Latin sounded like; Nicholas Swift’s piece on this site explains how we know what we know. But, again contrary to popular belief, we often do not know equally well what Classical Latin actually looked like. I am writing this piece because every single year, without fail, there is at least one student who reprimands me when I discuss archaisms (ancient forms, words or phrases) in the dramatic poet Plautus and “overlook” forms like quom “when” (cum in later Latin) or servos “slave” (nominative singular, later servus); and, to make matters worse, in my own edition!

When my students encounter forms like quom or servos, they often think that these ‘archaic’ spellings reflect what Latin sounded like back in 200 BC, and it comes as a big surprise to them when they learn that already in Plautus’ time they were pronounced cum and servus, and as an even bigger surprise when they find out that spellings like quom and servos were still perfectly normal in Cicero’s time. My students have been misled by what they see in editions, and some of these misconceptions only disappear when they start reading inscriptions. Since orthography (the technical term for “spelling”) is the much-neglected stepchild of philology, I want to take this opportunity to reflect on Latin orthography, and orthography more generally.

In this piece, I want to answer two larger questions. First, what are the main differences in orthography between original documents of the Classical period (1st century BC and 1st century AD) and critical editions of authors belonging to that period? What is the situation in other periods? And how did spelling conventions change in antiquity? And second, why do we not stick to ancient spelling conventions when editing ancient authors? What should an editor do? In the course of our journey, I will also talk about how the Latin alphabet came about, and I will ask what makes for a great script. Let us begin with this last question.

The spread of the Latin alphabet: dark green countries use the Latin alphabet for their official languages; light green countries for co-official languages.

What makes for a great script? The phonemic principle

Most people assume that a spelling system with a clear correspondence between sounds and letters is ideal. Sidney Allen, too, took this idea for granted in his Vox Latina (1965): if spelling tells us how to pronounce a specific word precisely and unambiguously, then the job is done. However, that means that we need to know, first of all, which sounds matter in a language.

In the ‘Received Pronunciation’ of English – the kind of English most commonly used by BBC presenters – the l-sound in late is quite different from that in tail: the former is called “clear” and the latter is called “dark”, a metaphor that is as intuitive as the one used in Latin, where these two sounds also occurred, and where they were referred to as exile, “thin”, and pingue, “fat”, respectively. When we pronounce a dark l, the back of our tongue is raised higher than for the clear variety. This is an important phonetic distinction in English and Latin, and a failure to make it will make you sound unusual, even though intelligibility is not seriously impeded.

The International Phonetic Alphabet, on which further details are given here.

In southern British English, the variety which Received Pronunciation is based on, the clear l occurs before vowels and the dark l elsewhere. The two sounds never contrast, which is why we speak of “complementary distribution”. Other phonetic distinctions are contrastive: compare the vowels in bitter, better, batter, butter. These four vowels constitute four “phonemes”, and they need to be kept distinct if we want to remain intelligible. But clear l and dark l are what we call “allophones” of the same phoneme l. A writing system with a clear correspondence between sounds and letters is phonemic in nature, but not fully phonetic, because for English there is little point in specifying each and every time when to use which l: the distribution of the two sounds is entirely predictable.
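
Because the distribution is predictable, it can be stated as a rule. The following is a minimal sketch in Python, assuming rough phonemic transcriptions as input; the function name `realize_l` and the small vowel inventory are illustrative assumptions, not a full description of English.

```python
# Vowel symbols used in the rough transcriptions below (an illustrative subset).
VOWELS = set("aeiouɪʊɛæʌɒɔɑə")


def realize_l(phonemes: str) -> str:
    """Spell out the allophones of /l/: clear [l] before a vowel, dark [ɫ] elsewhere."""
    out = []
    for i, ch in enumerate(phonemes):
        if ch == "l":
            next_is_vowel = i + 1 < len(phonemes) and phonemes[i + 1] in VOWELS
            out.append("l" if next_is_vowel else "ɫ")
        else:
            out.append(ch)
    return "".join(out)


print(realize_l("leɪt"))  # late: the l stands before a vowel
print(realize_l("teɪl"))  # tail: the l is word-final
```

Since the rule needs nothing beyond the phonemic string, a separate letter for the dark l would add no information.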

As a side note, we need to make a phonemic analysis for any language that we are devising a script for. Even if two languages had exactly the same sounds phonetically, these could still be divided differently into phonemes. Let me illustrate: the Spanish word for “finger” is dedo (from Latin digitus) and the first d is pronounced similarly to the d in English dear, but the second one resembles the th in English other. In English, these are two separate phonemes (compare udder and other), but in Spanish they are allophones and hence no distinction is made between them in writing.

Now that we have the phonemes, we can match them up with letters. Finnish does this perfectly, with a neat phoneme-to-letter correspondence. Italian writing is also near-phonemic, but with some minor complications: for historical reasons, the phoneme /k/ is not always spelled identically; /k/ is written c before a, o and u (personal name Carina), but ch before i and e (personal name Chiara), because c before i and e represents a different phoneme, /ʧ/ (cento, “one-hundred”, with an initial sound similar to English chin). But these are minor complications, and Italian spelling still allows us to predict pronunciation with a high degree of accuracy (although some more complications await us below).
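
The Italian convention just described can be sketched as a small reading rule. This is only an illustration in Python, under the assumption that we look at a single written c at a time; the function name `c_value` is hypothetical, and further complications (such as sc) are deliberately left out.

```python
def c_value(word: str, i: int) -> str:
    """Return the phoneme written by the letter c at position i of an Italian word.

    Covers only the rule in the text: ch is /k/, c before e or i is /ʧ/,
    and c anywhere else is /k/. Assumes word[i] is the letter c.
    """
    following = word[i + 1 : i + 2]  # empty string if c is the last letter
    if following == "h":
        return "k"
    if following in ("e", "i"):
        return "ʧ"
    return "k"


for name in ("carina", "chiara", "cento"):
    print(name, c_value(name, 0))
```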

The organs of speech (image courtesy of Yu-Xian Claire Huang).

Problems with purely phonemic scripts

As I said above, most people think of such phonemic or near-phonemic scripts as the gold standard, whereas English orthography is considered an almighty mess. English orthography used to be much more phonemic than it is now, but sound change happened, and orthography did not catch up. However, that does not necessarily make English orthography bad.

Phonemic scripts are perfect for learners, whether these are children learning to write their native language or adults learning a foreign language. Phonemic scripts are also wonderful for linguists who need to consult data from languages they are not particularly familiar with. But the situation is different once we are competent language users. If you read Antigone, chances are that you are fluent in English, whether it is your first language or not. And that means that you do not spell out every word letter for letter, but that you take in word shapes as a whole, much like a competent reader of Chinese takes in characters as a whole without having to divide them into radicals and further additions. Incidentally, this is also why proofreading is so difficult: if you are reading for content, your brain will auto-correct any spelling mistakes you have made; and if you are proofreading properly, you will often find that you have gone through an entire page without actually knowing what you have read.

Prominent characters in the Chinese script.

As proof of the brain’s auto-correction skills, let me show you a scrambled sentence that has been circulating on the internet for some years now:

Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be at the rghit pclae.

I do not need to unscramble this – you can understand it without slowing down much. Written language is not simply a representation of speech. Often, its structures are different, and we all come across words whose meanings we know even if we do not know how to pronounce them. One of my British undergraduates two decades ago used the word scion in a translation, and used it correctly, but believed that scion starts like ski and rhymes with goon, simply because he had only ever encountered this archaic word in its written form.
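
For readers who want to try this on their own sentences, the scrambling used in the example above is easy to reproduce. The following Python sketch is an assumption about how such text can be generated, not the method used for the circulating sentence; the names `scramble_word` and `scramble_text` are illustrative.

```python
import random
import re


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle in words of up to three letters
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text: str, seed: int = 0) -> str:
    """Scramble every run of ASCII letters; spaces and punctuation stay where they are."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


print(scramble_text("According to research at Cambridge University"))
```

The output varies with the seed, and a shuffle may occasionally leave a short word unchanged.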

Purely phonemic scripts can also have some, admittedly minor, downsides. To begin with, whose phonemes do we choose? In a language spoken over a large area, there will be different dialects, each with a slightly different sound system. Picking one over another is always a partly political decision. Italian spelling is near-phonemic, with a few systematic exceptions, and these are very sensible. Standard Italian is the Italian of Tuscany, minus some purely local features. Tuscan is a good choice for a standard dialect, not only because Dante, Petrarch and Boccaccio hailed from the region, but also because as a central dialect it is easily intelligible to northerners and southerners alike. In Tuscan, there is a phonemic distinction between e and ɛ, as in pesca (with e), “fishing” – compare Latin piscis, “fish” – and pesca (with ɛ), “peach” – compare Latin (mala) persica, “Persian apple”, “peach”. In pronunciation, these are distinct, in spelling, they are alike; and yet this ‘underspecification’ in spelling is an advantage. The reason is that not all dialects make this distinction, and among those which do, the two sounds are often distributed differently from Tuscan. In fact, my Italian students sometimes ask each other how they pronounce these two words, as a way of finding out which dialects they speak. If the Tuscan distinction in pronunciation were enforced in spelling, it would perhaps help the foreign learner, but it would make life more difficult for many other Italians, and with few benefits: the ‘functional load’ of this distinction is very low because there are very few minimal pairs, that is, words which differ from each other in only one sound.

A portrait of Chaucer in Thomas Hoccleve’s Regiment of Princes, 1410s (London British Library Harley MS 4866, f. 88r).

Historical spellings, rather than phonetically accurate ones, can also help us with older texts. Most speakers of English can read Chaucer with relatively little training. They may not know how English was pronounced back then, but because spelling has changed so little, they still can access his texts. If we go for phonemic spelling, we lose this access, and we need to update our orthography regularly. During the Cultural Revolution, the Chinese government toyed with the idea of replacing traditional characters with phonemic Romanizations; this would have sped up the literacy drive enormously. Ultimately, the idea was abandoned because it would have meant losing access to centuries of literature. By contrast, a language which does update its orthography very regularly is Dutch: every few years, a ‘Green Book’ appears to inform people about which words are spelled differently from before.

Other considerations are less significant, but deserve to be mentioned nevertheless. Phonemic scripts, naturally, do not allow us to distinguish between homophones (words that sound alike). This may not matter much in English, where but and butt, or or and ore, would be unlikely to be mixed up within a sentence, but our current convention of writing both Latin quom, “when”, and cum, “with”, as cum does make life more difficult for learners.

Similarly, with phonemic spelling, ‘phoneme consistency’ can come at the expense of ‘morpheme consistency’; morphemes are the smallest meaningful units in language: cats consists of two morphemes, cat and the plural marker –s. The final sounds in cats and dogs are pronounced s and z, respectively, but there is something to be said for consistently writing plurals in the same way. In Dutch, final stops (p b t d) and fricatives (f v s z) are devoiced consistently in pronunciation, so that for example b becomes p and v becomes f; but this does not happen consistently in spelling. Final fricatives are pronounced and written as voiceless, hence plural druiven, “grapes”, next to a singular druif; but next to handen, “hands”, one writes hand, even though it is pronounced /hant/.
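
The Dutch mismatch between pronunciation and spelling can be made explicit in a short sketch. It is written in Python and treats ordinary letters as stand-ins for sounds, which is a simplification; the function names and the use of bare stems such as druiv- and hand- are assumptions made for illustration.

```python
# Voiced final sounds and their voiceless counterparts.
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s"}
FRICATIVES = {"v", "z"}


def pronounce_singular(stem: str) -> str:
    """Pronunciation: every final stop or fricative is devoiced."""
    last = stem[-1]
    return stem[:-1] + DEVOICE.get(last, last)


def spell_singular(stem: str) -> str:
    """Spelling: only final fricatives are written voiceless; stops keep the stem's letter."""
    last = stem[-1]
    if last in FRICATIVES:
        return stem[:-1] + DEVOICE[last]
    return stem


for stem in ("druiv", "hand"):
    print(stem, spell_singular(stem), pronounce_singular(stem))
```

For druiv- the two functions agree (druif), so spelling follows the phonemes; for hand- they diverge (hand against hant), so spelling follows the morpheme.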

The Dutch fruit crib-plate: still from video courtesy of Dinolingo.

The issue matters for Latin as well. Quintilian, whom I discuss below, tells us that if we follow logic, we write obtineo, “I obtain”, but the ears perceive p rather than b. Inscriptions show us that spelling was not as consistent as one might think (spellings like optineo are not uncommon), but the point is that if we write obtineo, we want to maintain morpheme consistency because in isolation the first element is ob, and we sacrifice phoneme consistency because in pronunciation b devoices if a voiceless sound follows.

Formal inscriptions tend towards morpheme consistency, but there are other factors at work as well. One such factor is the prefix involved: ad is spelled with assimilation (e.g. afferre rather than adferre) more commonly than ab is. Modern editions and dictionaries have not standardized such spellings; if I have to look up a word like adfero / affero, “bring to”, in a dictionary, I cannot predict whether I will find it under adf- or aff-. But modern editions and dictionaries have standardized elsewhere, at the end of words; thus, we normally write urbs, “city”, nowadays, even though it was pronounced urps and in antiquity both spellings were found. Incidentally, such problems are not confined to Latin-derived scripts, but can occur in any system that is not purely logographic (in logographic systems words are represented by stylized images). The native Korean script, Hangul, devised under King Sejong in the 15th century, is sound-based, but in compound words, morpheme consistency is respected. Thus, one writes kuk-min, “nation people” (국민), but in pronunciation it is /kuŋmin/ because the stop k turns into a nasal before another nasal (ŋ is the symbol for the final sound in sing).

Statue of Sejong the Great, Gwanghwamun Plaza, Seoul, South Korea.

A related issue is what to do with so-called ‘weak forms’. Should we always write is not, or should we follow pronunciation and write isn’t when we pronounce it thus? In other words, not has a reduced, ‘clitic’ by-form in pronunciation. In spelling, however, formal writing prefers the full form, while informal writing prefers the reduced one. English also has weak forms of am, are and is (I’m, you’re, it’s). Latin, too, has such weak forms of esse, “to be”: next to es and est, we find reduced s and st, which can ‘clitically’ attach to a preceding word if specific syntactic and phonological conditions are met. In some ancient manuscripts, we find spellings like actumst or erust, but they occur side by side with the full actum est, “it is done”, and erus est, “he is master”. Inscriptions show variation, but in medieval manuscripts, this kind of ‘prodelision’ is often abolished in favour of the full forms.

Sound-based scripts also run into problems when they have to handle loanwords from languages with different sound systems and spelling conventions. Should spelling be adapted? Should we tell people how to pronounce such words? Or is it preferable to stick to foreign spellings, the way English does for French words, and let people decide whether or not they should make concessions to foreign pronunciation? A common word for “gift” in Dutch comes from French cadeau. This is most commonly written as cadeau, but an adapted (and still substandard) spelling kado also exists. But what to do if we want to make this a diminutive in –tje? The official spelling for a “little gift” is cadeautje, but such a hybrid form looks more awkward than the adapted kadootje. Again the issue is not irrelevant for Latin. Greek words in early inscriptions were not only transliterated, but also entirely nativized in spelling, so that the aspirates φ, θ, χ were rendered as p, t, c, and υ and ζ were rendered as u and s. But this changed under the influence of the Scipionic Circle in the 2nd century BC; [1] Greek became a prestige language, and suddenly people started to write ph th ch as well as y and z (though not always in the right places!).

Engraving of Scipio and his circle, 1846.

A final consideration, less relevant for Latin, is that when we develop an orthographic system for a hitherto purely oral language, we should ideally make these conventions similar to those for languages already used by the community. I had never thought about this much until a doctoral student, Vijay D’Souza, now the director of the Northeastern Institute of Endangered Languages in Assam, told me how he had developed a Latin-derived orthography for Hrusso Aka. Some scholars devise spelling conventions which are phonemic and easy to type on a computer because they avoid cumbersome diacritics, but in so doing they give sound values to letters like j which are counter-intuitive to a speech community already literate in English (and often also Hindi and Assamese, but these use different scripts).

A related problem exists in ancient Italy, though not for Latin. Many modern languages are written in more than one script (e.g., Cyrillic and Latin in the case of Serbian). Similarly, in ancient Italy, Oscan was written in a native alphabet derived from Etruscan, in the Latin script, and in an Ionic Greek alphabet. The native script is near-phonemic. But when Oscan is written with Greek letters, problems arise. For example, how should the Oscan diphthong /ɛi/ be rendered? At the relevant period, ει is a monophthong in Greek. The Oscans sometimes go for ηι, but also for ει, thereby using Oscan-script conventions in a Greek alphabet. On this note, let us turn to the historical side.

The languages of Iron Age Italy.

A short history of the Latin alphabet

The ancestry of the Latin alphabet lies in the Near East. The Phoenicians developed a script that could render the consonantal phonemes of the language neatly, but failed to render its vowels, which we can only reconstruct by comparison with other Semitic languages and through Phoenician inscriptions in other alphabets, including the Latin one. But why did the original Phoenician alphabet not have vowel signs? The Phoenician letter names are names for things, like ’alp,[2] “ox”, or bet, “house”. They indicate the first sound in this word. But in Phoenician, as in some other Semitic languages, words have to begin with a consonant, and this explains why vowels are not written.

In its long history, Phoenician kept a conservative orthography, but sound changes happened, and when certain consonants got lost in certain positions these signs began to develop a double duty, as consonants (preserved elsewhere) and as vowels. The same thing happens in Hebrew: the name Sarah ( שרה, originally meaning “princess” and written srh) was at some point pronounced with h, but when h disappeared in this position in pronunciation, the symbol came to mark ā as well.

The Natan-Melech/Eved Hamelech bulla, a seal of the 6th cent. BC inscribed with the Phoenician script (found in 2019 in the City of David, Jerusalem, Israel).

The Greeks took over the Phoenician script, hence alpha and beta from ’alp and bet. A West Greek form of the alphabet, in which X stood for /ks/ rather than /kh/ as in Attic, was adopted by the Etruscans, who passed it on to the Romans. Phoenician had three stops produced at the back of the mouth, voiced g, voiceless k, and emphatic, uvular q. The Greeks turned these into Γ, Κ, Ϙ, gamma, kappa, qoppa, although the first of them was called gemma in West Greek. Ϙ was a dead letter: there was no use for it, but since we learn the alphabet as a list, it was passed on within it, and was also kept as a numeral (all Greek letters can indicate numbers as well). The Etruscans took on these letters as C, K, Q, but since they did not have voiced stops or an o, they were presumably pronounced ke, ka, ku. But what to do with three letters of the same value? Because of the letter names, C was used before e and i, K before a, and Q before u. This convention was rather pointless, and the Etruscans gave it up, but not before passing on the alphabet to the Romans, and this convention along with it. The Romans eventually gave it up, too, using QV (qu) for the sound kw and abolishing K except in old-fashioned abbreviations like K for the name Kaeso or KAL for the Kalendae, “kalends” (the first day of the month). Funnily enough, K is often used for Karthago / Carthago, a Phoenician city which in the native language had q (qart ħadašt means “new city”)!
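
The letter-name convention amounts to a simple lookup on the following vowel. Here is a minimal Python sketch of it; the names `ARCHAIC_VELAR` and `velar_letter` are illustrative, and the function returns `None` for any context that the description above does not cover, rather than guessing.

```python
# Which of the three letters was written for a k sound, by the vowel that follows.
ARCHAIC_VELAR = {"e": "C", "i": "C", "a": "K", "u": "Q"}


def velar_letter(next_vowel: str):
    """Return C, K or Q according to the following vowel, or None if unspecified."""
    return ARCHAIC_VELAR.get(next_vowel.lower())


for vowel in "eiau":
    print(vowel, velar_letter(vowel))
```

On this rule the K of Kalendae is regular, since an a follows.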

But Latin did have a g sound, and using C for it, as in the old abbreviation C for Gaius, did not feel right. In the end, the Romans got rid of Z in seventh position, which would only have been useful for Greek words, and replaced it with G, a C with a diacritic.[3] However, in the course of the 2nd century BC, Greek became a prestige language, and people suddenly felt a need for the Greek letters Y and Z to indicate non-Latin sounds. These were re-borrowed and added at the end of the alphabet. In the early period, no one cared too much about Greek pronunciation or spelling, but in this later period, people also started to write Greek aspirate consonants as ph, th, ch.

The Classical Latin alphabet.

The Latin alphabet is a reasonably good fit for the sounds of the language, but there are some problems. For instance, vowel and consonant length is phonemic, compare anus, “old woman”, ānus, “ring, anus”, and annus, “year”. In Plautus’ time, geminate (doubled) consonants began to be written as such, but indications of vowel length were never systematized. The obvious thing to do would be to write long vowels double, just as long consonants are written double. But this only happened for a short period of time in the 2nd century BC. That convention is ascribed to the poet Accius (170-86 BC), but presumably it was the result of influence from Oscan, a language related to Latin and spoken in the south of Italy: in Oscan, vowel length is phonemic only in initial syllables (which are also stressed), and here it is rendered by double spellings in the national alphabet. In Latin, the accent does not regularly fall on the initial syllable, and vowel length is phonemic in every position; yet double spellings occur only in initial syllables, as in paastores, “shepherds” (inscription from Polla, CIL I² 638), which actually has three long vowels and an accented penultimate syllable.
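
The mismatch between this convention and Latin phonology can be shown with a small sketch. It is written in Python and assumes that the word is supplied already divided into syllables and marked with precomposed macron letters; the division pās-tō-rēs and the function name `double_vowel_spelling` are my own assumptions for illustration.

```python
# Long vowels (precomposed macron letters) and their plain counterparts.
MACRON_TO_PLAIN = {"ā": "a", "ē": "e", "ī": "i", "ō": "o", "ū": "u"}


def double_vowel_spelling(syllables: list[str]) -> str:
    """Write a long vowel double in the first syllable only; elsewhere drop the length mark."""
    out = []
    for index, syllable in enumerate(syllables):
        for ch in syllable:
            if ch in MACRON_TO_PLAIN:
                plain = MACRON_TO_PLAIN[ch]
                out.append(plain * 2 if index == 0 else plain)
            else:
                out.append(ch)
    return "".join(out)


print(double_vowel_spelling(["pās", "tō", "rēs"]))
```

Only the first of the three long vowels surfaces in the spelling, so the length of the other two remains unmarked.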

The “Lapis Polla”, an inscription of 132 BC (CIL I² 638) discovered in Polla, near Salerno, Italy.

Other methods of indicating vowel length existed. The one which has survived to this day is the apex, a diacritic placed on top of a vowel. More idiosyncratic is the I longa, an elongated I that sticks out at the top and the bottom, but has no equivalent for other letters. And finally, there are historical spellings: when ei had merged with inherited ī, the spelling ei could now be used to indicate a long vowel. This was sometimes done very historically, and sometimes in the ‘wrong’ places simply to indicate vowel length. It is quite common to find the i / ei distinction in final syllables, but not elsewhere, and this makes sense: it is easy enough to learn a handful of endings and apply historical spellings to them, but it is quite a feat to learn historical spellings for thousands of words.

In the mid-1st century AD, the Emperor Claudius (reigned 41–54), by and large a serious scholar, noticed three problems with the Latin script. First, as in Greek, we have a single letter for /ks/ (x / ξ), but we lack a single letter for /ps/ corresponding to ψ. Second, the letter u stands for both a vowel and a semivowel (in antiquity, volui, “I wanted”, and volvi, “I turned”, were written the same way because the letter v for the semivowel is a medieval invention). And third, in words like optumus / optimus, “best”, the vowel in the second syllable is neither u nor i, but something in between. He duly invented three letters to remedy the situation, but these are only found in court inscriptions of the period, clearly as an attempt to flatter the Emperor. Tacitus (Annales 11.13–14) cannot refrain from mockery, and rightly so: from a phonemic perspective, x is as useless as a letter for ps would be, since these can be written as digraphs; and the sound in optumus / optimus never contrasts with u or i, but is just an allophone. Only the issue of the semivowel u could be taken seriously, but the underspecification hardly matters because the functional load is minute: I cannot think of a minimal pair other than the one I already mentioned, trisyllabic volui, “I wanted”, and disyllabic volvi, “I turned”.

Bust of Claudius, AD 41–54 (National Archaeological Museum, Naples, Italy).

Ultimately, though, I have to admit that I find bad spelling far more interesting than spelling that sticks to the norms. When I see, in comments in online newspapers, how some people spell hypocrisy as hypocracy (as if formed like democracy), I smile. It reveals that most people do not understand the etymology of these words, and even more importantly, if someone had to reconstruct the pronunciation of current English in a few hundred years’ time, these misspellings would be invaluable, whereas ‘correct’ spellings tell us very little. Much the same can be said of Latin. We are now ready to contrast inscriptions and texts of the same period that have come down to us in a manuscript tradition.

Ariadne and Bacchus Sarcophagus Relief, AD 210-20 (Getty Villa, Los Angeles, CA, USA).

Snippets from two inscriptions and two attempts at transposition

The Senatus consultum de Bacchanalibus is a decree that was written in 186 BC, two years before Plautus’ death, and hence still in the archaic period. Right after the preamble, the text begins thus:

DE BACANALIBVS QVEI FOIDERATEI ESENT ITA EXDEICENDVM CENSVERE

About the Bacchanalian celebrations, they decreed that one ought to announce as follows to those who are allied.

Note that geminate consonants are not written; Greek aspirates are not rendered as such; and the old diphthong ei is still in use, as is oi rather than oe. In a more modern orthography, this would read as follows:

De Bacchanalibus qui foederati essent ita edicendum censuere.

Reproduction of the Senatus consultum de Bacchanalibus of 186 BC from the bronze original (discovered in Tiriolo, Italy, in 1640 and now in the Kunsthistorisches Museum, Vienna, Austria).

Plautus’ Pseudolus was staged in 191 BC. It begins thus (line 3):

Si ex te tacente fieri possem certior…

If I could get the information out of you while you’re silent…

But if Plautus followed orthographic conventions similar to those of the Senatus consultum, the text could look like this:

Sei ex ted tacente fierei posem certior…

Clearly an unusual look! The Senatus consultum is, however, deliberately archaizing, so the Plautine text may not have looked like this. Note also that this archaizing tendency is not always successful: we find the phrase IN OQVOLTOD, “in secret” (Classical in occulto), but there was never a kw sound in this word; clearly, genuine -quo- had already become -co- / -cu-, and we are dealing with a hypercorrection here. That also means that all instances of quom on this document were already pronounced as cum! Spellings like SERVOS and QVOM remained long after the sound change so as to avoid a sequence VV, which could be misread as M or N.

Comedic mask depicted in a 3rd-cent. AD mosaic, Sousse, Tunisia.

And here is the beginning of Virgil’s Fifth Eclogue, written around 38 BC:

Cur non, Mopse, boni quoniam convenimus ambo,
tu calamos inflare levis, ego dicere versus,
hic corylis mixtas inter consedimus ulmos?

Mopsus, since we have met and we are both skilled, you at blowing through thin pipes and I at speaking verses, why have we not sat down here among the elms mixed with hazels?

We can compare this with poetry by Cornelius Gallus found in Qaṣr Ibrīm in Egypt, on a papyrus which in all likelihood goes back to 50–25 BC:

Fata mihi, Caesar, tum erunt mea dulcia, quom tu
maxima Romanae pars eris historiae,
postque tuum reditum multorum templa deorum
fixa legam spolieis deiuitiora tueis.

Then will my fate be sweet to me, Caesar, when you are the most important part of history, and when I read of the temples of many gods being richer after your return, on account of being hung with your trophies.

This is quite modern orthography. Maxima is used rather than maxuma, which would have been the more normal form just a few decades before and was still common in this period. We can see tuum rather than tuom, which was still normal for Quintilian’s teachers (see below); in view of this, quom rather than cum is presumably used in order to keep these two homophones distinct, as we do with but and butt. The spelling of ī is interesting, unless this is simply an accident, given how little we have of this poet. In spolieis and tueis, the diphthong is used in endings, historically accurately, but endings are easy to learn. On the other hand, deiuitiora and fixa both have long vowels in the initial syllables, and it is remarkable that both are written historically correctly, as the former used to have a diphthong and the latter used to have a long monophthong. And finally, mihi here scans as an iamb (light-heavy) and could have been written mihei, which is the inherited form, but in most instances mihi consists of two light syllables, and the spelling could simply have been taken over from the more common form.

The Gallus papyrus, discovered in Qaṣr Ibrīm, Egypt, in 1978.

What would Virgil look like in this kind of orthography? Here is an attempt:[4]

Cur non, Mopse, bonei quoniam conuenimus ambo,
tu calamos inflare leuis, ego deicere uersus,
heic coryleis mixtas inter consedimus ulmos?

I have changed boni, dicere, hic and corylis to bonei, deicere, heic and coryleis. I am less sure about the accusative plural leuis, where ī is inherited, but where we could also find ei through conflation with the nominative plural (which was originally leuēs, but nominative and accusative plural got muddled up early on). And it is quite possible that Virgil would have written quor rather than cur, but again this is uncertain. But what do the Romans themselves tell us about orthography?

The opening of Virgil’s First Eclogue, Bern Burgerbibliothek 165, 9th cent., which can be read here.

Roman writers on orthography

Prescriptive comments on orthography are already found in the 2nd century BC. Lucilius (died 103 BC) stated that for second-declension nouns in -us, the genitive singular should be written -i and the nominative plural, -ei (quoted in Quint. Inst. 1.7.15). These are historically correct spellings, matching Plautine pronunciation, but in Lucilius’ time, old -ei had merged with inherited -ī in pronunciation, and some people used the phonemic spelling -i for the nominative plural, or the hypercorrect spelling -ei for the genitive singular.

In the 1st century AD, Quintilian took a broader, less prescriptive view. In Inst. 1.7, where he focuses on orthography, he states that the apex should not be overused; vowel length is phonemic, but for most words he is happy with leaving things vague (“underspecification”), arguing that the apex comes into its own only when it distinguishes minimal pairs like malus, “bad”, and mālus, “apple tree”. He reports the view that one should write exspecto, “I await”, even though in this compound from ex and specto one would only hear a single s; this is a case where morpheme consistency matters. But Quintilian himself does not distinguish in spelling between cum, “with”, and quum, “when” (no longer quom!). He considers it pedantic to write quotidie, “daily”, in line with its etymology, rather than cotidie; and he also thinks it is pedantic to write quicquid, “whatever”, a phonemic spelling, instead of quidquid, which could be misinterpreted as two instances of quid, “what”. For obtinuit, “he obtained”, Quintilian argues that ‘logic’ (ratio, morpheme consistency) demands -b-, while the ears hear -p-.

In the same section, he also comments on outdated spelling conventions. Lucilius’ distinction between -i and -ei is considered superfluous; historically accurate spellings like caussa, “cause”, and diuissio, “division”, are ascribed to the Ciceronian period (incidentally, the phonetic change of -ss- to -s- after diphthong or long vowel predates Cicero); and he points out that his teachers would still write nominative singular servos, “slave”, and cervos, “stag”, whereas he himself uses servus and cervus.

Quintilian addresses the crowd: frontispiece to Pieter Burman’s edition of the Institutio Oratoria (Leiden, 1720).

What we can see in Quintilian is a general preference for phonemic spelling, but with a high tolerance for underspecification among vowels, where the apex is not used to indicate just any long vowel, but only long vowels that result in minimal pairs. He is generally averse to historical spellings, even where they could differentiate homophones. The one area where he is happy to deviate from the phonological ideal in a major way is when it comes to morpheme consistency, where preference is given to morphemic spellings.
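Quintilian’s policy on the apex can be illustrated with a minimal sketch. It is an illustration only, not a reconstruction of any ancient practice: the macron stands in for the apex, the tiny word list is chosen for the example, and the function names strip_length and underspecify are hypothetical. Length is kept in writing only where removing it would make two different words look identical.

```python
import unicodedata
from collections import defaultdict

MACRON = "\u0304"  # combining macron, used here as a stand-in for the apex


def strip_length(word):
    """Return the word without any length marks."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if ch != MACRON)
    return unicodedata.normalize("NFC", bare)


def underspecify(lexicon):
    """Keep length marks only for words that belong to a minimal pair."""
    groups = defaultdict(set)
    for word in lexicon:
        groups[strip_length(word)].add(unicodedata.normalize("NFC", word))
    result = {}
    for word in lexicon:
        marked = unicodedata.normalize("NFC", word)
        bare = strip_length(word)
        # More than one word shares the bare spelling: the mark is distinctive.
        result[marked] = marked if len(groups[bare]) > 1 else bare
    return result


# Illustrative word list (an assumption); malus / mālus is Quintilian's pair.
lexicon = ["malus", "mālus", "causa", "amīcus"]
spelling = underspecify(lexicon)
```

With this word list, mālus keeps its mark because it would otherwise collide with malus, while amīcus is written amicus, since no other word in the list competes with it. A fuller treatment would mark only the distinguishing vowel rather than every long vowel in the word.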

Neither Lucilius nor Quintilian is primarily concerned with orthography. Treatises dedicated to the subject first appear in the early 2nd century AD, with Velius Longus and Terentius Scaurus. The relatively late appearance of orthographical treatises is perhaps not surprising, since it coincides with a period in which major sound changes happened, while orthography remained relatively stable.

Terentius Scaurus is by and large in favour of phonemic spellings and advocates equus, “horse”, rather than equos, and causa, “cause”, rather than caussa; however, he opts for paullus, “little”, rather than paulus, which in his time must have been the normal pronunciation. Morpheme consistency is not a major consideration for him: he recommends pleps, “common people”, rather than plebs (genitive plebis) and pelligere, “to read through”, rather than perligere (from per and legere); that said, he prefers obscurus, “obscure”, over opscurus, claiming absurdly that one hears -b- here. Differentiation of (near-)homophones is a consideration: for the relative pronoun, we should distinguish between nominative (and, as he absurdly adds, vocative!) qui and dative cui in spelling.

An early edition of several Roman writers on technical aspects of the Latin language, Basel, 1527.

Although Terentius Scaurus is as aware of the problems of purely phonemic scripts as a modern linguist should be, he does not couch his language in such terminology. Instead, he speaks of errors and their correction. Errors arise for four reasons: adiectio, when we wrongly add a letter, as in (historically accurate) caussa instead of causa; detractio, when we wrongly leave out a letter, as in aedus, “goat”, instead of haedus (h- had always been an unstable sound in Latin); immutatio, when we confuse two (near-)homophones such as ad, “to”, and at, “but”; and annexio, when we wrongly divide a word: it should be ne-scire, “to be ignorant”, in accordance with morpheme boundaries, and not nes-cire, in accordance with pronunciation.
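Scaurus’ four types of error lend themselves to a small sketch. The function below compares a written form with the form recommended in the examples above and names the category. The function name classify_error and the use of a hyphen to show word division are assumptions made for the illustration, and the letter-counting tests are a deliberate simplification of what Scaurus describes.

```python
def classify_error(written, recommended):
    """Name the type of error, or return None if the spelling matches."""
    if written == recommended:
        return None
    w = written.replace("-", "")
    r = recommended.replace("-", "")
    if w == r:
        # Same letters, different division of the word.
        return "annexio"
    if len(w) == len(r) + 1 and any(w[:i] + w[i + 1:] == r for i in range(len(w))):
        # One letter too many.
        return "adiectio"
    if len(w) + 1 == len(r) and any(r[:i] + r[i + 1:] == w for i in range(len(r))):
        # One letter missing.
        return "detractio"
    if len(w) == len(r) and sum(a != b for a, b in zip(w, r)) == 1:
        # One letter exchanged for another.
        return "immutatio"
    return "unclassified"


# The pairs are the examples given in the text: (written, recommended).
examples = [
    ("caussa", "causa"),
    ("aedus", "haedus"),
    ("ad", "at"),
    ("nes-cire", "ne-scire"),
]
labels = [classify_error(written, recommended) for written, recommended in examples]
```

Applied to the four pairs, the sketch yields adiectio, detractio, immutatio and annexio in that order. Note that it needs to be told which form is recommended; deciding that is exactly where historia, originatio and proportio come in.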

Apart from phonology, which is taken for granted, there are three, sometimes contradictory, ways of correcting poor spellings: historia, referring to older spellings (such as haedus and dialectal faedus, where the regular correspondence between h- and f- indicates that we need to write the silent h-); originatio, referring to the etymology of a word; and proportio or analogy (domini, “of the master”, corresponds to dominus in the same way that equi corresponds to a form equus, not equos). Terentius Scaurus does not seem too bothered that sometimes these principles contradict each other; for equus, I could argue equally well that historia demands a spelling equos.

Of course Terentius Scaurus was a prescriptive writer. Yet what is unclear is whether his prescriptions were based on current educated usage or on a priori speculations about what spelling is all about; I tend towards the first alternative, but if the second is correct, it would be interesting to see whether he followed his own rules. Alas, this is no longer possible because the manuscript tradition clearly normalized and regularized even where this would have violated Scaurus’ ideas. We can see that same normalization process in Varro’s De lingua Latina: in 9.80 he explicitly tells us that the nominative plural is written -ei, yet the scribe has modernized his examples and given them the ending -i. Let us move on to the spelling of Latin in our day.

The Lyon Tablet, a bronze inscription of a speech by Emperor Claudius, c. AD 48 (Gallo-Roman Museum, Lyon, France).

Editorial conventions and their pros and cons

By now it should be clear that, with the exception of inscriptional texts, we cannot recover how exactly a Latin author would have spelled each and every word in every instance. We know the parameters within which the spelling of a given word would have varied, and we can say which spellings would be possible or even likely, and which ones would be impossible; but we cannot go beyond that. For this reason, restoring ‘the original spelling’ of an author is an exercise in futility.

Most editions, then, opt for an orthography that is based on the orthography in the big dictionaries, which in turn is based on a curious mix of Classical, late-antique and medieval spellings. But that still allows for a fair degree of variability, and editors can opt for one of two extremes or anything in between.

On one end of the spectrum, there are editions that opt for complete uniformity within the text corpus. I did this for my Loeb Plautus. My rationale was that even the earliest manuscript dates to six centuries after Plautus, and its spellings are largely Classical. Orthographic variation was relatively great in Plautus’ time, so there is little point in pretending that we know how Plautus would have spelled his text. On the other hand, uniform orthography allows us to read with ease, to focus on language, style and content. Moreover, it makes texts easily searchable electronically, a consideration which is becoming more and more important.

A sample page from my Loeb edition of Plautus: the opening of the Pseudolus.

On the other end of the spectrum are the Sarsina editions of Plautus. These try to be as faithful to the manuscripts as possible and opt for the oldest spelling conventions found in any given line. That does not always make for easy reading; one and the same person is Argyrippus in one line, with a Classical spelling, Argurippus in another, with an old-fashioned spelling, and Argirippus in yet another, with a late or medieval spelling. I have opted for a similar method for my Oxford Classical Text of Varro, but excluding purely medieval spellings. The reason is that for most of the De lingua Latina, there is only one manuscript that matters because all extant manuscripts are copied from it. My procedure allows me to keep the apparatus minimal, and unusual spellings can often point to specific textual corruptions. But as a downside, I do not achieve a text that is easily machine-searchable.
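The searchability problem can be made concrete with a short sketch. It assumes a hand-made table of variant spellings (here only the three spellings of Argyrippus mentioned above) and a few invented sample lines that are not quotations from Plautus; the names VARIANTS, search_key and find_lines are hypothetical. A plain string search finds only the Classical spelling, whereas a search over normalized keys finds all three.

```python
import re

# Hypothetical variant table: each attested spelling maps to one search key.
VARIANTS = {
    "argurippus": "argyrippus",
    "argirippus": "argyrippus",
}


def search_key(token):
    """Lower-case a token and map known variants to a common key."""
    lowered = token.lower()
    return VARIANTS.get(lowered, lowered)


def find_lines(lines, query):
    """Return the 1-based numbers of lines containing the query or a variant."""
    key = search_key(query)
    hits = []
    for number, line in enumerate(lines, start=1):
        tokens = re.findall(r"[^\W\d_]+", line)  # runs of letters only
        if any(search_key(token) == key for token in tokens):
            hits.append(number)
    return hits


# Invented sample lines for illustration only.
lines = [
    "Argyrippus adest",
    "ubi est Argurippus",
    "Argirippus abiit",
    "servus adest",
]

naive_hits = [n for n, line in enumerate(lines, start=1) if "Argyrippus" in line]
normalized_hits = find_lines(lines, "Argyrippus")
```

Here the naive search returns only line 1, while the normalized search returns lines 1 to 3. A table of this kind cannot be applied blindly, however: a form like servos may be an old-fashioned nominative singular or a perfectly regular accusative plural, so mapping it to servus requires grammatical analysis and not just a lookup.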

I do not believe that one type of orthographical convention is inherently superior to another. Whatever we go for, we have advantages and disadvantages, and the editor needs to weigh these before making a decision. However, I do believe that editors should spell out and justify their decisions, which, unfortunately, is still rare.

Karl Lachmann (1793–1851).

Excursus: Lachmann’s Normalisiertes Mittelhochdeutsch

Karl Lachmann was foundational for the development of textual criticism and the stemmatic method. Most Classicists know him as a brilliant editor of Lucretius, but are unaware that he also edited many Middle High German texts, from the Nibelungenlied to Walther von der Vogelweide. Lachmann’s orthographic practices are the same in Latin and Middle High German: he goes for standardized, ‘normalized’ spelling. For Latin, of course, this is sensible because the majority of manuscripts are so much later than the original text, and generations of copyists modernized its outward appearance. As we have seen, this means that an author’s spelling conventions are largely no longer recoverable.

But for Middle High German, this is problematic. Lachmann’s ‘normalized’ Middle High German makes life easier for the novice and also helps the scholar of literature who may not care too much about the linguistic side of a text. On the other hand, this standardized form of Middle High German, based on Swabian, Alemannic and Franconian written in the Staufer period (a dynasty that was powerful between 1079 and 1254), is an artifice that masks dialectal and diachronic variation in manuscripts, manuscripts which did not undergo generations upon generations of copying. Middle High German was not yet a standardized language, and modern readers ought to be able to see that when reading and analysing texts whose dialectal features may be of interest to them. Lachmann’s stemmatic method is useful for textual criticism in any language, but whether or not the texts in question should then be standardized orthographically depends on other factors, especially the time that has elapsed between the original author and the actual manuscript that survives.

Middle High German dialect boundaries.

Final thoughts

I am not under the illusion that Latin orthography is the most central part of Latin linguistics, but I hope that this piece has shown that, despite its neglect in most grammars, it is a topic that is interesting and perhaps even entertaining. We have moved from general considerations to the history of the alphabet, and from inscriptions and ancient authorities on the subject to modern practices. But much work remains to be done, and I would be thrilled if my essay were to inspire some readers to tackle this work.

Wolfgang de Melo is Professor of Classical Philology at Oxford. He has published on early Latin, especially Plautus and Roman comedy, and on Varro. He teaches linguistics and comparative philology and has a special interest in linguistic typology. He has previously written for Antigone about gender and language.

Further Reading

For writing systems in general, I recommend Florian Coulmas, Writing Systems: An Introduction to Their Linguistic Analysis (Cambridge UP, 2002), as well as Alan Cruttenden, Writing Systems and Phonetics (Routledge, London, 2021).

As an introduction to Latin pronunciation and spelling, I enjoyed W. Sidney Allen’s Vox Latina: The Pronunciation of Classical Latin (Cambridge UP, 1965, 2nd ed. 1978). For early Latin, I have learned a great deal from Rudolf Wachter, Altlateinische Inschriften: sprachliche und epigraphische Untersuchungen zu den Dokumenten bis etwa 150 v.Chr. (Peter Lang, Bern / Frankfurt am Main, 1987). The late Jim Adams wrote a brilliant piece on Classical orthography, ‘Was classical (late republican) Latin a “standard language”?’; it will appear in the Transactions of the Philological Society. On Latin ‘prodelision’, we now have the final word in Giuseppe Pezzini’s Terence and the Verb ‘To Be’ in Latin (Oxford UP, 2015). Terentius Scaurus has been masterfully edited and translated by Federico Biddau, Q. Terentii Scauri De Orthographia: introduzione, testo critico, traduzione e commento (Weidmann, Hildesheim, 2008). For Oscan, the best resource is Nicholas Zair’s Oscan in the Greek Alphabet (Cambridge UP, 2016).

Script development is one topic in Vijay D’Souza’s excellent doctoral thesis, Aspects of Hrusso Aka Phonology and Morphology (Oxford, 2021). For Korean, both the native script (Hangul) and the Chinese characters (Hanja), an outstanding resource is Yoolim Kim’s doctoral thesis, The Mental Representations of Hanja: Exploring Cross-Script Semantic Cohorts in Korean (Oxford, 2019).


1 The ‘Scipionic Circle’ was a highly influential group of scholars and politicians under the patronage of Scipio Aemilianus (185–129 BC), which was most active in the middle of the 2nd century BC.
2 The symbol ʔ stands for a glottal stop, a consonant produced by complete closure of the vocal folds. It is found in some varieties of British English, where bottle is pronounced with a “dropped t”, as bo’le; this “dropped t” is actually a glottal stop.
3 A certain Spurius Carvilius Ruga is credited with the invention of the letter G, and although this attribution is probably apocryphal, the time frame is about right, as he lived in the last quarter of the 3rd cent. BC.
4 In this reconstruction, I follow Roman practice in using u not just for vocalic u but also for consonantal v.