Qerenoka

Qers'noki, commonly known as Sangheili, Pan Sangheili, Common Sangheili or Imperial Sangheili, is the common name for the Covenant's lingua franca and the most widely spoken of the myriad languages within the Holy Ecumene. Qers'noki is named after the original language chosen as the Covenant's common language at the time of the Writ of Union, but it has since evolved considerably and spawned numerous dialects and offshoot languages.

History
Modern Pan Sangheili's relationship to the original, Writ of Union-era Qers'noki is roughly similar to that of modern Italian to Latin, and perhaps even more distant. Because Qers'noki is spoken by a vast collective, it is by nature highly eclectic. It is more aptly described as a language family than as a single language, encompassing hundreds if not thousands of dialects and subvariants, many of which are barely mutually intelligible, if at all.

The version of Qers'noki documented here is "Pan Sangheili", the standardized version of the language spoken in High Charity as of the early to mid-26th century. Pan Sangheili is spoken natively by much of the Covenant population, but also serves as an auxiliary language for those who speak either divergent dialects or entirely unrelated languages. Even so, Qers'noki should not be understood as being universal to the entire Covenant population. Pan Sangheili is institutionally regulated by a specific body within High Charity (likely part of the Ministry of Edification or a multi-ministerial effort), though this has not always been so. This regulation has slowed the natural evolution of the language, which has stayed relatively unchanged for centuries. There are also institutions on Sanghelios and other worlds claiming to represent the ultimate authority on the language. The language spoken by the high nobles of Sanghelios is known separately as High Sangheili or Aristocratic Sangheili, and is considerably more elaborate than Pan Sangheili.

Many sounds once present in ancient Qers'noki and other Sangheili languages have disappeared from Pan Sangheili, leaving only sounds that most of the Covenant's member species, chiefly the San'Shyuum, can produce in some way. Overall, Writ of Union-era Qers'noki featured more guttural sounds than modern Pan Sangheili, which does not use the Sangheili's entire vocal range. As a rough equivalent to human click sounds, some native Sangheili languages use a range of whistles and roars that most other species cannot reproduce. Despite their unusual facial anatomy, the Sangheili can sound like Keith David or Robert Davi. How their vocal tracts manage this is hard to explain, so all we can really say is "very well" (articulation likely happens in a separate cavity within the throat, probably also involving the nasal cavities, etc.).

Characteristics
Since there's no need to redo work that's already been done, Daybreak adopts a spin-off of the Sangheili conlang created for the Halo TV show by David J. Peterson and Carl Buck, documenting and expanding upon it. Regardless of the show's quality, the work on the language is impressive and professional, and it will serve our purposes well; indeed, it may be the only good thing to come out of the show. We call it Qers'noki rather than Sangheili to distinguish it from the name of the species, but both names are used in-universe.

''The following is mostly copied from David Peterson's twitter with some reformatting, minor changes, and additions to suit Daybreak's existing examples of the language. All credit for the language goes to David Peterson and Carl Buck.''

The language is a lightly inflectional head-final language with distinctive vowel length and ejectives. It uses Standard American Romanization.

Phonology
Stress is regularly penultimate, i.e. on the second-to-last syllable.

Ejective consonants are written with a following apostrophe, and the r is the tap [ɾ]. The language has long vowels, represented by a doubled vowel, and occasionally has geminates, also written doubly. Likely the most challenging aspect of the phonology is the set of consonants with a velar release. These are written as if they were consonant clusters, but they occur at every point of articulation: pkh [pˣ], tkh [tˣ], kkh [kˣ], and qkh [qˣ], and even the fricative (or fricative-ending) consonants skh [sˣ], shkh [ʃˣ], and chkh [tʃˣ].
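The spelling conventions above can be sketched as a small segmenter that splits a romanized word into sound units. This is only an illustrative sketch: the function names are ours, and the plain digraphs (sh, ch, zh, gh, kh) are an assumption inferred from spellings elsewhere on this page rather than a documented inventory.

```python
# Velar-release consonants and their IPA values, as listed in the text.
VELAR_RELEASE = {
    "shkh": "ʃˣ", "chkh": "tʃˣ",
    "pkh": "pˣ", "tkh": "tˣ", "kkh": "kˣ", "qkh": "qˣ", "skh": "sˣ",
}
# ASSUMPTION: plain digraphs inferred from word spellings on this page.
DIGRAPHS = ("sh", "ch", "zh", "gh", "kh")
VOWELS = set("aeiou")

# Try longer spellings first so "tkh" wins over "kh", "chkh" over "ch", etc.
_UNITS = sorted(list(VELAR_RELEASE) + list(DIGRAPHS), key=len, reverse=True)


def segment(word):
    """Split a romanized word into sound units (hypothetical helper)."""
    w = word.lower().replace("\u2019", "'")  # normalize curly apostrophes
    out, i = [], 0
    while i < len(w):
        seg = next((u for u in _UNITS if w.startswith(u, i)), None)
        if seg is None:
            seg = w[i]
            # A doubled letter is a long vowel or a geminate consonant.
            if seg.isalpha() and w.startswith(seg * 2, i):
                seg = seg * 2
        i += len(seg)
        # An apostrophe right after a consonant marks it as ejective.
        if i < len(w) and w[i] == "'" and seg != "'" and seg[0] not in VOWELS:
            seg += "'"
            i += 1
        out.append(seg)
    return out


def classify(seg):
    """Label a unit using only the categories the text describes."""
    if seg in VELAR_RELEASE:
        return "velar release [" + VELAR_RELEASE[seg] + "]"
    if len(seg) > 1 and seg.endswith("'"):
        return "ejective"
    if len(seg) == 2 and seg[0] == seg[1]:
        return "long vowel" if seg[0] in VOWELS else "geminate"
    if seg == "r":
        return "tap [ɾ]"
    return "plain"


for unit in segment("tkhop'o"):
    print(unit, "->", classify(unit))
```

Matching the longest spelling first is what keeps kkh from being misread as a geminate k followed by h.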

Phonemes / Romanization
Exotic consonant clusters are sometimes simplified in non-academic romanization. Some double vowels, e.g. "oo", may be dialectally specific. Diphthongs also occur.

Morphology
Typologically, Qers'noki is a strongly head-final language. It has some case-like postpositions, with no agreement. Adjectives and possessors precede the nouns they modify, as do relative clauses.

Case particles
Cases here are little tags that let you know what role a noun plays in the sentence. Ergative and absolutive are grammatical; vocative is for direct address; the rest are locative.

The ergative “o” is placed directly after a noun that effects the action of the verb. For example, in K’uucho o domo ruuk’inatan, “The warrior attacks the human”, k'uucho “warrior” is followed by “o” because it’s the one that causes the attacking to happen. Domo gets no tag.

Possession
While there is an archaic genitive case, it is rarely used in normal contexts. Instead, possession is expressed with one of the other cases, depending on the nature of the possessive relationship.
 * For example, K’uucho oni zhuro would be “the warrior’s weapon”. Presumably this is one the warrior owns.
 * K’uucho ni zhuro would also be “the warrior’s weapon”, but the implication would be it was one they just picked up, or was an improvisational weapon—one they happened to have.

Now, let’s say the warrior has their father’s weapon. You’d probably say something like K’uucho oni nejo ga zhuro. That is, “the weapon FROM the father TO the warrior”.

You can also make fun distinctions like K’uucho me ik’o “the warrior’s eye(s)” (presumably still in there), and k’uucho ba ik’o “the warrior’s eye(s)” (which, regrettably, have been removed for some reason).

The genitive case, which has largely disappeared from the mainline language, is mostly used in formal contexts to signify relationships indicated in English with the preposition "of", but it is rarely used in reference to individuals possessing things and is most common in the names of institutions or places.
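As a rough illustration of how the case particles slot into a phrase, here is a minimal sketch that assembles the example phrases from this section. The function names and the gloss wording are ours and purely illustrative; only the particles and word order come from the examples above.

```python
# Possessive particles and their implications, paraphrased from the examples.
# The gloss wording is an informal summary, not official terminology.
POSSESSIVE_PARTICLES = {
    "oni": "owned by the possessor",
    "ni": "happens to be held by the possessor",
    "me": "part of the possessor, still attached",
    "ba": "part of the possessor, since removed",
}


def possessive(possessor, particle, possessed):
    """Build 'possessor PARTICLE possessed' (possessor precedes the noun)."""
    if particle not in POSSESSIVE_PARTICLES:
        raise ValueError("unknown possessive particle: " + particle)
    return f"{possessor} {particle} {possessed}"


def transitive(agent, patient, verb):
    """Agent takes the ergative tag 'o'; the patient is untagged; verb is last."""
    return f"{agent} o {patient} {verb}"


print(transitive("K'uucho", "domo", "ruuk'inatan"))  # The warrior attacks the human
print(possessive("K'uucho", "oni", "zhuro"))         # the warrior's (own) weapon
print(possessive("K'uucho", "ba", "ik'o"))           # the warrior's (removed) eye(s)
```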

Pronouns
One notable trait of the pronoun system is that the third-person pronouns make a distinction between animate and inanimate subjects. The inclusive/exclusive distinction in the first person plural pronouns (i.e. “we”) signifies a difference between “you and us” vs. “us and not you” (“riin” vs. “jaari”).

Verbs
There is a distinction between dynamic and stative verbs. The understanding of the tenses will change depending on the type of verb. This should be familiar to English speakers, as we do the same thing. (Cf. “I like pizza”~“I’m liking pizza” vs. “I eat pizza”~“I’m eating pizza”.)

A dynamic verb is one where there has been some actual change in the world, where some action has taken place (e.g. “call”, “crush”, “send”). A stative verb is one that reflects more of an internal state (e.g. “understand”, “be useful”, “forget”).

Sangheili has 8 tenses, but the meanings of those tenses vary depending on whether the verb is stative or dynamic. The forms are relatively simple, except for the reduplicative, which enjoys a lot of use. For dynamic verbs, it’s the imperfect tense ("I was x'ing"); for stative verbs, the emphatic. Here are some examples:
 * ch'in ~ ch’injin “stab”
 * naya ~ nenaya “fertilize egg”
 * opkho ~ pkhaapkho “bend”
 * pkhungo ~ pkhubungo “sleep”
 * qkhoso ~ qkhoghoso “walk”
 * satkha ~ sasatkha “be sure”
 * tkhop’o ~ tkhaadop’o “name”
 * zaya ~ zaazaya “expand”
 * ik'o ~ ch'anik'o “see”

One thing missing from these is the question forms. When asking a yes/no question, the verb takes special forms that are used with a reduced set of four tenses. You’ll hear them when questions are asked.

For example, in the last episode, Makee says Jan o tkha q’unqijaga, k’e daaghajahe? “Are you worried I’ll forget?” Before the comma is the “I’ll forget” part. K’e is “you”. Daagha is “worry”, and the -jahe suffix is the one you’ll hear with questions. (= "That I'll forget, you worry?")

Phonaesthetics
The language is quite vowel-heavy overall, and doubled vowels are frequent. Vowels are also the most common word endings due to the postposition system. The use of some consonants in word endings may be restricted, and geminates or consonant clusters (which are otherwise used) never occur at the end of a word. Because of the heavy use of postpositions, word endings also give clues as to word class. See below for recurring trends.
 * -a, -e and -o seem to be the most common verb endings.
 * Acceptable consonants in word endings (in rough order of frequency): n, m, l, s, r, k (uncommon), t
 * Dialectal: -th (variant of -s or -t)
 * Apostrophes written before a word (e.g. Thel 'Vadam) appear to indicate nonstandard stress on the first syllable, though this is unconfirmed. Clan names are rendered this way when speaking of, or addressing, a specific individual. In other contexts, e.g. when speaking of the house of Vadam, the stress marker is not used.
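The word-ending trends above lend themselves to a quick sanity check when coining new names. The following is a minimal sketch; the function name is ours, and it deliberately ignores the dialectal -th ending and digraph spellings, so treat a failure as a prompt to look closer rather than a verdict.

```python
# Consonants the list above allows in word-final position.
FINAL_CONSONANTS = set("nmlsrkt")
VOWELS = set("aeiou")


def valid_ending(word):
    """Check a romanized word against the standard word-final trends.

    Hypothetical helper: a word passes if it ends in a vowel, or in a single
    allowed consonant preceded by a vowel (no final geminates or clusters).
    """
    w = word.lower().replace("\u2019", "'")
    if not w:
        return False
    last = w[-1]
    if last in VOWELS:
        return True
    if last not in FINAL_CONSONANTS:
        return False
    # A consonant before the final consonant would make a cluster or geminate.
    return len(w) == 1 or w[-2] in VOWELS


for name in ["zhuro", "warut", "Malurok", "Saepon'kal", "'sKelln"]:
    print(name, valid_ending(name))
```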

Lexicon
The other big table up there has less to do with grammar than the lexicon. The instrumental prefixes are used to derive new verbs from verb bases. It’s a little like how we have verbs like “deduce”, “produce”, “induce”, “adduce”, etc.

A basic verb would be duje “to molt”. From that, we derived the following:
 * moduje: lose track of
 * juunduje: make look good
 * gaiduje: sully

Another example using ghaina “to hear”:
 * banghaina: sense
 * t’ighaina: understand

This is khawa “to say”:
 * gaikhawa: guess
 * khekhawa: respond
 * juukhawa: claim
 * t’ikhawa: chat

If you compare the prefixes and their original meanings plus approximate uses, then combine them with the original verbs, you can get a sense of how we built these words, and came up with meanings for them.
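For anyone keeping a working lexicon, the derivations listed above can be stored as a base-to-prefix table and looked up mechanically. This is just one possible layout, with names of our own choosing; it records only the forms given above and does not try to predict meanings for unattested combinations.

```python
# Derived verbs from this section: base -> {prefix: gloss of derived verb}.
DERIVED = {
    "duje": {"mo": "lose track of", "juun": "make look good", "gai": "sully"},
    "ghaina": {"ban": "sense", "t'i": "understand"},
    "khawa": {"gai": "guess", "khe": "respond", "juu": "claim", "t'i": "chat"},
}
BASE_GLOSS = {"duje": "molt", "ghaina": "hear", "khawa": "say"}


def analyze(verb):
    """Split a listed derived verb into (prefix, base, gloss), else None."""
    v = verb.lower().replace("\u2019", "'")  # normalize curly apostrophes
    for base, prefixes in DERIVED.items():
        if v.endswith(base):
            prefix = v[: len(v) - len(base)]
            if prefix in prefixes:
                return prefix, base, prefixes[prefix]
    return None


for word in ["moduje", "t'ighaina", "juukhawa"]:
    prefix, base, gloss = analyze(word)
    print(f"{word} = {prefix}- + {base} ({BASE_GLOSS[base]}) -> {gloss}")
```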

The demonstratives are more or less explicable (this, that, yonder, at an unknown place, nowhere).

Word examples (official)

 * kheluuga = artifact/relic
 * K’uucho = warrior
 * nejo = water
 * ruq'a = fire
 * ruq'otajaga!    = you will burn (jaga = prospective future tense)
 * zhuro = weapon
 * juukhojo = Prophets (juu-khojo, note relation to the verb "claim", derived from the word "to say" combined with the "with certainty" prefix)
 * hira = (to) praise
 * hirajo = "praised one", formed with declarative suffix -jo
 * warut = go (unmodified form)
 * warut'o = go (ergative)
 * waruut = journey
 * Ch’awaruut = Great Journey (note relation to "going"; "Ch" indicates it's a bigger/more important such thing)
 * Ch'anggagomo = Halo/Sacred Ring (note "Ch" intensifier)
 * K'iisho = Luminary?
 * Q’iitu = Mercy
 * Tkhuyujo = The Ancients / Forerunners
 * Shandi = The Covenant
 * hodu = wait (imperative/command/direct address)
 * Ghashank'o = Demon
 * pkhada = stop, imperative

Word examples (Daybreak)

 * hringun = heaven, firmament, sky associated with the numinous
 * shabui = sons
 * zhuuni = strand
 * harusi = tradition, way, school of thought

Names
These can be useful for figuring out the sound and look of the language, what kinds of syllables and sounds occur commonly, etc. If names we've come up with clearly deviate from the phonaesthetic trends, we can see whether some things need to be readjusted. However, some of the names that diverge from the phonetic rules of Qers'noki are simply not native to Standard Sangheili; having more than one name for a place in different languages just makes the world richer and more layered. Even so, most Covenant worlds have a "Sangheilized" name, namely those featured in official documentation, and by and large those have to abide by the general rules of Standard Qers'noki, even if they are derived from the endonym (and they aren't always).

Some names may also be rendered in divergent or simplified forms of romanization to be more legible for humans.


 * Saepon'kal = Joyous Exultation
 * Malurok = Decided Heart (-k ending may be dialectal; name may've been subject to change over time)

Romanization
The language and romanization system we use is an approximation of the original language made for human use, as we cannot perfectly replicate all the original sounds. This also explains any possible irregularities. However, the romanization system is meant to be as straightforward, literal and uncomplicated as possible for clarity's sake. Sangheili romanization more or less uses the Latin alphabet phonetically, though there are some cases where a word's or name's pronunciation has previously been ambiguous and has caused an erroneous spelling to be coined as standard.

Special diacritics may sometimes be used to indicate that a letter is meant to be pronounced in a way counterintuitive to an English speaker, e.g. if a vowel is pronounced separately from its surroundings. For instance, the "ï" in "Raah chïwei" is not pronounced like the "i" in "chide" but like the "i" in "mist". However, these are commonly omitted.

343i has taken a liking to using Klingon-style capital letters mixed amid words to indicate... something (e.g. QezoY'asabu, 'sKelln), but other than making the language seem more stereotypically "alien" (probably because we associate it with Klingon), it doesn't seem to contribute much and makes the romanization look untidy. It doesn't help that we're already trying to distance our depiction of the Sangheili from the discount Klingons they're often portrayed as. Peterson's and Buck's version of the language also doesn't use it, so we might as well ditch it.

Resources

 * David J. Peterson's documentation of dialogue in the TV show

Existing examples
These can be useful for figuring out the sound and look of the language, what kinds of syllables and sounds occur commonly, etc. If names we've come up with clearly deviate from the phonaesthetic trends, we can see whether some things need to be readjusted or retconned. However, if some of these seem to diverge from a clear trend, we can also handwave the deviations as not being from Standard Sangheili; having more than one name for a place in different languages just makes the world richer and more layered. Even so, most Covenant worlds have a "Sangheilized" name, namely those featured in official documentation, and by and large those have to abide by the general rules of Standard Qers'noki, even if they are derived from the endonym (and they aren't always).

The Covenant tendency to use "Adjective Noun" in names is integrated into the structure of the language and doesn't work as it does in English (e.g. "Joyous Exultation" vs. "Saepon'kal"; the adjective is usually compounded into the word and comes after the noun; qualities can easily be affixed to nouns this way). Sometimes these kinds of names aren't even formed from separate words in the original language; e.g. there is a distinct concept for "glorious proclamation" in Sangheili (Kaa'shash), and the English translation has to resort to two words in order to capture the meaning. It has also become a solidified translation convention for Covenant phrases, so most translators roll with it. The original Covenant names (for ships, worlds, etc.) may also contain more semantic content than is conveyed in the English equivalent, since ONI and UNSC translators especially seek to keep their translations as curt as they can; yet we still end up with names like "Long Night of Solace", which is a shorter phrase in the original language and has very specific cultural connotations.

Lexicon
When inventing the vocab, we have to consider the Sangheili/Covenant semantic space: where they are coming from, what they do and don't regard as important, and which meanings they map to things. Languages don't map 1:1 in words and phrases, especially across alien cultural contexts. Words have local context and meaning that don't carry over in translation, and the very ways words are used can be radically different. (So a "literal" translation can actually be worse than a non-literal one that captures the meaning of the phrase better.) The semantic content created in one's head depends entirely on cultural and environmental factors. So, what is that context? Some features that could have an impact on the language: an all-encompassing religiosity, caste-based society, warrior culture, millennia of spacefaring, importance of community and family bonds, etc.

There is a notable difference between inventing a new vocabulary and re-encoding the English vocab. Things that are different words in English don't necessarily need to be different in another language (e.g. "sailor" and "mariner"). These are different because they came from different sources, but an alien language won't have that history. Equally, the opposite can be true: they may distinguish concepts we don't even think of. For example, "space" and "outer space" could be words of totally different origin (the latter maybe translating to "void" or something), or there could be separate words for interstellar and in-system space, and so on. Slipspace doesn't need to be a "space" at all but may derive from totally different origins. In some cases distinctions may exist but be used only when the distinction needs to be made.

Examples
 * "Sangheili" as a name is somewhat equivalent to "Earthling", in that it is derived from the planet and came after space travel? Different word for "people" or species designation that got gradually supplanted? "Sanghelios" means (or originally meant) "earth-sphere" or "earth air sphere" (could also have evolved into a word for world or planet)? (note that the Lights of Sanghelios are called Helios for short, so this also has to make sense unless we assume a very lax translation convention)

Number system
Sangheili uses a base-12 numbering system, which might be related to their 12 digits.
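To get a feel for how familiar quantities look in base 12, here is a small conversion sketch. The digit symbols are an assumption: we borrow the common human convention of A for ten and B for eleven, since no Sangheili numerals are defined here.

```python
# ASSUMPTION: human-style digit symbols, with A = ten and B = eleven.
DIGITS = "0123456789AB"


def to_base12(n):
    """Convert an integer to its base-12 string representation."""
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, remainder = divmod(n, 12)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))


def from_base12(text):
    """Convert a base-12 string back to an integer (built-in int handles it)."""
    return int(text, 12)


for value in (10, 12, 23, 144):
    print(value, "->", to_base12(value))
```

Round numbers shift accordingly: a dozen is written 10 and a gross (144) is written 100, so "round" figures in Covenant sources would plausibly cluster around multiples of twelve.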

An easy way to express negative numbers, like an affix/prefix? Could be related to way to express a negative of something, or negation, in general.

Writing system(s)
Main writing system is an abjad or abugida? (no characters for vowels, but they are indicated through diacritics in the consonant characters)?

Consider ease of writing in a logographic script, i.e. morphemes and roots represented by logograms.

-->