Languages with interesting phonotactics


Post by Omzinesý » 29 Jul 2019 20:32

Which languages' phonotactics do you consider interesting or exceptional? Or which phonotactic features in some language do you consider interesting or exceptional?

I think the phonotactics of most conlangs are very similar. So give us examples of different phonotactics in natlangs, so we can pick up new ideas.
This is not supposed to be a very thorough thread. We can Google details.


First example
1) The phonotactics of Swahili (and of many other Bantu languages) are interesting in that syllables have no coda consonants: what looks like a coda nasal is actually a syllabic consonant, as its behavior under stress assignment shows.

Post by cedh » 29 Jul 2019 21:33

One of the most interesting languages worldwide with regard to phonotactics IMO is Georgian. It has very complex consonant clusters, especially in syllable onsets (and especially word-initially), as in /prt͡skvna/ ‘to peel’, /mt͡s’vrtneli/ ‘trainer’, or /brt’χ’eli/ ‘flat’. It also allows quite complex coda clusters, as in /msxverp’ls/ ‘victim (DAT)’. But despite this, the majority of syllables are actually open, many of them even of the form CV, so the overall feel of the language is surprisingly "light", cf. the beginning of the Declaration of Human Rights: q’vela adamiani ibadeba tavisupali da tanasts’ori tavisi ghirsebita da uplebebit. mat minich’ebuli akvt goneba da sindisi da ertmanetis mimart unda iktseodnen dzmobis sulisk’vetebit.

What is more, the complex consonant clusters have not just one but several really interesting properties. For example, they often are or contain so-called "harmonic clusters" consisting of two obstruents, the first of them labial or coronal, and the second velar or uvular, which behave as a single segment in various situations. Also, many instances of /r/ in onset clusters are apparently "optional", whatever that means in practice. And /v/ can often be analysed as consonant labialization instead. Taking all this into account, the seemingly "crazy" consonant clusters of Georgian are in fact highly structured.

A dissertation about this can be found here:
https://www.lotpublications.nl/Document ... lltext.pdf


Post by Ser » 29 Jul 2019 22:20

Standard Arabic with ʔiʕraab (that is, with the classical case and mood endings) has an extremely strict CV(C) syllable structure and goes to great lengths to enforce it and to prevent three-consonant clusters, with many rules for epenthetic vowels. The inserted vowel is usually a short i, but sometimes it's a short a or short u, depending on the particular transfix (root template) or word involved.

Like many Polynesian languages, Standard Arabic distinguishes words that begin with a vowel from words that begin with a glottal stop + vowel. However, because of the strict CV(C) structure, if a word begins with a vowel, the initial vowel is dropped everywhere except at the beginning of an utterance. Most vowel-initial words have two consonants afterwards, which means these words regularly trigger vowel epenthesis in the word before.

For example, استلموا istalamuu 'they received' begins with a vowel. This means that this word is usually just "stalamuu" if it isn't the first word of an utterance, but becomes ʔistalamuu if it is. Compare:

ـهم -hum -3PL.MASC
استلموه ʔistalamuu-hu receive.PAST.3PL.MASC-3SG.MASC 'they received it'
إنهم استلموه ʔinna-hum-u stalamuu-hu EMPH-3PL.MASC-[EPENTH] receive.PAST.3PL.MASC-3SG.MASC 'they did receive it'

In the third line, stalamuu-hu triggers vowel epenthesis in the preceding word, so ʔinna-hum becomes ʔinna-hum-u to prevent a three-consonant cluster (*ʔinna-hum stalamuu-hu).

This phenomenon pervades Standard Arabic throughout.
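
For anyone who wants to play with the rule, here is a minimal Python sketch of the two steps described above (dropping the initial vowel outside utterance-initial position, then adding an epenthetic vowel to the previous word). It assumes plain romanized input with only a, i, u as vowel letters; the function name join_words and the per-word epenthetic table are made up for illustration and are not a full account of Arabic sandhi.

```python
VOWELS = set("aiu")  # assumption: romanization uses only these vowel letters

def join_words(words, epenthetic=None):
    """Toy model: join romanized words, applying initial-vowel drop and epenthesis."""
    epenthetic = epenthetic or {}  # hypothetical table: word -> its epenthetic vowel
    out = []
    for idx, word in enumerate(words):
        if idx > 0 and word[0] in VOWELS:
            # A vowel-initial word loses its first vowel when it is not utterance-initial.
            word = word[1:]
            prev = out[-1]
            starts_with_cc = (
                len(word) > 1 and word[0] not in VOWELS and word[1] not in VOWELS
            )
            # C + CC across the word boundary would be a three-consonant cluster,
            # so the previous word gets an epenthetic vowel (default: short i).
            if prev[-1] not in VOWELS and starts_with_cc:
                out[-1] = prev + epenthetic.get(prev, "i")
        out.append(word)
    # Utterance-initially, a vowel-initial word surfaces with a glottal stop.
    if out and out[0][0] in VOWELS:
        out[0] = "ʔ" + out[0]
    return " ".join(out)

print(join_words(["ʔinnahum", "istalamuuhu"], {"ʔinnahum": "u"}))
print(join_words(["istalamuuhu"]))
```

The epenthetic vowel is looked up per word because, as noted above, it is not always i; -hum takes u in the example.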


HOWEVER, because of its heavy use of ablaut/transfixes, a high number of two-consonant clusters is tolerated, including somewhat odd-looking clusters like [tˤf] in أطفال ʔaTfaalun 'children', [sˤb] in أصبح ʔaSbaHa 'he became', [ðh] in تذهب taðhabu 'she goes', [ʔt] in تأتي taʔtii 'she comes', [sʔ] in نسأل nasʔalu 'we ask', [ħθ] in بحث baHθun 'to search', [qd] in عقد ʕiqdun 'necklace', [xdˤ] in أخضر ʔaxDaru 'green', [ɣf] in يغفر yaġfiru 'he forgives'.

Not to mention the fact it can geminate a glottal stop: رأّس raʔʔasa 'to make [somebody] the leader', derived from رأس raʔsun 'head, leader'.

Post by Ser » 29 Jul 2019 22:21

A couple fun things on word length:


Xhosa has a strict rule that all content words (nouns, adjectives, adverbs, verbs) must be at least two syllables long. In the situations where a verb in the singular imperative could end up with only one syllable, an extra meaningless yi- prefix is added so that the verb conforms to the rule. Ukutya 'to eat' should become just *tya in the singular imperative, but the actual form is yitya.

This blatantly happens only because of the rule, as any direct object prefix makes the meaningless yi- go away: litye 3SG-eat 'eat it!', zitye 3PL-eat 'eat them!'.
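
A rough Python sketch of that minimality rule, in case it helps conlangers model something similar. It assumes one syllable per vowel letter (so syllabic consonants are not handled), and the final -e with an object prefix is simply generalized from the two forms above; the function names are my own.

```python
VOWELS = "aeiou"

def count_syllables(word):
    # Assumption: every vowel letter is its own syllable nucleus.
    return sum(ch in VOWELS for ch in word)

def singular_imperative(stem, object_prefix=None):
    """Toy model of the two-syllable minimum on singular imperatives."""
    if object_prefix:
        # With an object prefix the forms above end in -e (litye, zitye),
        # and the prefix already supplies the second syllable.
        return object_prefix + stem[:-1] + "e"
    if count_syllables(stem) < 2:
        # Too short on its own: add the meaningless yi-.
        return "yi" + stem
    return stem

print(singular_imperative("tya"))        # bare monosyllabic stem
print(singular_imperative("tya", "li"))  # with an object prefix
```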


The Australian Pama-Nyungan language Yidiny has a strong preference for words with an even number of syllables. This means that when affixes would create a word with an odd number of syllables, an alternative form of the affix is used so that the rule is obeyed.

For example, the genitive -ni becomes -n + lengthening of the previous vowel when attaching to a noun that already has an even number of syllables: bunya + -ni = bunyaan. The verbal suffix for relative clauses -nyunda becomes -nyuun when attaching to a verb stem that has an odd number of syllables: majinda- + -nyunda = majindanyuun.
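
The allomorph choice can be sketched in Python as "pick whichever form of the suffix makes the syllable count even". This is only a toy model under stated assumptions: vowels are a, i, u, a doubled vowel counts as one long nucleus, and the helper names (attach, genitive, relative) are invented here.

```python
import re

def count_syllables(word):
    # Assumption: each run of vowel letters (short or long) is one nucleus.
    return len(re.findall(r"[aiu]+", word))

def attach(stem, full, reduced):
    """Return the first candidate (full, then reduced) with an even syllable count."""
    for candidate in (full(stem), reduced(stem)):
        if count_syllables(candidate) % 2 == 0:
            return candidate
    return full(stem)  # fallback if neither form comes out even

def genitive(stem):
    # Full form: -ni. Reduced form: lengthen the final vowel and add -n.
    return attach(stem, lambda s: s + "ni", lambda s: s + s[-1] + "n")

def relative(stem):
    # Full form: -nyunda. Reduced form: -nyuun.
    return attach(stem, lambda s: s + "nyunda", lambda s: s + "nyuun")

print(genitive("bunya"))
print(relative("majinda"))
```

Treating both suffixes with the same "make it even" check is a design choice of the sketch; it happens to cover both examples above without listing the parity condition separately for each suffix.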


Post by Pabappa » 30 Jul 2019 14:34

One thing I notice about Semitic languages is that they don't seem to have a lot of homorganic nasal+stop clusters. /mb mp nt nd/ etc.

Post by Zekoslav » 30 Jul 2019 18:01

Pabappa wrote:
30 Jul 2019 14:34
One thing I notice about Semitic languages is that they don't seem to have a lot of homorganic nasal+stop clusters. /mb mp nt nd/ etc.
Might it be because of the triconsonantal root system? As has already been said, Semitic languages tolerate many unusual consonant clusters in order to preserve the root intact, which may include heterorganic clusters like /np nb mt md/, and homorganic clusters like /mb mp nt nd/ may simply be rare in roots for some reason (there's a cross-linguistic tendency to avoid roots composed of identical or similar consonants... but that usually means roots like those found in baby talk, not m + b, n + d etc...).

Post by Pabappa » 30 Jul 2019 23:28

Zekoslav wrote:
30 Jul 2019 18:01
Pabappa wrote:
30 Jul 2019 14:34
One thing I notice about Semitic languages is that they don't seem to have a lot of homorganic nasal+stop clusters. /mb mp nt nd/ etc.
Might it be because of the triconsonantal root system? As has already been said, Semitic languages tolerate many unusual consonant clusters in order to preserve the root intact, which may include heterorganic clusters like /np nb mt md/, and homorganic clusters like /mb mp nt nd/ may simply be rare in roots for some reason (there's a cross-linguistic tendency to avoid roots composed of identical or similar consonants... but that usually means roots like those found in baby talk, not m + b, n + d etc...).
Indeed... and yet there are many Semitic roots with two of the same consonant in a row, or at least so it seems from looking at place names. But not too many nasal+stop sequences, whether a vowel comes between or not.

Post by Creyeditor » 31 Jul 2019 22:14

I definitely want to add something to this thread. I remember a language that allowed really complex codas but only relatively simple onsets, but I forget which one it was. Also, Austroasiatic phonotactics are always fun: sesquisyllabicity, cool clusters like the ones in Khmer.

Post by Ser » 31 Jul 2019 23:27

Pabappa wrote:
30 Jul 2019 14:34
One thing I notice about Semitic languages is that they don't seem to have a lot of homorganic nasal+stop clusters. /mb mp nt nd/ etc.
A lot of such clusters in Romance and by extension the latinate vocab of English come from attaching the prefix in- 'inside, into' ("impose") or 'not' ("impossible"), and many others involve the Latin -t- suffixes (-tus/a/um passive participles: sanctus > saint, -tus process nouns: conventus > convent, -antem agent participles: servantem > servant, -mentum result nouns: unct-i-mentum > ointment).

I don't know about the rest of Semitic, but standard Arabic only has one such affix, which is not as prominent: the transfix انفعل inCaCaCa that derives non-volitive passives or statives of some sort, like كسر kasara 'break something' > انكسر inkasara 'break (intransitive, subject is the broken thing)' or قسم qasama 'divide something' > انقسم inqasama 'be divided [into parts]' (as in "Egypt is divided into two main parts...").

This means that such clusters mostly appear in words with roots that have a nasal followed by a homorganic stop. I would say there is a fair number of /n/ + dental plosive (or velar/uvular plosive) roots, but they get swamped in the sea of roots that create all kinds of odd clusters instead. Examples would be ينتن yantunu 'it stinks' and يندو yanduu 'he calls, invites'. I'm not sure whether /mb/ occurs in native roots (probably not), but there's borrowings like إمبراطور ʔimbaraaTuur 'emperor', and I think words with /nb/ like أنبياء ʔanbiyaaʔu 'prophets' usually have [mb] anyway.

In vernacular Arabic these clusters are a bit more prominent because the cognate transfix of standard inCaCaCa (usually ʔenCaCaC or nCaCaC) is regularly used to form passives, and many Arabic vernaculars also drop the first vowel of standard transfixes beginning with mu- (which forms participles) or ma- (which forms place nouns). Compare standard عيد مبارك ʕiidun mubaarak(un) 'blessed festival!' (or "happy Eid!") vs. Syrian colloquial ʕiid-ə mbaarak.


Post by Khemehekis » 01 Aug 2019 08:52

Ser wrote:
31 Jul 2019 23:27
Pabappa wrote:
30 Jul 2019 14:34
One thing I notice about Semitic languages is that they don't seem to have a lot of homorganic nasal+stop clusters. /mb mp nt nd/ etc.
A lot of such clusters in Romance and by extension the latinate vocab of English come from attaching the prefix in- 'inside, into' ("impose") or 'not' ("impossible"),
Or "con-" (with). Intumescent, insect, indicate, infect, inject, invent, ingest, immature, important, imbue, illiterate, irrational. Contest, consider, condense, confuse, conjunction, conversation, congress, commission, compromise, combine, collateral, correct.

Post by Omzinesý » 02 Aug 2019 12:38

cedh wrote:
29 Jul 2019 21:33
[...]For example, they often are or contain so-called "harmonic clusters" consisting of two obstruents, the first of them labial or coronal, and the second velar or uvular, which behave as a single segment in various situations.
Can the same consonant clusters also be used "as non-harmonic" like /ap.ta/?


Post by cedh » 02 Aug 2019 19:53

Omzinesý wrote:
02 Aug 2019 12:38
cedh wrote:
29 Jul 2019 21:33
[...]For example, they often are or contain so-called "harmonic clusters" consisting of two obstruents, the first of them labial or coronal, and the second velar or uvular, which behave as a single segment in various situations.
Can the same consonant clusters also be used "as non-harmonic" like /ap.ta/?
I don't know. There are probably situations where something that could be a harmonic cluster is only brought together by morphology, and thus not underlyingly a typical "harmonic cluster", but I haven't studied Georgian well enough to be able to say what happens in such a situation.


Post by Omzinesý » 11 Aug 2019 18:22

Chechen allows geminates on onset.


Post by Salmoneus » 12 Aug 2019 00:07

Omzinesý wrote:
11 Aug 2019 18:22
Chechen allows geminates on onset.
Likewise, famously, much of Italian.

EDIT: although in Italian iirc it's a syntactic mutation, not lexical.


Post by Omzinesý » 12 Aug 2019 08:46

Salmoneus wrote:
12 Aug 2019 00:07
Omzinesý wrote:
11 Aug 2019 18:22
Chechen allows geminates on onset.
Likewise, famously, much of Italian.

EDIT: although in Italian iirc it's a syntactic mutation, not lexical.
You mean the a[k] casa thing? Or does Italian do something else?


Post by Alessio » 13 Aug 2019 14:00

Omzinesý wrote:
12 Aug 2019 08:46
Salmoneus wrote:
12 Aug 2019 00:07
Omzinesý wrote:
11 Aug 2019 18:22
Chechen allows geminates on onset.
Likewise, famously, much of Italian.

EDIT: although in Italian iirc it's a syntactic mutation, not lexical.
You mean the a[k] casa thing? Or does Italian do something else?
I'm guessing so. Italian does not normally allow geminates in onsets. Moreover, syntactic doubling, as we call it, only happens between vowels anyway, so it's not really different from a word-medial geminate, even though it happens across a word boundary.

Post by Nortaneous » 14 Aug 2019 21:27

Wutung allows initial clusters up to CCCC, but these are highly constrained. The only attested CC clusters are:
/hb hd hdʒ hl hm hn hɲ hw/
/ʔb ʔd ʔdʒ ʔl ʔm ʔw/
/pl bl fl ml/

The only attested CCC clusters are /hɲdʒ/, /hmb/, /ʔbl/, and /ʔml/, and the only attested CCCC cluster is /hmbl/.

Wutung also has no velars outside loans, and doesn't allow coda consonants except (very rarely) word-internal /m/ or /n/.
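
Since the inventory of clusters is closed, it can be written down as a plain lookup table. A small Python sketch follows; clusters are stored as tuples of segments so that dʒ counts as one consonant, and the name is_attested_cluster is my own. It only encodes the lists given above, nothing more.

```python
# Two-consonant clusters, grouped as in the three lines above.
CC = (
    {("h", c) for c in ["b", "d", "dʒ", "l", "m", "n", "ɲ", "w"]}
    | {("ʔ", c) for c in ["b", "d", "dʒ", "l", "m", "w"]}
    | {(c, "l") for c in ["p", "b", "f", "m"]}
)
CCC = {("h", "ɲ", "dʒ"), ("h", "m", "b"), ("ʔ", "b", "l"), ("ʔ", "m", "l")}
CCCC = {("h", "m", "b", "l")}

ATTESTED = CC | CCC | CCCC

def is_attested_cluster(segments):
    """True if the sequence of consonants is one of the attested initial clusters."""
    return tuple(segments) in ATTESTED

print(is_attested_cluster(["h", "m", "b", "l"]))
print(is_attested_cluster(["s", "t"]))
print(len(CC), len(CCC), len(CCCC))
```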

Khroskyabs (Wobzi Lavrung) has highly complex clusters:
ʁvrdʑɣə 'geminate'
nvse 'morning'
nvtsæ 'meal'
ʁmno 'awl'
χsmɑr 'wheat'
ʁvdʑvər 'sprout'
çɲɕə 'sweat'
χscər 'danger'
jmbjəmpɑ 'bird'
ʁjnzdəjnzdə 'cause to buy each other things for their own benefit'

There are also some details of resyllabification of word-internal clusters, which I won't try to describe here.

The related language Japhug doesn't allow initial clusters greater than CCC, but doesn't adhere to the standard sonority hierarchy:
ɲcɣaʁ 'birch bark'
βzɟɯr 'changed'
wsaŋ 'fumigation'
wɕaʁ 'he repents for it'
wxti 'big'
wstɯm 'he serves him'
wrɟaŋ 'he stretches it (skin)'
znde 'wall'
zɲɟa 'a type of plant'

Itelmen has both large initial clusters and large final clusters:
ntʼnuaɬkicen '(we) will eat'
əsxɬi- 'wake up'
əŋkʼzu- 'help'
ekeʔnc 'girls'
ckpəc 'spoon'
txtum 'dugout canoe'
qtχiʔn 'legs'
ksxliɬ 'with sled runners'
kʼɸəʔnk 'at the nails'
kɬqzukneʔn 'they were'
sitɬxpkʼeɬ 'with embers'
ntkskqzu 'if we made it'
tɸscŋin 'you are carrying it'
mskceʔn 'I will make them'
kʼənsɬxc 'boil it!'
əŋksxq 'painful'

Omzinesý wrote:
11 Aug 2019 18:22
Chechen allows geminates on onset.
Word-initial geminates existed in Proto-Micronesian and are preserved in Woleaian:

lyːtyː / nːyːtyː 'jump'
xaʃeː-j / kːaʃe 'throw'
raxo-mi / tʃːaxo 'hug'
ʃaxeː-j / tʃːaxe 'chase'
ɸʷuxa / pʷːuxa 'boil'
peʃa-ŋi / pːaʃa 'stick to'
saweː-j / sːawe 'go alongside'
taɸʷeː-j / tːaɸʷe 'follow'
feraxi / fːeraxi 'be spread'

There are some singleton/geminate mismatches: ɸʷ > pʷː, r > tʃː, ʃ > tʃː, x > kː, l > nː.
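
The mismatches amount to a small mapping applied to the initial consonant before it is lengthened. A hedged Python sketch, covering only the initial consonant (the stem-final differences in the pairs above are ignored) and assuming a labialized onset is written as consonant + ʷ; geminate_initial is a made-up name.

```python
# Singleton -> geminate base, for the mismatching cases listed above.
MISMATCH = {"ɸʷ": "pʷ", "r": "tʃ", "ʃ": "tʃ", "x": "k", "l": "n"}

def geminate_initial(word):
    """Return the word with its initial consonant geminated (toy model)."""
    # Treat consonant + ʷ as a single labialized onset.
    onset = word[:2] if len(word) > 1 and word[1] == "ʷ" else word[0]
    base = MISMATCH.get(onset, onset)  # most consonants geminate as themselves
    return base + "ː" + word[len(onset):]

for form in ["lyːtyː", "ɸʷuxa", "feraxi", "raxo", "xaʃe"]:
    print(form, "->", geminate_initial(form))
```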

Word-initial geminates are also preserved in Chuukese. Pohnpeian shifted geminates to NC clusters, but preserves geminate nasals: /mmet/ 'full', /ŋŋar/ 'see'. It also allows word-final geminate liquids.

Initial geminates also exist in some Malay dialects (e.g. Pattani and Kelantan), Arop Lokep (only mː nː lː rː, I think), Luganda, Ikema Miyako, Leti, Doku, Dorig, Nyaheun, and Hiw. Polish also allows initial geminates, but apparently these are better analyzed as an additional word-initial syllable onset with no nucleus or coda.

Speaking of Dorig, its syllable template is CCVC, but its onset clusters don't follow the sonority hierarchy:

ⁿbtɔt 'canoe pegs'
mkɛ 'above'
ŋsi 'snout'
rkpʷa 'woman'
βtaːl 'banana'
ɣtam 'door'
ɣsuw 'rat'
lkɔn 'an island'
wⁿdɛ 'pig'
wsa 'egg'

The related language Hiw treats /w/ as a fricative and allows word-initial FC clusters, except for sC:

βti 'star'
ɣtiɣ 'waist'
wte 'small'

Ser wrote:
29 Jul 2019 22:21
Xhosa has a strict rule that all content words (nouns, adjectives, adverbs, verbs) must be at least two syllables long. In the situations where a verb in the singular imperative could end up with only one syllable, an extra meaningless yi- prefix is added so that the verb conforms to the rule. Ukutya 'to eat' should become just *tya in the singular imperative, but the actual form is yitya.
This is common in some branches of Austroasiatic, and the prefixed material differs by language, but I forget the details.
