How would you romanize Kazakh?

All4Ɇn
mayan
Posts: 1738
Joined: Sat 01 Mar 2014, 07:19

How would you romanize Kazakh?

Post by All4Ɇn » Tue 12 Dec 2017, 03:19

After learning about the problems facing Kazakhstan's new Latin alphabet (http://www.eurasianet.org/node/86051) I thought I'd ask everyone here what would be their personal favorite romanization of Kazakh. [:D]

As of right now the Cyrillic system is as follows:
/m n ŋ/ <м н ң>
/p b t d k ɡ q/ <п б т д к г қ>
/*t͡s *t͡ɕ/ <ц ч>
/*f *v s z ʃ ʒ *ɕ x ʁ *h/ <ф в с з ш ж щ х ғ һ>
/l j w/ <л й у>
/r/ <р>

/ɘ~ɪ ʉ ə ʊ/ <і ү ы ұ>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <и у>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <е ө о>
/*i̯e/ <э>
/æ ɑ/ <ә а>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <ё ю я>
As well as the silent letters <*ъ~*ь>

*only used in (mostly Russian) loanwords




Now my personal system would be the following:
/m n ŋ/ <m n ng>
/p b t d k ɡ q/ <p b t d k g k>
/*t͡s *t͡ɕ/ <t ç>
/*f *v s z ʃ ʒ *ɕ *x ʁ *h/ <f v s z ş j h g h>
/l j w/ <l y w>
/r/ <r>

/ɘ~ɪ ʉ ə ʊ/ <í ü ı ú>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <i (iy before a vowel) u (uw before a vowel)>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <e ö o>
/*i̯e/ <e>
/æ ɑ/ <ä a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <yo yu (yuw before a vowel) ya>
<ъ~ь> become <'> in would-be homophones but are dropped elsewhere



Sample:
Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс

Barlık adamdar tumısınan azat jäne kadír-kasiyetí men kúkıktarı teng bolıp düniyege keledí. Adamdarga akıl-parasat, ar-ojdan berílgen, sondıktan olar bír-bírímen tuwıstık, bawırmaldık karım-katınas jasawları tiyís
Last edited by All4Ɇn on Wed 28 Feb 2018, 22:30, edited 1 time in total.
shimobaatar
darkness
Posts: 10197
Joined: Fri 12 Jul 2013, 22:09
Location: PA → IN

Re: How would you romanize Kazakh?

Post by shimobaatar » Tue 12 Dec 2017, 05:18

/m n ŋ/ <m n ñ>
/p b t d k g q/ <p b t d k g q>
/*t͡s *t͡ɕ/ <c ç>
/*f *v s z ʃ ʒ *ɕ *x ʁ *h/ <f v s z ş j şş kh ğ h>
/l j w/ <l y w>
/r/ <r>

/ɘ~ɪ ʉ ə ʊ/ <ī ü ı ū>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <i u>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <e ö o>
/*i̯e/ <ē>
/æ ɑ/ <ä a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <yo yu ya>

<Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс.>

<Barlıq adamdar tumısınan azat jäne qadīr-qasietī men kūqıqtarı teñ bolıp düniege keledī. Adamdarğa aqıl-parasat, ar-ojdan berīlgen, sondıqtan olar bīr-bīrīmen tuıstıq, bauırmaldıq qarım-qatınas jasauları tiīs.>
sangi39
moderator
Posts: 3042
Joined: Thu 12 Aug 2010, 00:53
Location: North Yorkshire, UK

Re: How would you romanize Kazakh?

Post by sangi39 » Tue 12 Dec 2017, 05:31

I came up with this back in 2013 over on the ZBB (although with a couple of changes):

/p b t d (tɕ) k g q ɢ/ <p b t d (ć) k g q y>
/f v s~θ z~ð ʃ ʒ (ɕ x) h/ <f v s z š ž (ś x) h>
/m n ŋ/ <m n ŋ>
/r l j w/ <r l j w>
-RTR: /æ ɘ ʉ i̯ɘ y̯ʉ/ <ä i ü e ö>
+RTR: /ɑ ə ʊ u̯ʊ/ <a ı u o>

bʊrəŋɢə y̯ʉtki̯ɘn zɑmɑndɑ, bɘr dɑnəʃpɑn kɘsɘ, bɑɢdɑt ʃɑhɑrənəŋ bɘr ʉlki̯ɘn qɑzəsənəŋ ʉjɘni̯ɘ ki̯ɘlɘp qu̯ʊnəptə. qɑzəmi̯ɘni̯ɘn sy̯ʉjli̯ɘsɘp, qɑzənə sy̯ʉzgi̯ɘ ʒi̯ɘŋi̯ɘ bi̯ɘrɘptɘ. su̯ʊndɑ qɑzə qu̯ʊrqəp, — "bʊl mɑɢɑn ki̯ɘlgi̯ɘn bɑlɑ — mi̯ɘnɘŋ qɑzələɢəmdə tɑrtəp ɑlsɑ ki̯ɘri̯ɘk! ni̯ɘ di̯ɘ bu̯ʊlsɑ, bʊɢɑn ʒɑlənəp, səj bi̯ɘrɘp, u̯ʊrnəmdɑ qɑlɑjən!“ — di̯ɘp, qɑtənənɑ ɑqəldɑsəptə.

Burıŋyı ötken zamanda, bir danıšpan kisi, baydat šaharınıŋ bir ülken qazısınıŋ üjine kelip qonıptı. Qazımenen söjlesip, qazını sözge žeŋe beripti. Sonda qazı qurqıp, — "bul mayan kelgen bala — meniŋ qazılıyımdı tartıp alsa kerek! Ne de bolsa, buyan žalınıp, sıj berip, urnımda qalajın!“ — dep, qatınına aqıldasıptı.

And using the UDHR text:

Barlıq adamdar tumısınan azat žäne qadir-qasıjeti men kuqıqtarı teŋ bolıp dünıjege keledi. Adamdarya aqıl-parasat, ar-oždan berilgen, sondıqtan olar bir-birimen twıstıq, bawırmaldıq qarım-qatınas žasawları tıjis.


Now, you could go full-on imitation of Turkish and go for

/p b t d (tɕ) k g q ɢ/ <p b t d (ç) k g q ğ>
/f v s~θ z~ð ʃ ʒ (ɕ x) h/ <f v s z ş j (ş h) h>
/m n ŋ/ <m n ng>
/r l j w/ <r l y w>
-RTR: /æ ɘ ʉ i̯ɘ y̯ʉ/ <ä i ü e ö>
+RTR: /ɑ ə ʊ u̯ʊ/ <a ı u o>

And that would give you:

Burınğı ötken zamanda, bir danışpan kisi, bağdat şaharınıng bir ülken qazısınıng üyine kelip qonıptı. Qazımenen söylesip, qazını sözge jenge beripti. Sonda qazı qurqıp, — "bul mağan kelgen bala — mening qazılığımdı tartıp alsa kerek! Ne de bolsa, buğan jalınıp, sıy berip, urnımda qalayın!“ — dep, qatınına aqıldasıptı.

Barlıq adamdar tumısınan azat jäne qadir-qasıyeti men kuqıqtarı teng bolıp dünıyege keledi. Adamdarğa aqıl-parasat, ar-ojdan berilgen, sondıqtan olar bir-birimen twıstıq, bawırmaldıq qarım-qatınas jasawları tıyis.


IIRC, this makes it appear much more similar to the relatively closely related Crimean Tatar which also uses the Latin alphabet (if you search for some of those words, for example, on Wiktionary, several of them do appear with Crimean Tatar entries).
Vlürch
sinic
Posts: 215
Joined: Wed 09 Mar 2016, 21:19
Location: Finland

Re: How would you romanize Kazakh?

Post by Vlürch » Tue 12 Dec 2017, 12:09

Аа > Aa
Әә > Ää
Бб > Bb
Вв > Vv (or just merge with Ww)
Гг > Gg
Ғғ > Ğğ
Дд > Dd
Ее > YEye (word-initially and after vowels), Ee (elsewhere)
Ёё > YOyo
Жж > Jj
Зз > Zz
Ии > İYiy
Йй > Yy
Кк > Kk
Ққ > Qq
Лл > Ll
Мм > Mm
Нн > Nn
Ңң > Ññ
Оо > Oo
Өө > Öö
Пп > Pp
Рр > Rr
Сс > Ss
Тт > Tt
Уу > Ww
Ұұ > Uu
Үү > Üü
Фф > Ff
Хх > Xx
Һһ > Hh
Цц > Cc
Чч > Çç
Шш > Şş
Щщ > ŞŞşş (or just merge with Şş)
Ыы > Iı
Іі > İi
Ээ > Ee
Юю > YUyu
Яя > YAya


Barlıq adamdar tumısınan azat jäne qadir-qasiyeti men quqıqtarı teñ bolıp düniyege keledi. Adamdarğa aqıl-parasat, ar-ojdan berilgen, sondıqtan olar bir-birimen twıstıq, bawırmaldıq qarım-qatınas jasawları tiyis.

...so, basically the same as sangi39's latter one with a difference of only two letters, and even closer to Crimean Tatar and pretty much identical to the already-existing alternative orthography, the only real difference being <iy> instead of <ï>.
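
For anyone who wants to try the table on other text, here is a rough Python sketch of it. It follows the table letter by letter, so it will not reproduce every spelling in the sample (it gives twmısınan where the sample has tumısınan, because the table maps у to <w> everywhere). The names `VLURCH_TABLE` and `romanize_vlurch` are made up for this illustration. One assumption goes beyond the post: "after vowels" is checked against the Latin output, so that и = <iy> counts as ending in a consonant and дүниеге comes out as düniyege, as in the sample.

```python
# Lowercase letter table from the post; Cyrillic "е" is handled separately
# because its value depends on context.
VLURCH_TABLE = {
    "а": "a", "ә": "ä", "б": "b", "в": "v", "г": "g", "ғ": "ğ", "д": "d",
    "ё": "yo", "ж": "j", "з": "z", "и": "iy", "й": "y", "к": "k", "қ": "q",
    "л": "l", "м": "m", "н": "n", "ң": "ñ", "о": "o", "ө": "ö", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "w", "ұ": "u", "ү": "ü", "ф": "f",
    "х": "x", "һ": "h", "ц": "c", "ч": "ç", "ш": "ş", "щ": "şş", "ы": "ı",
    "і": "i", "э": "e", "ю": "yu", "я": "ya",
}
LATIN_VOWELS = set("aäeiıoöuü")


def capitalize_latin(latin: str) -> str:
    # Turkish-style casing: dotted i pairs with İ, dotless ı with I.
    first = "İ" if latin[0] == "i" else latin[0].upper()
    return first + latin[1:]


def romanize_vlurch(text: str) -> str:
    out = []
    prev = ""  # last character emitted, lowercase, before capitalisation
    for ch in text:
        low = ch.lower()
        if low == "е":  # Cyrillic е
            # <ye> word-initially and after vowels, <e> after consonants
            # (assumption: judged on the Latin output so far).
            after_consonant = prev.isalpha() and prev not in LATIN_VOWELS
            latin = "e" if after_consonant else "ye"
        elif low in VLURCH_TABLE:
            latin = VLURCH_TABLE[low]
        else:
            out.append(ch)  # spaces, hyphens, punctuation, unmapped letters
            prev = ch.lower()
            continue
        prev = latin[-1]
        out.append(capitalize_latin(latin) if ch != low else latin)
    return "".join(out)


print(romanize_vlurch("Адамдарға ақыл-парасат, ар-ождан берілген."))
```

Only word-initial capitals are handled, and the silent letters ъ/ь are passed through untouched since the table does not mention them.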

PS: Isn't the correct word құқықтары, not *кұқықтары? The former has waaaaay more results on Google, at least, and follows the consonant harmony. That's why I used that instead, but it could of course be that they're just two variants of the same word, but even then the former is clearly more common.
Frislander
runic
Posts: 2809
Joined: Sat 14 May 2016, 17:47
Location: The North

Re: How would you romanize Kazakh?

Post by Frislander » Tue 12 Dec 2017, 15:19

/m n ŋ/ <m n ng>
/p b t d k ɡ q/ <p b t d k g q>
/*t͡s *t͡ɕ/ <ts tsh>
/*f *v s z ʃ ʒ *ɕ *x ʁ *h/ <f v s z c j sh x ğ h>
/l j w/ <l y w>
/r/ <r>

/ɘ~ɪ ʉ ə ʊ/ <i u e o>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <ey ew>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <ie ue uo>
/*i̯e/ <ii>
/æ ɑ/ <á a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <yo yow ya>

Sample:
Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс

Barleq adamdar tewmesenan azat jánie qadir-qaseyieti mien koqeqtare tieng buolep duneyiegie kieliedi. Adamdarğa aqel-parasat, ar-uojdan bierilgien, suondeqtan uolar bir-birimien tewesteq, bawermaldeq qarem-qatenas jasawlare teyis.
Last edited by Frislander on Thu 14 Dec 2017, 12:43, edited 1 time in total.
esoanem
hieroglyphic
Posts: 64
Joined: Tue 05 Sep 2017, 13:03
Location: Cambridge, UK

Re: How would you romanize Kazakh?

Post by esoanem » Wed 13 Dec 2017, 01:55

/m n ŋ/ <м н ң> -> <m n ň>
/p b t d k ɡ q/ <п б т д к г қ> -> <p b t d k g q>
/*t͡s *t͡ɕ/ <ц ч> -> <c ç>
/*f *v s z ʃ ʒ *ɕ x ʁ *h/ <ф в с з ш ж щ х ғ һ> -> <f v s z ş z̧ *ş h ğ *h>
/l j w/ <л й у> -> <l y w>
/r/ <р> -> <r>

/ɘ~ɪ ʉ ə ʊ/ <і ү ы ұ> -> <i u e o>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <и у> -> <ey ow>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <е ө о> -> <ye ö wo>
/*i̯e/ <э> -> <ye>
/æ ɑ/ <ә а> -> <ä a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <ё ю я> -> <yo yu ya>

So I have three diacritics: an umlaut for fronting, a cedilla for palatalisation, and a háček/breve for a variant which is velar (I don't really care which it is; they're hard to distinguish in writing and either'd work fine). There are some potential ambiguities with <y> and <w>, which could be either part of a digraph (on one side or the other) or independent; most of these can be avoided using optional apostrophes in cases where a <y> or <w> would occur between two vowels, e.g. <ыйы> -> <eye>, <ые> -> <e'ye>, <иы> -> <ey'e>.

Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс

Barleq adamdar towmesenan azat z̧änye qadir-qaseyyeti myen koqeqtare tyeň bwolep duneyyegye kyelyedi. Adamdarğa aqel-parasat, ar-woz̧dan byerilgyen, swondeqtan wolar bir-birimyen tow'esteq, baow'ermaldeq qarem-qatenas z̧asaowlare teyis.
Thrice Xandvii
darkness
Posts: 3829
Joined: Sun 25 Nov 2012, 10:13
Location: Carnassus

Re: How would you romanize Kazakh?

Post by Thrice Xandvii » Wed 13 Dec 2017, 03:04

Ya know, I feel like the proposed orthography is missing something. I feel like all the letters without apostrophes should get some so they don't feel left out and you can just double the number on letters that already had them to avoid ambiguity.
Xonen
moderator
Posts: 1455
Joined: Sat 15 May 2010, 23:25

Re: How would you romanize Kazakh?

Post by Xonen » Thu 14 Dec 2017, 01:59

Vlürch wrote:
Tue 12 Dec 2017, 12:09
...so, basically the same as sangi39's latter one with a difference of only two letters, and even closer to Crimean Tatar and pretty much identical to the already-existing alternative orthography, the only real difference being <iy> instead of <ï>.
Something like this strikes me as the most sensible option - especially since, as you point out, it's already in use. I do also kind of like the idea of marking the vowels with on- and offglides with digraphs (as in Frislander's and esoanem's suggestions), which avoids the need to introduce special characters, but ultimately, I think /i̯ɘ/ at least warrants its own character; it appears to be extremely common and occurs as the front-vocalic counterpart to /ɑ/ in several suffixes, so it's a question of both economy and symmetry. And if /i̯ɘ/ gets a letter of its own, then it makes sense to be systematic and give one to /y̯ɵ/ and /u̯ʊ/ as well. Incidentally, the commonness of /i̯ɘ/ also happens to make Kazakh exceptionally well suited for being written in Cyrillic, so the most sensible option would be not to romanize it at all.

Of course, we can't have that, because politics. And we can't have a sensible romanization either, because Nazarbayev or whoever wants 1) something that maps directly onto the existing Cyrillic orthography, 2) no digraphs and 3) no "hooks or superfluous dots". Also, apostrophes don't count as superfluous dots, the lack of a dot counts as a superfluous dot on <ı>, and the combination of a letter with an apostrophe counts as neither adding a superfluous dot nor as a digraph. None of which makes any sense - but since I enjoy the challenge, I started wondering if it would be possible to work within those limitations and still come up with something at least slightly less abominable than the impending apostrophecalypse.

First things first: <y'> for /w/...
...when <w> isn't even used in the orthography at all...
...should just be replaced with <w>.

Other than that, it gets trickier. A part of the problem is that I don't really know Kazakh phonology that well; is that contrast between velars and uvulars actually phonemic, for instance? Seems to me like the latter only occur in back-vocalic contexts, but I could be missing something. If we assume it's not, we can ignore /q ʁ/ and just use <k g>, respectively, for those. This gets rid of the apostrophe on <g'>, and would also free up the letter <q>; you could use it as a poor man's <ŋ>, but I think I'd prefer <ng> for /ŋ/, since I seem to recall seeing somewhere that they already sometimes use <нг> (although yeah, technically this means I'm cheating on the "no digraphs" rule, even if this particular digraph has some precedent). Another possibility would be to use <q> (instead of <c'>) for /t͡ɕ/, but since that apparently only occurs in Russian loanwords, it's not a major concern anyway.

To get rid of the rest of the apostrophes, replace <s'> for /ʃ/ with <x> and <i'> for /j/ with <y>.

As for the vowels... I spent like two hours thinking about this while I really should've been doing something else, and finally gave up. I can't think of a good way to distinguish between /ɑ æ/; the only graphemes that make intuitive sense are <a e>, but it's pretty much impossible to make the system work without using <e> for something else. Then again, /æ/ seems to have a fairly low functional load, and /ɑ/ only occurs in back-vocalic words, so it might not be that bad to just use <a> for both. Not an ideal solution by any means, though, so if anyone's got any suggestions, I'm all computer screens.

...and I ended up using apostrophes for marking the onglides, but at least that gives them a consistent function. In full, the vowels would look like this:

/i̯ɘ y̯ɵ u̯ʊ/ <'e 'u 'o>
/ɘ ʉ ə ʊ/ <i u e o>
/æ ɑ/ <a>

So to sum up:

а, ә > a
б > b
в > v
г, ғ > g
д > d
е > 'e
ж > j
з > z
и, й > y
к, қ > k
л > l
м > m
н > n
ң > ng
о > 'o
ө > 'u
п > p
р > r
с > s
т > t
у > w
ұ > o
ү > u
ф > f
х, һ > h
ц > c
ч > q
ш, щ > x
ы > e
і > i

Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс.
Barlek adamdar twmesenan azat jan'e kadir-kasy'eti m'en kokektare t'eng b'olep duny'eg'e k'el'edi. Adamdarga akel-parasat, ar-'ojdan b'erilg'en, s'ondektan 'olar bir-birim'en twestek, bawermaldek karem-katenas jasawlare tyis.
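
Since this scheme is a pure letter-for-letter substitution with no context rules, it can be sketched as a single lookup table. The following is only an illustration of the summary list above; `XONEN_TABLE`, `upper_first_letter` and `romanize_xonen` are names invented here, and the handling of capitals (the apostrophe stays in front, the letter after it is capitalised) is an assumption the post does not cover.

```python
# One entry per Cyrillic letter in the summary list; merged letters share a value.
XONEN_TABLE = {
    "а": "a", "ә": "a", "б": "b", "в": "v", "г": "g", "ғ": "g", "д": "d",
    "е": "'e", "ж": "j", "з": "z", "и": "y", "й": "y", "к": "k", "қ": "k",
    "л": "l", "м": "m", "н": "n", "ң": "ng", "о": "'o", "ө": "'u", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "w", "ұ": "o", "ү": "u", "ф": "f",
    "х": "h", "һ": "h", "ц": "c", "ч": "q", "ш": "x", "щ": "x", "ы": "e",
    "і": "i",
}


def upper_first_letter(latin: str) -> str:
    # Capitalise the first alphabetic character, skipping a leading apostrophe.
    for i, c in enumerate(latin):
        if c.isalpha():
            return latin[:i] + c.upper() + latin[i + 1:]
    return latin


def romanize_xonen(text: str) -> str:
    out = []
    for ch in text:
        low = ch.lower()
        latin = XONEN_TABLE.get(low)
        if latin is None:
            out.append(ch)  # punctuation, spaces, letters outside the list
        elif ch != low:
            out.append(upper_first_letter(latin))
        else:
            out.append(latin)
    return "".join(out)


print(romanize_xonen("Барлық адамдар тумысынан азат және қадір-қасиеті"))
```

Because several Cyrillic letters share one Latin value (а/ә, г/ғ, к/қ and so on), the mapping only works in one direction; going back to Cyrillic would need a dictionary or harmony-based guessing.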
Vlürch
sinic
Posts: 215
Joined: Wed 09 Mar 2016, 21:19
Location: Finland

Re: How would you romanize Kazakh?

Post by Vlürch » Thu 14 Dec 2017, 16:23

Xonen wrote:
Thu 14 Dec 2017, 01:59
I think /i̯ɘ/ at least warrants its own character; it appears to be extremely common and occurs as the front-vocalic counterpart to /ɑ/ in several suffixes, so it's a question of both economy and symmetry. And if /i̯ɘ/ gets a letter of its own, then it makes sense to be systematic and give one to /y̯ɵ/ and /u̯ʊ/ as well. Incidentally, the commonness of /i̯ɘ/ also happens to make Kazakh exceptionally well suited for being written in Cyrillic, so the most sensible option would be not to romanize it at all.
I've only heard Kazakh spoken in a few videos on Youtube, briefly on TV in some documentary about something, etc. since my dedication to learning it never got to the point where I'd spend much time listening to it spoken, but I do listen to fairly large amounts of Kazakh music. That could mean my understanding of the language's phonology is skewed because it's different in singing (which I have noticed to be the case); but anyway, as far as I can tell, the vowels are actually much closer to just about any language with similar vowels than Wikipedia claims. I mean, especially the <е> sounds like [e~e̞~ë̞~ë~ɘ̟~ɘ~ə̟] or whatever much more commonly than [i̯ɘ]; consonants before it are palatalised, though, so it kind of makes sense to transcribe it as /i̯ɘ/, and maybe in real everyday speech it really does sound like that, but AFAICT it's just that it's /je/ word-initially and after vowels, and that's generally what it corresponds to in other Turkic languages.
Xonen wrote:
Thu 14 Dec 2017, 01:59
Other than that, it gets trickier. A part of the problem is that I don't really know Kazakh phonology that well; is that contrast between velars and uvulars actually phonemic, for instance? Seems to me like the latter only occur in back-vocalic contexts, but I could be missing something. If we assume it's not, we can ignore /q ʁ/ and just use <k g>, respectively, for those.
Yeah, that's exactly what Tatar does and why it looks so different from Bashkir even though the two form one of the most closely related language pairs considered separate languages. Personally, a huge part of why I like Bashkir more than Tatar is that its alphabet is much more badass-looking with its <ҡ>, <ғ>, <ҫ> and <ҙ>. It's also slightly closer to Kazakh, and since I fucking love Kazakh, that's automatically a plus. And let's not forget that it has /θ/ and /ð/, the latter of which is among the cutest sounds any language could dream of having. Kinda sad that Finnish doesn't have it, especially knowing that it used to...
Xonen wrote:
Thu 14 Dec 2017, 01:59
Another possibility would be to use <q> (instead of <c'>) for /t͡ɕ/, but since that apparently only occurs in Russian loanwords, it's not a major concern anyway.

To get rid of the rest of the apostrophes, replace <s'> for /ʃ/ with <x>.
Using <q> for /t͡ɕ/ and <x> for /ʃ/ would be so Chinese that it would retroactively tear the Sino-Soviet split into a full-blown war. [xP]
Xonen wrote:
Thu 14 Dec 2017, 01:59
I can't think of a good way to distinguish between /ɑ æ/; the only graphemes that make intuitive sense are <a e>, but it's pretty much impossible to make the system work without using <e> for something else. Then again, /æ/ seems to have a fairly low functional load, and /ɑ/ only occurs in back-vocalic words, so it might not be that bad to just use <a> for both. Not an ideal solution by any means, though, so if anyone's got any suggestions, I'm all computer screens.
Well, Uzbek uses <a> for both [æ] and [ɑ], <o> for [o] and [ø] and <u> for [u​] and [y], but they're considered allophones due to the loss of vowel harmony. And, supposedly, it doesn't really have [ɑ] at all even as an allophone, but I can't not hear it in just about any clip of spoken Uzbek or Uzbek songs. Then again, the problem might be that my ears refuse to accept that [ɒ] is /o/ and as such I hear words where other Turkic languages have /ɑ/ but Uzbek has /o/ as having /ɑ/, which being [ɒ] makes me hear it as [ɑ], and Uzbek really doesn't have [ɑ] even as an allophone of anything and it's just an auditory illusion... but I kinda doubt that that's the case.

Anyway, it would be possible to do something like this:
Аа, Әә > Aa
Ее, Ыы > Ee
Оо, Өө > Oo
Ұұ, Үү > Uu
Іі, Ии > Ii

Thus, even with your consonant romanisation that uses <k> and <g> for both the velars and uvulars, it would still be possible to tell whether a word has back or front vowels by suffixes having the "wrong" letter. It could be taken a step further by romanising both <н> and <ң> as <n>.

For example,
Current Cyrillic: қол (hand), қолдың (hand's), қолдар (hands), қолдардың (of hands)
Romanisation: kol (hand), kolden (hand's), koldar (hands), koldarden (of hands)

Current Cyrillic: әйел (woman), әйелдің (woman's), әйелдер (women), әйелдердің (women's)
Romanisation: ayel (woman), ayeldin (woman's), ayelder (women), ayelderdin (women's)

I looked up if the word айыл exists in Kazakh, and it does, apparently meaning the belt in a horse's saddle or something. Even though the two would be homographs, I'm fairly certain context would clear any ambiguity in this case, and any suffix would immediately disambiguate them anyway. There probably would be homographs that wouldn't be so easily disambiguated by context, though, and I don't know how they should be dealt with. The good thing is that Kazakh's rounding harmony is not represented in writing in Cyrillic and also isn't as prominent as in other Turkic languages, so there wouldn't be confusion between words due to the back and front rounded vowels being represented by the same letter even if there was otherwise.

There are one-syllable words that could cause trouble, though, like:
тор (net, cage, grid)
төр (I don't really know what this means; this Kazakh-Russian PDF dictionary says it's "почетное место в доме (юрте), в комнате", i.e. "place of honour in a house (yurt), in a room", so some kind of home shrine or something?)
торды > torde (the net/cage/grid (accusative))
төрде > torde (in/on/at whatever төр is)

...but a cheap solution could be to romanise <ы> as <u> in such cases?
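
To get a feel for how often the merged vowels would actually collide, something like the following could be run over a word list. It is a sketch only: the vowel pairs are the ones listed above, the consonants are assumed to follow Xonen's table with ң also written <n>, and `romanize_merged` and `homographs` are hypothetical helper names.

```python
from collections import defaultdict

# Vowel pairs merged as proposed above.
MERGED_VOWELS = {
    "а": "a", "ә": "a", "е": "e", "ы": "e", "о": "o", "ө": "o",
    "ұ": "u", "ү": "u", "і": "i", "и": "i",
}
# Assumed consonant values: velars and uvulars merged, ң written as <n>.
CONSONANTS = {
    "б": "b", "в": "v", "г": "g", "ғ": "g", "д": "d", "ж": "j", "з": "z",
    "й": "y", "к": "k", "қ": "k", "л": "l", "м": "m", "н": "n", "ң": "n",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "w", "ф": "f", "х": "h",
    "һ": "h", "ш": "x",
}
TABLE = {**CONSONANTS, **MERGED_VOWELS}


def romanize_merged(word: str) -> str:
    # Lowercase only; anything outside the table is passed through unchanged.
    return "".join(TABLE.get(ch, ch) for ch in word.lower())


def homographs(words):
    # Group Cyrillic words by romanised form and keep only the collisions.
    groups = defaultdict(list)
    for word in words:
        groups[romanize_merged(word)].append(word)
    return {latin: ws for latin, ws in groups.items() if len(ws) > 1}


sample = ["тор", "төр", "торды", "төрде", "әйел", "айыл", "қол"]
for latin, cyrillic in homographs(sample).items():
    print(latin, cyrillic)
```

On the handful of words from this post it reports tor, torde and ayel as collisions; a real frequency list would be needed to say how serious the problem is.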
shimobaatar
darkness
Posts: 10197
Joined: Fri 12 Jul 2013, 22:09
Location: PA → IN

Re: How would you romanize Kazakh?

Post by shimobaatar » Wed 28 Feb 2018, 17:34

I couldn't find any mention of this elsewhere on the board, so I figured I'd mention it here. Apparently, the official romanization that Kazakhstan plans to adopt has been revised.
Pabappa
cuneiform
Posts: 172
Joined: Sat 18 Nov 2017, 02:41

Re: How would you romanize Kazakh?

Post by Pabappa » Wed 28 Feb 2018, 17:38

Very nice. It's still weird in various ways, but the new idea is more artistically appealing and apparently more helpful for programmers.
All4Ɇn
mayan
Posts: 1738
Joined: Sat 01 Mar 2014, 07:19

Re: How would you romanize Kazakh?

Post by All4Ɇn » Wed 28 Feb 2018, 22:20

Thank God! I'm not particularly a fan but it's definitely a vast improvement over the older one. The sample from earlier rendered in the new script would be the following:


Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс.

Barlyq adamdar týmysynan azat jáne qadir-qasıeti men kuqyqtary teń bolyp dúnıege keledi. Adamdarǵa aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen týystyq, baýyrmaldyq qarym-qatynas jasaýlary tıis.
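
The letter values can be read straight off the sample. As a rough sketch, the table below contains only the correspondences that show up in this thread (the sample above plus the <sh ch>, <х һ> = <h> and <и й> = <ı> mentioned in later posts); it is not the full official alphabet, and letters not attested here are simply passed through. `OFFICIAL_2018` and `romanize_official` are names made up for the example.

```python
# Correspondences attested in this thread only (incomplete by design).
OFFICIAL_2018 = {
    "а": "a", "ә": "á", "б": "b", "г": "g", "ғ": "ǵ", "д": "d", "е": "e",
    "ж": "j", "з": "z", "и": "ı", "й": "ı", "к": "k", "қ": "q", "л": "l",
    "м": "m", "н": "n", "ң": "ń", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "ý", "ұ": "u", "ү": "ú", "х": "h", "һ": "h", "ы": "y",
    "і": "i", "ч": "ch", "ш": "sh",
}


def romanize_official(text: str) -> str:
    out = []
    for ch in text:
        low = ch.lower()
        latin = OFFICIAL_2018.get(low)
        if latin is None:
            out.append(ch)  # unattested letters and punctuation pass through
        elif ch != low:
            out.append(latin.capitalize())  # "sh" -> "Sh", "b" -> "B"
        else:
            out.append(latin)
    return "".join(out)


print(romanize_official("Барлық адамдар тумысынан азат және қадір-қасиеті"))
```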
Nachtuil
sinic
Posts: 410
Joined: Wed 20 Jul 2016, 23:16

Re: How would you romanize Kazakh?

Post by Nachtuil » Thu 01 Mar 2018, 03:06

What a great timely topic! I wish I saw this earlier... count me in as one who is glad that they updated it. The one their president came up with really wasn't great!

I notice some differences between this inventory and the one on Kazakh but I will try to fill out the one this thread started with. Mine is incredibly boring but works, I guess, though I know next to nothing about the language besides what I can glean from Wikipedia. The sample text seems to suggest nothing that would cause trouble for what I have done phonotactics-wise:
/m n ŋ/ <m n ng>
/p b t d k ɡ q/ <p b t d k g q>
/*t͡s *t͡ɕ/ <ts tc>
/*f *v s z ʃ ʒ *ɕ x ʁ *h/ <f v s z sh zh c kh gh h>
/l j w/ <l y w>
/r/ <r>

(Yikes these vowels)
/ɘ~ɪ ʉ ə ʊ/ <i u e uu>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <ey uw>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <ie oe o>
/*i̯e/ <ie>
/æ ɑ/ <a aa>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <yo yuw ya>
Nortaneous
earth
Posts: 592
Joined: Sat 14 Aug 2010, 12:28

Re: How would you romanize Kazakh?

Post by Nortaneous » Tue 13 Mar 2018, 16:07

All4Ɇn wrote:
Wed 28 Feb 2018, 22:20
Thank God! I'm not particularly a fan but it's definitely a vast improvement over the older one. The sample from earlier rendered in the new script would be the following:


Барлық адамдар тумысынан азат және қадір-қасиеті мен кұқықтары тең болып дүниеге келеді. Адамдарға ақыл-парасат, ар-ождан берілген, сондықтан олар бір-бірімен туыстық, бауырмалдық қарым-қатынас жасаулары тиіс.

Barlyq adamdar týmysynan azat jáne qadir-qasıeti men kuqyqtary teń bolyp dúnıege keledi. Adamdarǵa aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen týystyq, baýyrmaldyq qarym-qatynas jasaýlary tıis.
Revision seems reasonable, but ı ý > ý w. If they're revising for programmers, shouldn't be using dotless i, because nonstandard casepairs are a pain in the ass.

Barlyq adamdar twmysynan azat jáne qadir-qasýeti men quqyqtary teń bolyp dúnýege keledi. Adamdarǵa aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen twystyq, bawyrmaldyq qarym-qatynas jasawlary týis.
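
The casing problem is easy to show with any language's default, locale-independent case mapping. A small Python illustration (`survives_case_roundtrip` is just a helper name for this example):

```python
def survives_case_roundtrip(word: str) -> bool:
    """True if upper-casing and then lower-casing gives the word back."""
    return word.upper().lower() == word


official = "tıis"  # тиіс with dotless ı, as in the revised alphabet
proposed = "týis"  # the same word with ı replaced by ý

print(official.upper())                    # TIIS: ı and i both become I
print(survives_case_roundtrip(official))   # False: comes back as "tiis"
print(survives_case_roundtrip(proposed))   # True
```

The default mappings pair ı with I and I with i, so software has to know the text is Kazakh (and apply Turkish-style casing rules) to get the dotless letter back.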
anlztrk
rupestrian
Posts: 3
Joined: Wed 28 Oct 2015, 19:04

Re: How would you romanize Kazakh?

Post by anlztrk » Thu 15 Mar 2018, 20:21

I'd use the existing QazAqparat system with a few modifications, so

/m n ŋ/ <m n ñ>
/p b t d k ɡ q/ <p b t d k g q>
/*t͡s *t͡ɕ/ <c ç>
/*f *v s z ʃ ʒ *ɕ *x ʁ *h/ <f v s z ş j şş x ğ h>
/l j w/ <l y w>
/r/ <r>

/ɘ~ɪ ʉ ə ʊ/ <i ü ı u>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <ĭy ŭw>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <e ö o>
/*i̯e/ <e>
/æ ɑ/ <ä a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <yo yŭw ya>

Barlıq adamdar tŭwmısınan azat jäne qadir-qasĭyeti men kuqıqtarı teñ bolıp dünĭyege keledi. Adamdarğa aqıl-parasat, ar-ojdan berilgen, sondıqtan olar bir-birimen tŭwıstıq, bawırmaldıq qarım-qatınas jasawları tĭyis.

But if they really want to be different from Turkish, then I'd go with something like this:

/m n ŋ/ <m n ń>
/p b t d k ɡ q/ <p b t d k g q>
/*t͡s *t͡ɕ/ <ts c>
/*f *v s z ʃ ʒ *ɕ *x ʁ *h/ <f v s z ś j ś h ǵ h>
/l j w/ <l ý w>
/r/ <r>

/ɘ~ɪ ʉ ə ʊ/ <i ú y u>
/əj~ɘj ʊw~əw~ʉw~ɘw/ <í w>
/i̯ɘ y̯ʉ~ø~œ u̯o/ <e ó o>
/*i̯e/ <e>
/æ ɑ/ <á a>
/*jo jʊw~jəw~jʉw~jɘw jɑ/ <ýo ýw ýa>

Barlyq adamdar twmysynan azat jáne qadir-qasíeti men kuqyqtary teń bolyp dúníege keledi. Adamdarǵa aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen twystyq, bawyrmaldyq qarym-qatynas jasawlary tíis.
Vlürch
sinic
Posts: 215
Joined: Wed 09 Mar 2016, 21:19
Location: Finland

Re: How would you romanize Kazakh?

Post by Vlürch » Tue 20 Mar 2018, 20:39

shimobaatar wrote:
Wed 28 Feb 2018, 17:34
I couldn't find any mention of this elsewhere on the board, so I figured I'd mention it here. Apparently, the official romanization that Kazakhstan plans to adopt has been revised.
Phew! It's so much better than the mess with the apostrophes, although using an acute accent on both consonants and vowels is a bit... annoying... and makes certain words look pretty ugly:
1. At least one country's name will be ruined: Ауғанстан -> Aýǵanstan (Afghanistan). Maybe that's a reflection of the reality in said country, being in a state of civil war for decades and all that, but it would be a better idea to use <w> instead of <ý>.
2. May decrease the appeal of spelunking: үңгір -> úńgir (cave)
3. Genitives: әкенің -> ákeniń (father's), etc. become somewhat uglier, I guess.

So, if <ý> was <w> instead, I'd like it a lot more.
As for <ń>, why not just make it <n> before <k g q ǵ> and <ng> elsewhere?
Ideally, <ǵ> should be something else too... maybe <x> if the Cyrillic <х һ> are really merged as Latin <h>.
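
The <ń> suggestion is a simple contextual rewrite of text already in the revised alphabet, which could be sketched like this (lowercase only; `respell_eng` is a hypothetical name):

```python
import re


def respell_eng(word: str) -> str:
    # <ń> becomes <n> before k, g, q, ǵ ...
    word = re.sub(r"ń(?=[kgqǵ])", "n", word)
    # ... and <ng> everywhere else.
    return word.replace("ń", "ng")


for w in ["úńgir", "ákeniń", "teń"]:
    print(w, "->", respell_eng(w))
```

The rewrite is not reversible in general: after it, a written <ng> or <n> + <g> no longer says whether the original had <ń> or a plain <n>.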

...on second thought, Awxanstannyng úngirleri or whatever doesn't look very intuitive. Merging it with <g> may seem like a good idea, but that'd reduce the alphabet's phoneticism even more. Since there are the digraphs <sh ch>, why not just go for the most obvious and boring <gh>...?

There's something they clearly didn't think about at all, though: merging <и й> as <ı> will have some absolutely disastrous results. Like, әйел -> áıel (woman). What the hell? If I saw that, I'd think it was supposed to be /ɑːɯel/ or maybe /æəel/ or something if I got extremely lucky. I'd literally never figure out that it's /æjel/ because <ı> for /j/ is up there as a competitor for the prize of most ridiculous orthographic ridiculousness in all of human history.
Pabappa
cuneiform
Posts: 172
Joined: Sat 18 Nov 2017, 02:41

Re: How would you romanize Kazakh?

Post by Pabappa » Wed 11 Apr 2018, 05:37

https://en.m.wikipedia.org/wiki/Kyrgyz_language might be up soon.... It looks like they've got a similar phonology, though maybe a bit lighter than Uzbek and Kazakh. Still they have the same vowel problem, and a TV station has decided to use w for /u/.
Omzinesý
mayan
Posts: 2408
Joined: Fri 27 Aug 2010, 07:17
Location: nowhere [naʊhɪɚ]

Re: How would you romanize Kazakh?

Post by Omzinesý » Wed 11 Apr 2018, 12:55

I don't quite understand the phonology.
Which vowels correspond in the harmony?
shimobaatar
darkness
Posts: 10197
Joined: Fri 12 Jul 2013, 22:09
Location: PA → IN

Re: How would you romanize Kazakh?

Post by shimobaatar » Wed 11 Apr 2018, 14:33

Omzinesý wrote:
Wed 11 Apr 2018, 12:55
I don't quite understand the phonology.
Which vowels correspond in the harmony?
At least in Kyrgyz, I'd assume it works like in Turkish (/i y e ø/ vs. /ɯ u ɑ o/), right?
Vlürch
sinic
Posts: 215
Joined: Wed 09 Mar 2016, 21:19
Location: Finland

Re: How would you romanize Kazakh?

Post by Vlürch » Thu 12 Apr 2018, 18:51

shimobaatar wrote:
Wed 11 Apr 2018, 14:33
Omzinesý wrote:
Wed 11 Apr 2018, 12:55
I don't quite understand the phonology.
Which vowels correspond in the harmony?
At least in Kyrgyz, I'd assume it works like in Turkish (/i y e ø/ vs. /ɯ u ɑ o/), right?
Yeah. There's also /æ/ in Kazakh, which isn't phonemic in Turkish and is debatable in Kyrgyz (IIRC it has [æ] as both a "fronted allophone" of /ɑ/ and an allophone of /e/ in some circumstances, so it could be called phonemic if that's the case; I don't remember the details, though, and I've always been much more into Kazakh anyway).

The strangely narrow phonemic transcription of Kazakh always confuses me, as well as at least Wikipedia's obsession with it having tongue root harmony instead of front/back harmony. I'd totally get it as a phonetic transcription and description, but why not just transcribe it phonemically as /ɑ æ e i ɯ o ø u y/ when that's what the vowels correspond to in other Turkic languages and is practically how they are at least when sung?

Back: <а ы о ұ> /ɑ ɯ o u/
Front: <ә е і ө ү> /æ e i ø y/
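
Taking those two sets at face value, a word's harmony class can be read off its Cyrillic spelling. A minimal sketch, assuming that <и> and <у> are simply skipped because their status is unclear (`harmony_class` is a name invented for the example):

```python
BACK_VOWELS = set("аыоұ")
FRONT_VOWELS = set("әеіөү")


def harmony_class(word: str) -> str:
    # Classify by the vowel letters listed above; и, у and the
    # Russian-only vowel letters are ignored in this sketch.
    letters = set(word.lower())
    has_back = bool(letters & BACK_VOWELS)
    has_front = bool(letters & FRONT_VOWELS)
    if has_back and has_front:
        return "mixed"
    if has_back:
        return "back"
    if has_front:
        return "front"
    return "none"


for w in ["қолдардың", "әйелдердің", "қасиеті"]:
    print(w, harmony_class(w))
```

Words that come out as "mixed", like қасиеті from the sample text, are the ones where harmony alone cannot recover merged letters.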

What's written <и> in Kazakh, though, is something I'm not really sure about... It's generally [ij~əj] or whatever before vowels, but I think it's sometimes also just [i​] in words that otherwise have back vowels (presumably all loanwords) and whatnot...? I honestly don't know what that's all about, but well.