Vocabulary creation for constructed language

If you're new to these arts, this is the place to ask "stupid" questions and get directions!
OTʜᴇB
roman
Posts: 935
Joined: Sat 14 May 2016, 10:59
Location: SW England

Post by OTʜᴇB » Sat 14 May 2016, 11:13

Hello.

I am trying to create a categorization of English words in order to create the corresponding word for my language. This doesn't involve manipulating an existing word at all but instead assigning letters or groups of letters to different categorizations which would then be strung together to form the corresponding word.

My original idea came from the philosophical language of John Wilkins, and I have been using the categorization from the Universal Language Dictionary. This works to an extent: I can categorize words and generate words from the categories very easily, but almost every word has up to 20 neighbours that differ from it in only 1 or 2 letters. For example, the word "wall" is currently "iʔisakaʔalilɛnit" and the word "window" is "iʔisakaʔalilinit".
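As a rough way to measure the problem described above, the number of differing letters between two equal-length words (the Hamming distance) can be counted directly. This is only a sketch: the function names `hamming` and `neighbours` are made up for illustration, and the lexicon here contains just the two words from the post.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def neighbours(word: str, lexicon: list[str], max_diff: int = 2) -> list[str]:
    """Return same-length words that differ from `word` in at most max_diff letters."""
    return [
        w for w in lexicon
        if w != word and len(w) == len(word) and hamming(w, word) <= max_diff
    ]


wall = "iʔisakaʔalilɛnit"
window = "iʔisakaʔalilinit"

print(hamming(wall, window))               # letters that differ between the pair
print(neighbours(wall, [wall, window]))    # words too close to "wall"
```

Running `neighbours` over a full generated lexicon would show which words have the most near-identical neighbours, assuming every word is stored as a plain string of single-character letters.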

I'm trying (and failing) to figure out how to create a categorization system such that it has the organization of my current setup, but without it boiling down to numbered entries. It would also ideally have a reasonable degree of diversity.

Has anybody got any ideas? I'm also trying to use methods such as deriving "father" and "mother" by adding different suffixes to the word "parent", so as to connect as many words as possible; I think the result would be very intricate and elegant.

Thanks for any help [:D]

NOTE: I'm posting this in the Beginner's section as I'm not quite sure where to put this. Please do tell me if it fits in best elsewhere and how it can be moved there. [:)]

What I have so far:
I have been thinking about 3 distinct methods:
The first is a categorization much like the scientific classification of animals, in which a few large categories are split into many smaller categories, which are in turn split into even smaller ones, and so on, all the way down to sub-sub-...-sub-categories containing fewer than 3 words.

The second method involves assigning certain letters or strings of letters to different properties and forming the words from these properties. The order of the strings could also depend on other properties, or the same string of letters could carry different meanings depending on its position in the word.

The final method is a hybrid of the two in which words are categorized but use the property method once at a low level or vice versa. This, I believe, will give the best results but would be the most challenging to devise effectively.
Last edited by OTʜᴇB on Sat 14 May 2016, 17:47, edited 1 time in total.
:con: : Dijo
:con: : Language 8 (Reviving Dijo)

BTW I use Arch
Creyeditor
mongolian
Posts: 3948
Joined: Tue 14 Aug 2012, 18:32

Post by Creyeditor » Sat 14 May 2016, 16:53

Yay, an engelang question [:)]

Okay, one way to solve the problem of the Neighbourhood Effect would be to impose a constraint on your lexicon that prohibits words that are too similar. For example, you could make words differ in at least three letters by signifying each category with three letters. Within a given position, every three-letter combination should then differ from every other combination in that position's set in all three letters.

So let's say we have three letters: a,b,c

That gives us the following possibilities:
aaa
aab
aac
aba
abb
abc
aca
acb
acc
baa
bab
bac
bba
bbb
bbc
bca
bcb
bcc
caa
cab
cac
cba
cbb
cbc
cca
ccb
ccc
But the constraint limits which sets of three-letter combinations are possible for a position.

Suppose we have a simplified version:
Let's say the first position has the following set:
bac - human
cba - animal
acb - inanimate
So we could have three sets in the second position. Each set has to consist of unique three-letter combinations:
humans:
aab - male
bca - female
cbc - other

animals:
bab - fish
abc - bird
ccb - beast

inanimate:
aba - plant
cac - tool
bcb - other
And so on and so forth. Examples include bacaab "man", bacbca "woman" and cbaabc "bird". If you have more letters, you can have more categories.
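The scheme above can be sketched in a few lines: build a word by concatenating the code for each position, and check that the codes within one set differ from each other in every letter. The dictionaries reproduce the tables from this post; the names `FIRST`, `SECOND`, `word` and `clashes` are illustrative only.

```python
from itertools import combinations

# Category codes copied from the tables in the post.
FIRST = {"human": "bac", "animal": "cba", "inanimate": "acb"}
SECOND = {
    "human": {"male": "aab", "female": "bca", "other": "cbc"},
    "animal": {"fish": "bab", "bird": "abc", "beast": "ccb"},
    "inanimate": {"plant": "aba", "tool": "cac", "other": "bcb"},
}


def word(top: str, sub: str) -> str:
    """Concatenate the first-position and second-position codes."""
    return FIRST[top] + SECOND[top][sub]


def clashes(codes) -> list[tuple[str, str]]:
    """Return pairs of codes that share a letter in at least one position."""
    return [
        (a, b)
        for a, b in combinations(list(codes), 2)
        if any(x == y for x, y in zip(a, b))
    ]


print(word("human", "male"))     # bacaab
print(word("animal", "bird"))    # cbaabc
for top, subs in SECOND.items():
    print(top, clashes(subs.values()))
```

Running the check on the tables as posted flags one pair: in the animal set, "bab" and "ccb" share their final letter, so one of them would need to change to satisfy the constraint strictly. Note also that with only three letters a set can hold at most three codes under this rule, since every code in the set needs a different first letter; that is why more letters allow more categories.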
Creyeditor
"Thoughts are free."
Produce, Analyze, Manipulate
1 :deu: 2 :eng: 3 :fra: 4 :esp: 4 :ind:
:con: Ook & Omlűt & Nautli languages & Sperenjas
[<3] Papuan languages, Morphophonology, Lexical Semantics [<3]
Lao Kou
korean
Posts: 5489
Joined: Sun 25 Nov 2012, 10:39
Location: 蘇州/苏州

Post by Lao Kou » Sat 14 May 2016, 17:05

My head hurts. [:$]
道可道,非常道
名可名,非常名
Lambuzhao
earth
Posts: 7162
Joined: Sun 13 May 2012, 01:57

Post by Lambuzhao » Sat 14 May 2016, 17:28

I would suggest the following litmus test:


Run a pair of words (mebbe make multiple copypastas of each) through a TTS program.

EG:
"iʔisakaʔalilɛnit" "iʔisakaʔalilinit"

record the TTS and play it back. Play with the speed, making it faster and faster, if the program u use can do that.

Can you tell which word is which at higher speeds?
No? :wat:

Chances are, your :con: people won't, either.

I, too, went thru the bloviosynthetic phase of conlanging.
I came up with some whoppers. Then, as I started actually using those sesquipedalian words, I started paraphrasing the megalodontian vocabulary. A lot.
Then I made up 1-2 syllable other words for the same terms, just because.
EG
antmotraozrfueni [smaller.carnivorous.furry.creature] was more tedious to write in my conscript than motraurfueni, or even myaufue [meowing.furry], all for 'cat'.

I think it's a noble effort to experiment this way, you just might bonk early in this linguistic triathlon, is all I'm saying.

Post by OTʜᴇB » Sat 14 May 2016, 17:52

Creyeditor, this is a very good idea and I will definitely use it. However, my problem lies less in choosing the strings of letters than in categorizing the words in the first place.

Post by OTʜᴇB » Sat 14 May 2016, 18:03

Lambuzhao wrote:I would suggest the following litmus test:


Run a pair of words (mebbe make multiple copypastas of each) through a TTS program.

EG:
"iʔisakaʔalilɛnit" "iʔisakaʔalilinit"

record the TTS and play it back. Play with the speed, making it faster and faster, if the program u use can do that.

Can you tell which word is which at higher speeds?
No? :wat:

Chances are, your :con: people won't, either.

I, too, went thru the bloviosynthetic phase of conlanging.
I came up with some whoppers. Then, as I started actually using those sesquipedalian words, I started paraphrasing the megalodontian vocabulary. A lot.
Then I made up 1-2 syllable other words for the same terms, just because.
EG
antmotraozrfueni [smaller.carnivorous.furry.creature] was more tedious to write in my conscript than motraurfueni, or even myaufue [meowing.furry], all for 'cat'.

I think it's a noble effort to experiment this way, you just might bonk early in this linguistic triathlon, is all I'm saying.
This is another good point. Maybe the shortened versions that emerge could be the "slang" or informal dialect of the language?

Post by Creyeditor » Sat 14 May 2016, 18:19

OTheB wrote:Creyeditor, this is a very good idea and I will definitely use this. However, the problem I'm having is less in choosing the strings of letters, but more in categorizing the words in the first place.
Oh, sorry, I might have misunderstood you; I thought you were using the ULD categorization successfully.
I also just noticed the second part of your post (What I have so far:...). If you use a combination, it would actually mirror the naming conventions for species in biology.

You could use a limited set of postnomen and multi-morphemic prenomen, where the prenomen is the (sub)-categorization and the postnomen would highlight certain properties. To use part of Wilkins's system: zitα is "dog", and you could add a postnomen to make it "wolf" or "fox", e.g.

zitαbig - big dog = wolf
zitαcil - small dog = fox

The prenomen would be unique, but the postnomen could be recycled.

Post by OTʜᴇB » Sat 14 May 2016, 18:26

Creyeditor wrote:
OTheB wrote:Creyeditor, this is a very good idea and I will definitely use this. However, the problem I'm having is less in choosing the strings of letters, but more in categorizing the words in the first place.
Oh, sorry, I might have misunderstood you; I thought you were using the ULD categorization successfully.
I also just noticed the second part of your post (What I have so far:...). If you use a combination, it would actually mirror the naming conventions for species in biology.

You could use a limited set of postnomen and multi-morphemic prenomen, where the prenomen is the (sub)-categorization and the postnomen would highlight certain properties. To use part of Wilkins's system: zitα is "dog", and you could add a postnomen to make it "wolf" or "fox", e.g.

zitαbig - big dog = wolf
zitαcil - small dog = fox

The prenomen would be unique, but the postnomen could be recycled.
This is similar to my idea to use suffixes to branch out to similar or related words. For instance, if I take "kalet" to mean "Parent" for the purpose of this example, then:
"kaletaʔi" is "Father" and
"kaletai" is "Mother" where "aʔi" is the imperative suffix and "ai" is the indicative suffix in my language.
I would then add the final suffix for the word types to create nouns and verbs:
"kaletaʔiit" is "a Father"
"kaletaʔia" is "to Father" and so on. Here "a" is the verb suffix and "it" is the noun suffix.

Post by Creyeditor » Sat 14 May 2016, 18:53

Yes, it was supposed to be similar. To make the words look more different, see my first post.

Post by OTʜᴇB » Sat 14 May 2016, 19:18

Ok, so now that I have a way of getting diversity within the words, I need a method of categorizing English words so that I can derive a new word from each. For instance, I might have a bit of categorization like this:
>Material
>>Solid
>>>Metal
>>>>Copper
>>>Wood
>>>Plastic
>>Liquid
>>Gas
The general request on my post is for help designing this hierarchy in order to categorize each word.
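One possible way to hold such a hierarchy is a nested dictionary, from which each word's position in the tree can be read off as a tuple of sibling indices; those indices could then be turned into letters. This is only a sketch under that assumption: `TREE` copies the example above, and `paths` and `indices` are hypothetical helper names.

```python
# The example hierarchy from the post, as nested dictionaries.
TREE = {
    "Material": {
        "Solid": {"Metal": {"Copper": {}}, "Wood": {}, "Plastic": {}},
        "Liquid": {},
        "Gas": {},
    }
}


def paths(tree: dict, prefix: tuple = ()):
    """Yield the full category path of every node, parents before children."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from paths(children, path)


def indices(tree: dict, target: str, prefix: tuple = ()):
    """Return the sibling index at each level leading to target, or None."""
    for i, (name, children) in enumerate(tree.items()):
        here = prefix + (i,)
        if name == target:
            return here
        found = indices(children, target, here)
        if found is not None:
            return found
    return None


for p in paths(TREE):
    print(" > ".join(p))
print(indices(TREE, "Copper"))
print(indices(TREE, "Wood"))
```

Because the index tuple depends on the order of siblings, reordering or inserting categories later changes the derived words, which is worth keeping in mind when the vocabulary grows.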

Post by OTʜᴇB » Sat 14 May 2016, 22:01

Ok, I think I have a solution! :idea:

I will take the ULD categorization as I was using previously, but use Creyeditor's method for "numbering" them in a way.

My system will work as follows:
I have 10 "areas", each with a base string of letters.
Within each area is a selection of the 40 chapters in the ULD. These add some more letters to their area's string.
I am using a base 9 numbering system as there are 9 letters in my alphabet that can be used for the numbering system.
I will take the number of each word group within its ULD chapter and append the corresponding numbered string.
I will then append another numbered string based on the word's entry number within the group, but counted in the opposite order.
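The base-9 "numbered string" step can be sketched as an ordinary positional conversion with letters standing in for digits. The nine letters in `DIGITS` are placeholders chosen for illustration, since the post does not say which letters are used, and "counting in the opposite order" is not modelled here because its exact meaning is not specified.

```python
# Placeholder digit letters for values 0..8 (assumption, not the author's alphabet).
DIGITS = "ptkslmnxh"


def to_base9(n: int, digits: str = DIGITS) -> str:
    """Write a non-negative integer in base len(digits), using letters as digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    base = len(digits)
    if n == 0:
        return digits[0]
    out = []
    while n:
        n, remainder = divmod(n, base)
        out.append(digits[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(out))


for number in (0, 8, 10, 47):
    print(number, to_base9(number))
```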

This should create sufficient diversity in the language.

The only issue with this and nearly all similar methods is that as soon as I want to start properly expanding the vocabulary, I need to start thinking about how to order the words in each group, and how to order the groups too. Using the ULD numbers should work for the time being, but I may have to rework my vocabulary at some point in the future.

Using this system, I can now refer to my language as: "ɛtilanaiʎaxait" which is "language" with this system.

Now to check the wall and window problem. The results of this are:
"wall": "sakalaʎɛnk"
"window": "sakalaʎɛni"

Again, this only differs in one letter, but there is hope yet. What if I were to choose the numbers based on other parts? I could use the following rules:
  • Add chapter number to first group number.
  • First group number mod 48.
  • Subtract first group number from second group number.
  • Second group number mod 48.
  • Place vowel filler of number (first group number - second group number) mod 8
This essentially means that each word affects itself based on how it is categorized. The wall and window problem then yields this result:
"wall": "sakalɛxiʎale"
"window": "sakalɛxaʎɛla"
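The five rules above can be read in more than one way, so the following is only one interpretation: each step uses the result of the previous step, "first group number" is the group's number within the chapter, and "second group number" is the entry's number within the group. The function name `mix_numbers` and the sample inputs are made up for illustration.

```python
def mix_numbers(chapter: int, first: int, second: int) -> tuple[int, int, int]:
    """Apply the mixing rules in order and return (first, second, filler)."""
    first = (first + chapter) % 48    # add chapter number, then mod 48
    second = (second - first) % 48    # subtract the updated first number, then mod 48
    filler = (first - second) % 8     # index of the vowel filler
    return first, second, filler


# Two adjacent entries in the same hypothetical chapter and group.
print(mix_numbers(3, 10, 5))
print(mix_numbers(3, 10, 6))
```

Under this reading, two neighbouring entries end up with a different second number and a different filler, so they differ in more than one part of the word. Python's `%` always returns a non-negative result for a positive modulus, which keeps the subtraction steps in range.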

NOW we have difference!!! [:D]

Thanks for all the help, even if I did come to my own conclusion in the end. Creyeditor, thanks for the letter combination idea as that's what made the whole system work as well as it does.

With this system, my language is called: "ɛtaiʎalikiʔiʎaiʔit"

It just needs some adjustments on the category letters to trim down the word lengths and I have pretty much got my first conlang finished.
GrandPiano
mayan
Posts: 2249
Joined: Sun 11 Jan 2015, 23:22
Location: Ohio, USA

Post by GrandPiano » Sun 15 May 2016, 03:45

OTheB wrote:For instance, if I take "kalet" to mean "Parent" for the purpose of this example, then:
"kaletaʔi" is "Father" and
"kaletai" is "Mother" where "aʔi" is the imperative suffix and "ai" is the indicative suffix in my language.
Why do you use the imperative and indicative suffixes for "father" and "mother"?
:eng: - Native
:chn: - B2
:esp: - A2
:jpn: - A2

Post by OTʜᴇB » Sun 15 May 2016, 11:56

GrandPiano wrote:
OTheB wrote:For instance, if I take "kalet" to mean "Parent" for the purpose of this example, then:
"kaletaʔi" is "Father" and
"kaletai" is "Mother" where "aʔi" is the imperative suffix and "ai" is the indicative suffix in my language.
Why do you use the imperative and indicative suffixes for "father" and "mother"?
Personal preference. It fits, as the stereotypical father is strong whereas the stereotypical mother is gentle and caring. Then I would add the "augment" suffix on the end of those to create "grandfather" and "grandmother" respectively.

I should probably refer to them differently but the way I am using them is as follows:
"wall red-augment": "Wall is very red"
"wall red-diminish": "Wall is not very red"
"wall red-imperative": "Wall is definitely red"
"wall red-indicative": "Wall is likely red"

Or with verbs:
"eat-imperative": "must eat"
"eat-indicative": "should eat"

Post by GrandPiano » Sun 15 May 2016, 16:02

Maybe you're using it differently, but the indicative is usually just a statement of facts, so "wall red-indicative" would just be "wall is red", and "eat-indicative" would just be "eat" or "eats".

Post by OTʜᴇB » Sun 15 May 2016, 20:55

Then yes, I should use a different term, because "eat" would be the indicative. I'm also using it to say "May you eat?" as "eat-2ndPresentSimple-indicative-question", as opposed to "You must eat", which would be "eat-2ndPresentSimple-imperative".