CALS vs WALS: A Comparison

PTSnoop
cuneiform
cuneiform
Posts: 178
Joined: 01 May 2013 23:07

CALS vs WALS: A Comparison

Post by PTSnoop » 28 Apr 2015 20:18

Hi, I'm PTSnoop. You may vaguely remember me from the previous CALS vs WALS thread almost two years ago.

I originally set out to do a statistical comparison of the conlangs on CALS and the natlangs on WALS, to see what intriguing features of natural languages we conlangers tend to pay more/less attention to. Then I found out that the CALS numbers I was using had natlangs mixed in with them, realised I'd have to redo all my graphs, ran out of round tuits, got thoroughly distracted by other things, then some time later noticed I had a whole bunch of Conlangery episodes sitting in my RSS feed reader and got thoroughly distracted back into the world of conlangs once again.

So here's the much-delayed more-accurate version of the CALS vs WALS study, complete with the two extra years of CALS and WALS data. This time round, I'm armed with Python webscraping and matplotlib, rather than just copypaste and LibreOffice Calc - so if I find out I'll have to redo all my numbers again, then it should be a much quicker process!

In the graphs below, green represents CALS/conlangs, blue represents WALS/natlangs. The darker-coloured bars represent the difference between the two, showing what percentage of conlangs "should" have this feature and don't, or "shouldn't" have this feature and do (as it were).
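As a rough illustration of how bars like these can be computed (a sketch under assumptions, not the script actually used for these graphs): turn the raw counts for each database into percentages, then subtract one from the other. The counts below are invented, and `percentages` and `differences` are hypothetical helper names.

```python
def percentages(counts):
    """Convert raw language counts per feature value into percentages."""
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


def differences(conlang_counts, natlang_counts):
    """Percentage-point gap (conlang minus natlang) for every feature value."""
    con = percentages(conlang_counts)
    nat = percentages(natlang_counts)
    values = sorted(set(con) | set(nat))
    return {v: con.get(v, 0.0) - nat.get(v, 0.0) for v in values}


# Hypothetical counts, invented for illustration only.
cals = {"no tones": 150, "simple": 30, "complex": 20}
wals = {"no tones": 300, "simple": 130, "complex": 70}

for value, gap in differences(cals, wals).items():
    print(f"{value}: {gap:+.1f} points")
```

A positive gap means the value is more common among conlangs than among natlangs; a negative gap means the opposite. The gaps for one feature always sum to zero, since both sets of percentages add up to 100.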

As before, I'll be splitting things up into multiple posts.

---

PART 1: PHONOLOGY AND MORPHOLOGY

Phonology

Image

Starting off with a surprise - compared to the numbers from two years ago, the phoneme inventory sizes for conlangs and natlangs match up really well.

In general, as I've been looking through these graphs, it seems like phonology's matched up pretty well, morphosyntax reasonably well, and the big statistical differences only really show up on the clause or sentence level questions. I've generally concentrated here on the graphs showing big differences between conlangs and natlangs - if I've missed out a graph, then that's likely to be because conlangs and natlangs have this feature in pretty similar distributions.

Image

Slight tendency here towards large inventories. Possibly the larger more-interesting Indo-European vowel systems are pulling things off balance here.

Image

I'd be tempted to describe this one as a tendency towards being consistent - if you've got voiced plosives, then you'll be voicing the fricatives too. But that wouldn't explain the lack of no-voicing-distinction languages... Maybe people just like voicing.

Image

Here we see more conlangers going for the more "interesting" vowel systems.

Image

Last time, I remember being surprised that the tonal-conlang and tonal-natlang values weren't further apart. Since then, it looks like the values have got even closer - apparently there are a lot more tonal conlangs than I'm aware of. We could still do with some more, though!

Image

Conlangers seem to favour irregular stress patterns here, with "none", "both" and "don't know" all favoured at the expense of the humble trochee. Iambs seem to be holding their own, though.

Image

Unsurprisingly, the big winner here is the conspicuously-English dental fricative /θ/, turning up in over 30% of conlangs surveyed but only 7% of natlangs.

Morphology

Image

Not what I was expecting here. I'd have thought conlangs would generally be straightforwardly concatenative, but apparently here we see the opposite - more isolation and ablaut making things more interesting.

Image

In general, conlangs have fewer categories per verb than natlangs - until we hit the extreme heights of 12-13 per word, where the kitchen-sink conlangs outnumber the natlangs. It seems conlangs are generally tending towards the extremes here.

Image

Here, we see conlangs being neater and more regular than natlangs. We've got an excess of normal straightforward dependent-marking languages, and a lack of "inconsistent or other".

Image

One of the clearest trends so far - conlangs don't really do reduplication anywhere near as much as natlangs do.

Image

Two things to take away from this graph. First, that conlangers like explicit case-marking - possibly trying to be non-English again. Secondly, even if we look only at the languages that do have cases, we can see that the conlangs are weighted much more towards "no syncretism" than the natlangs.
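The "look only at the languages that do have cases" step amounts to dropping one category and renormalising the rest. A minimal sketch, assuming invented counts and a hypothetical `renormalise_without` helper (not taken from the original script):

```python
def renormalise_without(counts, excluded):
    """Percentages over the remaining values after dropping one category."""
    kept = {value: n for value, n in counts.items() if value != excluded}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no languages left after exclusion")
    return {value: 100.0 * n / total for value, n in kept.items()}


# Invented counts for a case-syncretism style feature.
counts = {
    "no case marking": 50,
    "no syncretism": 30,
    "core cases only": 10,
    "core and non-core": 10,
}

print(renormalise_without(counts, "no case marking"))
```

Running the same function on the conlang and natlang counts separately gives two distributions that can be compared without the "no case marking" group skewing either of them.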

---

Coming soon: Nominal Categories and Nominal Syntax.

PTSnoop
cuneiform
cuneiform
Posts: 178
Joined: 01 May 2013 23:07

Re: CALS vs WALS: A Comparison

Post by PTSnoop » 02 May 2015 14:34

PART 2: NOUNS

Nominal Categories

Image

Conlangs are distinctly lacking associative plurals. Apparently I'm not the only person who doesn't really understand what they are...

Image

Here, we see conlangs being generally more interesting than natlangs. We're lacking some straightforward interrogative-based indefinite pronouns, in favour of our own special roots or custom systems.

Image

And similarly for reflexives, with (perhaps) more conlangers preferring new words for new systems instead of already-existing ones.

Image

Here we see conlangs strongly favouring regular ordinal systems ("one, two, three" and "oneth, twoth, threeth"), while natlangs prefer "inconsistent" mixed systems.

Image

Here's that no-reduplication conlang tendency again.

Image

Similarly to the reflexive pronouns, here we're seeing conlangers being more likely to create new words than use preexisting similar ones.

Image

Possibly conlangers prefer not to use affixes to mark possession. Or possibly we're just seeing a bias in the natlang sample data here - the WALS chapter notes "that languages of this sort are proportionally underrepresented on the map; they are much more common than their frequency on the map might suggest".

Nominal Syntax

Image

It seems that conlangers don't like multiple possessive classes. Possibly we're trying to get away from English here, with its "of the" vs "'s".

Image

A lot more natlangs than conlangs are perfectly okay with unmarked bare adjectives. Also, when we do have nouned adjectives, we'd rather have it marked on the adjective itself than have separate dummy-nouns.

Image

And, once again, we see conlangers making new words rather than reusing old ones.

Coming soon: Verbs and Word Order.

Aszev
admin
admin
Posts: 1521
Joined: 11 May 2010 04:46
Location: Upp.
Contact:

Re: CALS vs WALS: A Comparison

Post by Aszev » 03 May 2015 23:55

Thank you for making this thread, it's quite interesting to get this perspective, especially in such an accessible format! [:)]
Sound change works in mysterious ways.

Image CE

kilenc
cuneiform
cuneiform
Posts: 142
Joined: 27 Mar 2015 23:14
Location: louisville
Contact:

Re: CALS vs WALS: A Comparison

Post by kilenc » 04 May 2015 00:15

Aszev wrote:Thank you for making this thread, it's quite interesting to get this perspective, especially in such an accessible format! [:)]
i second this, really amazing and helps a lot with avoiding a relex by making me think about grammar more deeply!
eventually ill work out a good conlang :)

k1234567890y
runic
runic
Posts: 3014
Joined: 04 Jan 2014 04:47
Contact:

Re: CALS vs WALS: A Comparison

Post by k1234567890y » 04 May 2015 14:51

Besides trying to make a more "logical" and "ideal" language (which probably explains why conlangers tend to make more words; another possible reason is that they simply didn't notice the nature of certain words), much of the biased use of features is probably because conlangers tend to use features from the languages of Eurasia, especially European languages, and the languages of Eurasia tend to be dependent-marking and to use cases.

Also, it seems that it is unlikely for a person who is neither a linguist nor an anthropologist nor a person belonging to certain ethnic communities to know a non-mainstream language, which could be a reason for the biased use of certain features
...

Squall
greek
greek
Posts: 583
Joined: 28 Nov 2013 14:47

Re: CALS vs WALS: A Comparison

Post by Squall » 04 May 2015 22:05

How does WALS count languages?
It needs criteria to distinguish language and dialect.
Some facts may make some unusual features appear common. That includes proto-languages that were successfully spread to a large area, and sprachbunds that have lots of languages.

Another database is Phoible. It tells how common a phoneme is. But it does not show the coexistence of multiple phonemes and how frequent or limited a phoneme is in a language.

I am not a linguist, but it seems that conlangers are more familiar with English or European languages.

The IPA charts inspire conlangers to include more phonemes in their conlangs. That explains the large number of conlangs that have front rounded vowels.
Most conlangers think that tones are weird or do not know how tones work; for that reason, tones are not popular.
The th sounds are popular because of the English language.

The chart of size of vowel inventories is very generic. It would be better if it separated each quantity instead of grouping them into intervals. The interval between 7 and 14 has lots of differences.
English is not my native language. Sorry for any mistakes or lack of knowledge when I discuss this language.
:bra: :mrgreen: | :uk: [:D] | :esp: [:)] | :epo: [:|] | :lat: [:S] | :jpn: [:'(]

PTSnoop
cuneiform
cuneiform
Posts: 178
Joined: 01 May 2013 23:07

Re: CALS vs WALS: A Comparison

Post by PTSnoop » 07 May 2015 13:53

Thanks for the kind words, guys!

k1234567890y: That's what I'd have expected - and yes, there are plenty of cases explained by "people write what they're used to". But we've got some other cases that - to my mind - are better explained by the opposite, "people try to write what they're not used to", even when the European / English way is actually more widely common (number of cases, for example). Pesky truth, resisting simplicity again.

Squall: yeah, more detailed conlang-natlang phonology comparison would be nice (and I had a go at something similar, back in the day - I expect it's dropped off the internet by now). Something for me to do once I've finished going through the WALS data, I guess!

---

PART 3: VERBS

Verbal Categories

Image

Quite a marked difference here. The majority of natlangs are generally fine with having no grammatical perfective/imperfective distinction, but the majority of conlangs have the distinction marked.

Image

Similarly, lots more natlangs than conlangs don't bother with the past/nonpast distinction.

Image

Natlangs are more likely than not to have some sort of special system for prohibitives. Conlangs prefer to use the same grammatical systems they already have, and keep things regular.

Image

Would that fewer conlangs had an optative!

Image

For "can" and "may", apparently conlangers are less likely to use verbal constructions. Maybe we're trying to get away from English again.

Image

Not entirely surprising here - we're seeing a split between "nothing marked" and "everything marked", with "only indirect evidentials marked" taking a back seat.

Word Order

Image

Most natlangs have Genitive->Noun; most conlangs have Noun->Genitive.

Image

But for adjectives, it's the reverse - natlangs are more likely to be Noun->Adjective, while conlangs are more undecided, with a roughly 50-50 split.

Image

Conlangs are less likely to have demonstratives after the noun, though they seem undecided on what they'd prefer instead.

Image

Similar to adjectives - conlangers would rather put them before the noun, natlangs would rather put them after.

Image

Most natlangs either have question particles at the end of the clause, or just don't have them at all. But conlangers prefer a range of more interesting options - or maybe we just like to know whether we're hearing a question before we get to the end of it.

Image

Natlangs prefer to give people more freedom in where they put their interrogative phrases - but conlangs are more likely to restrict things to the start of the clause, either all of the time or some of the time.

Coming soon: Clauses.

eldin raigmore
korean
korean
Posts: 6381
Joined: 14 Aug 2010 18:38
Location: SouthEast Michigan

Re: CALS vs WALS: A Comparison

Post by eldin raigmore » 08 May 2015 14:09

This is a great thread! Or, at any rate, I'm very interested. Thanks! I don't want to forget to finish reading it.

PTSnoop
cuneiform
cuneiform
Posts: 178
Joined: 01 May 2013 23:07

Re: CALS vs WALS: A Comparison

Post by PTSnoop » 14 May 2015 12:10

PART 4: CLAUSES

Image

The case-marking alignment graphs look similar for noun-phrases and pronouns - strong trend away from "neutral" (no case marking), with the excess split between nom-acc and active systems.

Image

Conlangs here are more likely than natlangs to have subject-position pronouns, either obligatory or optional. For natlangs, subject-marking on the verb is much more common.

Image

Here we see the opposite trend to some of the earlier graphs - natlangs prefer the more complicated polypersonal agreement, while conlangs are perfectly happy to go for the simple boring no-marking option.

Image

And while that tendency towards no-person-marking throws this next graph a bit, there's still another clear trend here - conlangs with person marking are more likely to choose the slightly-off-the-wall "some third person singular verb-markings are zero while others aren't". Why this is the case, I'm not sure.

Image

Conlangs are more likely than not to have a passive voice. Natlangs, on the other hand, are more likely not to have one.

Image

Here, again, we see conlangs tending towards either "none of this feature" or "all of this feature", whereas more natlangs have a more moderate approach.

Image

Conlangs are more likely to give negative clauses the same structure as indicative clauses, eschewing the asymmetrical systems - and even more eschewing the weirder inconsistent systems - that lots of natlangs have.

Image

Natlangs are overwhelmingly more likely to have their negative indefinite pronouns require the clauses to be negated too. Conlangs, on the other hand, shy away from these "double negatives". I blame the Victorian grammarians.

Image

Natlangs here have a whole bunch of different ways to say "he has the book". Conlangs, though, tend to avoid using topic markers or conjunctions for this kind of phrase, preferring a separate verb "has" or some sort of genitive construction.

Image

Conlangers like European-style comparative particles ("than"), at the expense of locational markers or conjoined noun phrases.

Coming soon: Sentences.

k1234567890y
runic
runic
Posts: 3014
Joined: 04 Jan 2014 04:47
Contact:

Re: CALS vs WALS: A Comparison

Post by k1234567890y » 14 May 2015 14:38

One more thing to say: some people may not really have read what the corresponding WALS article says when they try to answer the question about a feature for their conlangs in CALS.
...

clawgrip
MVP
MVP
Posts: 2406
Joined: 24 Jun 2012 06:33
Location: Tokyo

Re: CALS vs WALS: A Comparison

Post by clawgrip » 14 May 2015 14:52

You mean there is a higher likelihood of people misrepresenting their conlangs' grammar?

k1234567890y
runic
runic
Posts: 3014
Joined: 04 Jan 2014 04:47
Contact:

Re: CALS vs WALS: A Comparison

Post by k1234567890y » 14 May 2015 15:09

clawgrip wrote:You mean there is a higher likelihood of people misrepresenting their conlangs' grammar?
Such a chance does exist, and school grammar may sometimes be misleading in my opinion, but I don't know the likelihood, and it is possible that most conlangers interpret the choices offered by WALS/CALS accurately and I am wrong.
...

Prinsessa
runic
runic
Posts: 3226
Joined: 07 Nov 2011 14:42

Re: CALS vs WALS: A Comparison

Post by Prinsessa » 14 May 2015 15:16

I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.

eldin raigmore
korean
korean
Posts: 6381
Joined: 14 Aug 2010 18:38
Location: SouthEast Michigan

Re: CALS vs WALS: A Comparison

Post by eldin raigmore » 15 May 2015 15:19

Prinsessa wrote:I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
Sorry, but:
(1) what does "sweaty" mean in this context?
(2) risks of what are likely to be low?

Prinsessa
runic
runic
Posts: 3226
Joined: 07 Nov 2011 14:42

Re: CALS vs WALS: A Comparison

Post by Prinsessa » 15 May 2015 15:41

eldin raigmore wrote:
Prinsessa wrote:I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
Sorry, but:
(1) what does "sweaty" mean in this context?
Nerdy! I.e. knowledgeable. Seasoned conlanger. WALS isn't interesting to a lot of language lovers and CALS probably isn't interesting to a lot of conlangers except those who would use it themselves. Is my theory. I might be wrong!
eldin raigmore wrote:(2) risks of what are likely to be low?
Of people misunderstanding, which was discussed right above.
Last edited by Prinsessa on 16 May 2015 08:17, edited 1 time in total.

Imralu
roman
roman
Posts: 905
Joined: 17 Nov 2013 22:32

Re: CALS vs WALS: A Comparison

Post by Imralu » 15 May 2015 19:11

Brilliant! You've put a lot of work into this and it's appreciated.

I'm going through these and answering for my conlang Ngolu. I want to see if I'm Doing My Part for conlangers so I'm going to count how many features are overrepresented versus underrepresented. Not that it matters in any case as I like my lang how it is, but I've had to read a lot of the WALS descriptions to get a lot of it, so it's been a learning experience. Most of it is probably only of interest to me, so I'll hide most of it. There's a couple I have questions on though.

Ngolu:
Spoiler:
Consonant Inventory
17, moderately small - slightly underrepresented

Vowel Quality Inventory
5, average - underrepresented

Voicing In Plosives And Fricatives
In both plosives and fricatives - overrepresented
(I love voiced fricatives.)

Front Rounded Vowels
None - underrepresented

Tone
Simple tone system - underrepresented

Ngolu has a tonal accent, which apparently counts, but it feels like cheating ... especially since tone is almost entirely regular, high on final syllables, low everywhere else.

Rhythm Types
No rhythmic stress - overrepresented

Presence Of Uncommon Consonants
None - underrepresented

Fusion of Selected Inflectional Formatives
Isolating/Concatenative - overrepresented

TAM markers are isolating. Case markers are inflections.

Inflectional Synthesis Of The Verb
0 categories per word - overrepresented

Verbals are isolating.

Locus of Marking: Whole-language Typology
Dependent marking - overrepresented

Mostly isolating but with fourteen cases on the nominals (pronouns/articles).

Reduplication
Full reduplication only - neutral

I may change this and add partial reduplication in something a bit like the "sinner schminner" kind of thing. I haven't decided on the consonant or the meaning yet though. Reduplication occurs prominently in ideophones (such as ualauala 'wobbly'/'precarious') although, as uala on its own means nothing, I'm not sure that's really reduplication.

As phonologically separate words, reduplication adds the meaning "exemplary", such as mala mala "a house that is exactly what a house should be" (kind of like "a house among houses" but literally just "house house"), tta tta "enormous" ("big big"). It can also be used in a kind of ideophonic way like ttio ttio "hit and hit", but that can be a bit ambiguous as it could also mean "hit really hard".

Case Syncretism
No syncretism - overrepresented

The Associative Plural
Unique periphrastic associative plural - very underrepresented

Cool. I didn't realise I had one until I read about it.

ja Hana imu
NOM.3s.ACS.DEF Hana NOM.3p.ICS.SPEC
Hana and others

Indefinite pronouns
Generic noun based - neutral

This was hard to classify. French is classified as generic noun based. Ngolu is essentially the same except that the indefinite determiners indicate animacy (as well as specificity or non-specificity) and thus make the generic nouns almost entirely redundant and therefore usually omitted.

u (golu)
NOM.3s.ANIM.NSPC (person)
"someone / anyone / a (non-specific) person"

a (tiaka)
NOM.3s.INAN.NSPC (thing)
"something / anything / a (non-specific) thing"

Cf.

u mala
NOM.3s.INAN.NSPC house
"a (non-specific) house"

But, you could also see the words like a and u as special forms I suppose. Pronouns are articles are pronouns though, so I figured generic noun based is the best.

Intensifiers And Reflexive Pronouns
Differentiated - overrepresented

(Until I saw this thing in WALS a long time ago, I assumed conflating them would have been too English, like conflating savoir and connaître.)

Ordinal Numerals
First, two-th, three-th - very underrepresented

I had no idea this was the most common way in natlangs.

Distributive Numerals
Marked by prefix - slightly underrepresented.

ahu = one
kahu = one each

Conjunctions And Universal Quantifiers
Formally different - overrepresented

It took me a while to get my head around this. Turkish is listed as "formally different", but to say "whoever", you use "who" in an if-clause, so I think this should count.

Position Of Pronominal Possessive Affixes
No possessive affixes - overrepresented (with caveat - underrepresented in WALS)

Possessive Classification
No possessive classification - overrepresented

Ngolu distinguishes simple genitive (associative) from possessive (legal ownership) and also uses locative to show "on one's person" possession. It's entirely a semantic distinction, however, and not classification. Tani guni means "my sibling" in the normal sense and tani gunu means "my sibling" in the sense of a person who is somebody's sibling but my possession (ie. slave). It's not something you'd say often, but it illustrates that all combinations are grammatically possible.

Adjectives Without Nouns
Without marking - underrepresented

There's no marking because adjectives are not a word class. They're all verbals. Unless you want to say that the closed class of nominals (basically pronouns/articles) are the only nouns, in which case, everything other than this closed class of nouns is marked by a preceding word.

ju muja = the man
ju jagi = the tall one
ju = the animate being / he / she

Noun Phrase Conjunction
"and" different from "with" - overrepresented

"And" is completely absent from noun-phrase conjunction. Only juxtaposition is used. Two arguments in the same case, whether adjacent or located separately in the sentence, are interpreted with an invisible "and".

Perfective/Imperfective Aspect
No grammatical marking - underrepresented

The Past Tense
No past tense - underrepresented

The Prohibitive
Special imperative + special negative - underrepresented

It's essentially just a contraction of e kka to kke. The word order has changed though. Saying kkas e (without contraction and with an epenthetic "s") means "don't have to".

The Optative
Inflectional optative absent - underrepresented

One common way of making a periphrastic optative is to use the vocative form of the complementiser: ezuo, literally kind of like "oh that ..." Not relevant, just fun.

Situational Possibility
Verbal constructions - underrepresented

Semantic Distinctions of Evidentiality
No grammatical evidentials - overrepresented

It wouldn't really fit with the rest of Ngolu. There are simple ways to do this, but none of them obligatory.

Order Of Genitive And Noun
Noun-genitive - overrepresented

Consistently head initial so ...

Order Of Adjective And Noun
Noun-adjective - underrepresented

Head initial ...

Order Of Demonstrative And Noun
Demonstrative-noun - overrepresented

Demonstratives generally act as heads in Ngolu. Otherwise they'd often have to come between things or have ambiguous reference.

Order Of Numeral And Noun
Numeral-noun - overrepresented

Numerals also act as heads.

Position Of Polar Question Particle
Initial - overrepresented

Head ... Basically all of these questions just come down to head-dependent order for me.

The question particle can occur at the end of a sentence though but this is as an afterthought and thus indicates a tag question.

Position Of Interrogative Phrases In Content Questions
Not obligatorily initial - underrepresented

In situ. For pragmatic reasons, they may gravitate towards the end of the sentence.

Alignment Of Case Marking Of Pronouns
Nominative-accusative - overrepresented

Morphosyntactic alignment is a funny question for Ngolu. It depends on the semantics of the verb and some verbs have unusual argument structures.

Expression of Pronominal Subjects
Optional pronoun in subject position - overrepresented

All arguments are optional in Ngolu.

Verbal Person Marking + Third Person Zero Of Verbal Person Marking
No person marking - Overrepresented

Passive Constructions
Present - Overrepresented

I guess it's present. Passive-like meanings are usually expressed simply by omitting the nominative argument. The passive-ish construction is mostly used within arguments to form the equivalent of "-ee" as in "interviewee".

ttio ju
hit NOM.3s.ICS.DEF
He hits. (Verbs of violence are marked if the subject is a human other than an initiated man, so only "he" can be interpreted here.)

ttio ji
hit ACC.3s.ICS.DEF
S/He gets hit.

Rarely:
he ttio ju
undergo hit NOM.3s.ICS.DEF
S/He gets hit.

ju ttio
NOM.3s.ICS.DEF hit
the hitter, he who hits

ju he ttio
NOM.3s.ICS.DEF undergo hit
the hittee, s/he who is hit

Periphrastic Causative Constructions
Both (???) - Overrepresented

I think? I read the description on WALS over and over and couldn't quite get my head around it. Is the difference simply whether there is some marker of purpose associated with the effect clause? Also, they say it has to be bi-clausal and then they use examples like (7) which is pretty clearly one clause.

I think (1) represents sequential and (2) represents purposive, yeah?

(1)
Kue ju jo hu nu.
cause he.NOM that.ACC go I.NOM
"He cause(s/d) me to go."

(2)
Go ju kuajo hu nu.
do.something he.NOM that.BEN go I.NOM
"He act(s/ed) so that I (would) go." (The benefactive is also used for purposes.)

The two sentences differ semantically. In (1), you know that I went. In (2) you only know that that was his aim, whether he was successful or not is not mentioned. And if I'm correct in saying (1) is sequential and (2) is purposive, why is English given as having only sequential???
Spoiler:
Symmetric And Asymmetric Standard Negation
Symmetric - very overrepresented
Negative Indefinite Pronouns And Predicate Negation
Predicate negation also present - underrepresented
Natlangs are overwhelmingly more likely to have their negative indefinite pronouns require the clauses to be negated too. Conlangs, on the other hand, shy away from these "double negatives". I blame the Victorian grammarians.
Ngolu, like many of the languages mentioned, doesn't have specifically negative indefinite pronouns, so there is clause negation without there being a double negative.

Kka kaus u.
not eat NOM.3s.ANIM.NSPC
"Nobody ate." (Literally kind of like "Anybody didn't eat." The -s u can also be dropped.)

Kka kau mu.
not eat NOM.3s.ANIM.SPEC
"Somebody didn't eat." (Literally: "A specific-person didn't eat." With sufficient context, the mu can be dropped.)
Spoiler:
Predicative Possession
Genitive - overrepresented

There are two kinds of genitive and also locatives to show possession, and then there's also an ornative derivational prefix as well as two "have" verbs equivalent to the two genitive cases ("have" and "own") and also a "be the location of" verb equivalent to the locative case. To say, "I have a dog" it could be oko uni, oko unu, okos ana, osoko nu, ios oko nu, nies oko nu, lu oko nu (I have a dog, I own a dog, I have a dog with me, I am bedogged, I have a dog, I own a dog, I have a dog with me.) depending on exactly what you're trying to say. Syntactically, I guess Ngolu is kitchen sink as fuck.

Comparative Constructions
Conjoined - underrepresented

Not like "this house is big, that house is small" but like "this house is more big, less that".

Relativisation On Subjects
Gap - underrepresented

Variety of possibilities but gap most common for subjects.

Relativisation On Objects
Pronoun-retention - slightly overrepresented

Purpose Clauses
Balanced - overrepresented

Hand And Arm
Different - overrepresented

Numeral Bases
Decimal - underrepresented

Number Of Non-Derived Basic Colour Categories
Five - underrepresented

Number Of Basic Colour Categories
Six - underrepresented

Green And Blue
Green/Blue - underrepresented

Tea
Other - overrepresented.
Of course. It's off-world.

Para-Linguistic Usages Of Clicks
Logical - underrepresented
Result - slightly more overrepresented than underrepresented. I'm a conlanger.
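
A tally like this can be done mechanically. A small sketch, using only a handful of the verdicts listed above (not the full list) and a hypothetical `classify` helper that lumps "slightly" and "very" in with the plain verdicts:

```python
from collections import Counter


def classify(verdict):
    """Reduce a verdict such as 'very underrepresented' to over/under/neutral."""
    text = verdict.lower()
    if "overrepresented" in text:
        return "over"
    if "underrepresented" in text:
        return "under"
    return "neutral"


# A partial sample of the verdicts above, not the full list.
verdicts = {
    "Consonant Inventory": "slightly underrepresented",
    "Vowel Quality Inventory": "underrepresented",
    "Rhythm Types": "overrepresented",
    "Reduplication": "neutral",
    "Case Syncretism": "overrepresented",
    "The Associative Plural": "very underrepresented",
}

tally = Counter(classify(v) for v in verdicts.values())
print(tally)
```

Extending `verdicts` to every feature in the list would give the full over/under balance for the language.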
___________________________________________
In Intensifiers And Reflexive Pronouns, why the hell does Swedish get counted as identical but German not?!?!?!

Jag gjorde det själv.
Ich habe es selbst gemacht.
I did it myself.

Jag hatar mig (själv).
Ich hasse mich (selbst).
I hate myself.

It is much more common to use själv in Swedish than selbst in German, and maybe it's even becoming obligatory in normal Swedish, but even then, you don't say Jag gjorde det *mig själv, so the reflexives mig/dig/sig/oss/er själv(a) are still different from the intensifier själv(a).

EDIT: Added more.
Last edited by Imralu on 21 May 2015 14:16, edited 1 time in total.
Glossing Abbreviations: COMP = comparative, C = complementiser, ACS / ICS = accessible / inaccessible, GDV = gerundive, SPEC / NSPC = specific / non-specific, AG = agent, E = entity (person, animal, thing)
________
MY MUSIC

eldin raigmore
korean
korean
Posts: 6381
Joined: 14 Aug 2010 18:38
Location: SouthEast Michigan

Re: CALS vs WALS: A Comparison

Post by eldin raigmore » 16 May 2015 06:52

Prinsessa wrote:
eldin raigmore wrote:
Prinsessa wrote:I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
Sorry, but:
(1) what does "sweaty" mean in this context?
Nerdy! I.e. knowledgeable. Seasoned conlanger. WALS isn't interesting to a lot of language lovers and CALS probably isn't interesting to a lot of conlangers except those who would use it themselves. Is my theory. I might be wrong!
Thanks! Now I know.
Prinsessa wrote:
eldin raigmore wrote:(2) risks of what are likely to be low?
Of people misunderstanding, which was discussed right above.
Huh. I missed it the first time through. Thanks.

kaleissin
rupestrian
rupestrian
Posts: 8
Joined: 29 Apr 2015 13:51

Re: CALS vs WALS: A Comparison

Post by kaleissin » 16 May 2015 21:44

PTSnoop wrote:Hi, I'm PTSnoop. You may vaguely remember me from the previous CALS vs WALS thread almost two years ago.

I originally set out to do a statistical comparison of the conlangs on CALS and the natlangs on WALS, to see what intriguing features of natural languages we conlangers tend to pay more/less attention to. Then I found out that the CALS numbers I was using had natlangs mixed in with them, realised I'd have to redo all my graphs, ran out of round tuits, got thoroughly distracted by other things, then some time later noticed I had a whole bunch of Conlangery episodes sitting in my rss feed reader and got thoroughly distracted back into the world of conlangs once again.
What would you need to make this easier? I'm the (sole) developer of CALS. A read only REST-API is planned, but not high up on the long TODO.

PTSnoop
cuneiform
cuneiform
Posts: 178
Joined: 01 May 2013 23:07

Re: CALS vs WALS: A Comparison

Post by PTSnoop » 21 May 2015 10:30

kaleissin: A CALS REST API would be a very useful thing to have. For me personally, though, I don't think it'd help all that much - I've already written a web scraper to pull all the raw numbers off the website. Also, this is the final set of graphs, so I'm pretty much done for now. Thanks for the offer, though!
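
For readers curious what pulling raw numbers off a page can look like without an API, here is a minimal sketch using only Python's standard `html.parser`. It is not the scraper mentioned above: the markup in `sample_html` is hypothetical and not copied from CALS or WALS (whose real layout may differ), and `CountTableParser` is an illustrative name.

```python
from html.parser import HTMLParser


class CountTableParser(HTMLParser):
    """Collect (value name, count) pairs from two-column table rows."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.row = []
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag == "td":
            self.in_cell = True
            self.row.append("")

    def handle_data(self, data):
        if self.in_cell:
            self.row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and len(self.row) == 2:
            name, count = (cell.strip() for cell in self.row)
            if count.isdigit():
                self.counts[name] = int(count)


# Hypothetical markup, invented for illustration only.
sample_html = """
<table>
  <tr><th>Value</th><th>Languages</th></tr>
  <tr><td>Present</td><td>120</td></tr>
  <tr><td>Absent</td><td>80</td></tr>
</table>
"""

parser = CountTableParser()
parser.feed(sample_html)
print(parser.counts)
```

A read-only API returning the same counts as structured data would make the parsing step unnecessary, which is the main thing it would save.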

---

PART 5: SENTENCES

Complex Sentences

Image
Image

Big trend here: conlangs are quite a lot more likely to use relative pronouns for relative clauses, and less likely to use the "gap strategy" of missing out the head noun.

Image

Natlangs prefer their purpose clauses to have verb forms different to normal declarative clauses ("deranked"). Conlangs are more likely to use the same verb forms ("balanced").

Lexicon

Image

Conlangs are much more likely to have distinct words for "hand" and "arm". New words for new things.

Image

Conlangs are overwhelmingly more likely than natlangs to have strange and interesting numeral bases. Also, there are a fair few natlangs with restricted numeral systems - systems that don't have numbers much higher than 20 - but we hardly have any conlangs matching this category. This is probably just as well.

Image

Conlangs strongly tend towards having more non-derived colour terms.

Image

This graph shows the colour-word tendency more clearly - a strong bias towards "all the colour terms!"

Image

And the same again here - the overwhelming majority of conlangs have separate words for green and blue, even though grue is significantly more popular among natlangs.

Image

I'm surprised the conlang "other" category isn't larger here, to be honest.

Image

And finally: lots more conlangs don't have para-linguistic clicks. Tut, tut!

---

If anyone's interested in a category that I've not covered here, the complete set of graphs can be found at http://sasha.sector-alpha.net/~ptsnoop/calswals2a/ .

HoskhMatriarch
mayan
mayan
Posts: 1779
Joined: 16 May 2015 17:48

Re: CALS vs WALS: A Comparison

Post by HoskhMatriarch » 22 May 2015 23:28

When I get my language done it's going to help balance things out a little, due to my strategy of "let's do this just to be normal because I don't really care much about this part of the language". I do have one question that I'll have to be clear about once I have enough done before I put it in CALS: does WALS group epiglottals with pharyngeals? I thought they'd know the difference, but they talk about a "pharyngeal stop" or something like that so I don't know. My language has epiglottal sounds but not actual pharyngeal.

Also, the reason a lot of natlangs don't have a passive voice is because a lot of natlangs aren't nominative-accusative, which means they'll have some other voice like antipassive, not that you have to say every dang thing in active voice. The reason a lot of conlangs require relative pronouns in relative clauses is because a lot of conlangs don't conjugate verbs for person, number, gender, or whatever to show what role the person/thing would have in the relative clause. A lot of these things are correlative even in conlangs, there are just different tendencies in conlangs.
No darkness can harm you if you are guided by your own inner light
