Barsoomian- Will it be a simple conlang?

The first clues about Paul Frommer’s Barsoomian are out. He mentioned that the language was going to be simple, so that a hypothetical American on Mars could learn it as fast as the novel implies he did. Easy normally means “similar to one’s mother tongue.” That isn’t possible here. So did Frommer create a simple conlang?

By simple I mean simple the way toki pona is. I doubt it, but who knows? Toki pona is simple because it is grammatically and lexically closed. Real-life simple languages, like creoles, are simple, but they aren’t closed. They are in the process of accumulating the tons of tiny details and complications of long-lived languages. Pidgins, depending on your definition, aren’t even languages, since pidgin speakers will give up grammar rules if they think it will help them communicate (i.e. a creole will have a consistent way of picking SVO order, while pidgin speakers will try all combinations if the first didn’t work).

Interestingly, he assumes the ownership of the language is Disney’s. But the copyright for the novel has expired, and only the script lines and audio material are Disney’s. That creates the opportunity to reverse engineer the language and create a public domain version that may be somewhat compatible with the movie version. But should anyone do that much work? It looks like Burroughs didn’t put much thought into the language, either in the word lists or in the made-up linguistic situation on Mars; ref. this book review.

Posted in conlang, xenolangs

Fake Languages and Sign Languages

N.B. Sign languages, like ASL, are real, complete languages. Manually signed English, fingerspelling, lip reading, and the other various systems forced upon the deaf community are fake. Now that we got that out of the way… [And if you watch ASL, every time the signer hits a fingerspelled word, you feel like someone injected a chemical formula into a sentence, like dimethydiethlytetraisopropolyne, a word so gargantuan compared to the rest that it looks like a design error.]

At various points in time people have suggested creating new languages organically, the same way that languages are created in contact situations, with a collaborative con-pidgin/con-creole. But only recently have I noticed that homesign systems, the ad hoc sign systems created by a hearing family to communicate with a non-hearing family member, are essentially ad hoc conlangs, or con-pidgins. And they have simple grammar! They have a dominant word order, etc. And ASL, Nicaraguan Sign Language and most of the other sign languages are also relatively new languages, created from nothing once deaf children first got access to each other (although from current research we can assume that even they had a sort of real but simplified grammar in their home sign before they got together).

toki pona Sign language.
If I were to create a toki pona sign language, it wouldn’t be manually signed toki pona. It would be lexically constrained, but it wouldn’t use a system of particles: li, pi, e, etc. In sign languages, grammatical markers are usually indicated by facial expression (which differs from the facial expressions we make to show emotion) and by how you move the hands (e.g. repetitively, quickly, etc.). A sign language also has a limited number of hand shapes that the hands are allowed to take; these are like the phonemes of the language. They can also happen to be fingerspelled letters, but really the shape is arbitrary. Also, anaphora tends to be done by pointing in space, which establishes a pronoun that can be referred to later by pointing to the same space. This is way better than toki pona’s impoverished ona and ni.

So a toki pona sign language would replace e, o, pi with facial expressions, e.g. pouting lips, wrinkled nose, etc. It would have remarkably few legal hand shapes. It would likely be all one-handed signs, in keeping with the principle of simplicity.

Just like written toki pona, it would be horribly verbose, but fairly easy to learn, and pretty hard to understand.

So if I were to make a toki pona sign language, I would probably make some reforms that could cut down on the verbosity.

Posted in toki pona

Parameters and Lexical Syntax

So when I am sitting around, staring at the ceiling, trying to think of an idea for a new human communication system, I try to imagine what would be different, preferably really different. From reading “Atoms of Language,” one gets the impression that natlangs are all pretty similar and not as varied as they imaginably could be, because they are constrained in design by a hierarchy of instinctive parameters, which once chosen either dramatically or subtly decide all sorts of things about how the system will work.

One thing that the author said was unattested was lexical syntax. (Keep in mind the author is much smarter than me, so if you are of an academic bent, don’t fling feces at me; just humor me and let me discuss the things that popped into my head whilst reading and that I’m calling “what he said.”) By that he meant you don’t observe languages where the neutral order for “to run” is SVO and the neutral order for “to eat” is OSV.

Of course this immediately made me want to go create a language with lexical syntax. Depending on the verb, there would be one of the six different orderings of the core participants. Depending on the adjective, it would go before or after the phrase it attaches to.
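Here is a minimal sketch of what lexical syntax could look like mechanically. Everything in it is hypothetical: the tiny lexicon, the orders assigned to each verb, and the `linearize` helper are made up for illustration, not taken from any real language or from the book.

```python
from typing import Dict

# Hypothetical lexicon: each verb carries its own neutral constituent order.
VERB_ORDER: Dict[str, str] = {
    "see": "SVO",
    "eat": "OSV",
    "give": "VSO",
}


def linearize(subject: str, verb: str, obj: str) -> str:
    """Arrange subject, verb and object in the order stored with the verb."""
    order = VERB_ORDER[verb]
    slots = {"S": subject, "V": verb, "O": obj}
    return " ".join(slots[role] for role in order)


print(linearize("cat", "see", "fish"))  # cat see fish
print(linearize("cat", "eat", "fish"))  # fish cat eat
```

The point of keeping the order inside the lexical entry is that a learner (or a parser) has to look up the verb before knowing which noun is the subject, which is exactly the property that seems to be unattested.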

Another thing was that the parameters beyond “Is Polysynthetic” are sparsely documented. There are few languages that mark a verb for all its core participants, and they aren’t so well studied. On the other hand, the non-polysynthetic languages are numerous and well studied. So if you create a conlang on that branch of the parameter hierarchy, it is almost certain to be similar to a current natlang.

So from reading this book I have two ideas for a conlang. To make them work together, since a polysynthetic language tends to be nonconfigurational (i.e. not care much about basic order), I’ll need optional polysynthesis. But if it is completely optional, people won’t use it; they will stick to something familiar. So I think it should switch on TAM (tense-aspect-mood) triggers. Most TAM systems focus a lot on what’s done/not done and whether it happened now or yesterday. That’s been done before. What hasn’t been done before is focusing on irrealis. So things in the future are mostly possibilities, things in the present are actualities, and things in the distant past are possibilities again because they’re in cloudy history.

The next feature is that I’m bored with pronoun systems that stick to person, number, animacy and gender. Why not divvy up the world of core participants by solid/liquid/gas? That would be better than “it”. Why not by friends/enemies/romantic possibilities and actualities/sessile non-humans/non-sessile non-humans? But I would want semantic noun classes. Arbitrary grammatical gender has been done before.

So I think I finally have a grammar to go with my word list I made a year or two ago.

Posted in Uncategorized

Gloss equivalency

I may have invented a concept. Well, it’s new to me. It’s similar to the idea of a relex, which is wholesale replacement of all morphemes in a language with new ones, but otherwise leaving it the same. What if someone created a language and it wasn’t a relex per se, but the glosses were the same? What if the gloss were the same except for the ordering of phrases, especially if the word/phrase order has no particular impact on meaning? For example:

L1: blah blip-it bloop-ir.
L2: hma hmug-a srrp-ah.

L1: 1S see-Pres cat-Acc
L2: 1S see-Pres cat-Acc
I see cats

Euroclones are probably gloss equivalent. A euroclone adds suffixes to words and draws on the same pool of pronouns and the same pool of grammatical structures. The idea of gloss equivalence lets you decide whether, among a set of design options, you are really just comparing apples to apples.

Word Order
Let’s imagine the following now. The word order of L1 is now OSV.

L1: bloop-ir blah blip-it.
L2: hma hmug-a srrp-ah.

L1: cat-Acc 1S see-Pres
L2: 1S see-Pres cat-Acc

I would expect that in a language like the above, OSV and SVO would both be valid, since O, V and S are all marked regardless of word order. So word order isn’t affecting the gloss that much. If, hypothetically, fronting meant the front is the topic, then the gloss would be different:

L1: cat-Acc-Top 1S see-Pres
Now, the pairs of languages are “mostly” equivalent.

Now an example that is not gloss equivalent, although the root words are mostly the same.

L1: bloop-ir blah-nit blip-it.
head-1S-Poss See-Real cat-Anim.
My head (really, not the hypothetical future or hearsay past) sees a (living) cat.
I see a cat.

Advice
Learn to gloss English (or your mother tongues). Gloss your example text. Gloss your conlang text. Compare the glosses and see how much they match.

A Metric
You should be able to count the morphemes that match in each, regardless of order, and get a percentage of morphemes and mechanisms that match up. Matches by word order are more difficult to measure; you’ll have to eyeball it.
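As a rough sketch of the metric, here is one way to count matching morphemes regardless of order. The helper names and the choice to divide by the longer gloss are my own assumptions; other denominators would be just as defensible. It treats each gloss as a bag of morpheme labels split on spaces and hyphens.

```python
from collections import Counter
import re


def morphemes(gloss: str) -> Counter:
    """Split a gloss like '1S see-Pres cat-Acc' into a multiset of labels."""
    return Counter(m for m in re.split(r"[\s\-]+", gloss.strip()) if m)


def gloss_overlap(gloss_a: str, gloss_b: str) -> float:
    """Fraction of morpheme labels shared by two glosses, ignoring order.

    Assumption: the denominator is the size of the longer gloss, so extra
    morphemes in either language lower the score.
    """
    a, b = morphemes(gloss_a), morphemes(gloss_b)
    shared = sum((a & b).values())  # multiset intersection
    total = max(sum(a.values()), sum(b.values()))
    return shared / total if total else 1.0


# Same gloss in a different order: fully gloss equivalent by this measure.
print(gloss_overlap("1S see-Pres cat-Acc", "cat-Acc 1S see-Pres"))  # 1.0
# Different mechanisms (possession, realis, animacy): only 3 of 7 labels match.
print(gloss_overlap("1S see-Pres cat-Acc", "head-1S-Poss see-Real cat-Anim"))
```

This only covers the order-free part of the comparison; word order matches still have to be eyeballed.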

Posted in conlang design

In praise of the relex and how to make a better one

Anyhow, I read this article on how to create a language in one day and really it is about machine generated relexes. And that isn’t a bad thing, it has a legitimate purpose.

From reading the article it seems there is a continuum of encodings of existing languages from merely using a different font to generating gibberish that uses a natural language as the seed.

1) new font — it not only isn’t a conlang, it isn’t even a relex.

2) new pronunciation, but still recognizable, like pig latin, or Elmer Fudd talk. These are easy to produce and easy for anyone to understand.

3) words swapped out one for one with unrecognizable ones — This is a true relex

3a) words are swapped out with others and the sound is carefully chosen to sound nice– maybe for the conlanger who truly cares not for anything past phonetics

3b) words are adjusted to be a length appropriate for their frequency– common words are short, rare words are long

4) dictionary and grammar too scanty and ill defined, so the conlanger writes more or less the way he would in English, with a handful of exceptions inspired by the conlang’s description. Probably can be created almost effortlessly, but is probably unintelligible to an English speaker

4a) a conlang with a lot of calques and loans that otherwise doesn’t fall far from the creator’s mother tongue. Maybe well documented.

5) condialect– a dictionary and grammar that are carefully assembled and have a strong resemblance to a language on purpose. Probably can be understood effortlessly, but can’t be produced easily. (I understand people in British movies, but I can’t imitate them and I certainly wouldn’t know when to use their slang)

Since the goal of a relex is to rapidly create fake language text without the fake language looking suspicious, it seems there should be a word generator/lorem ipsum generator strategy that creates endless streams of text with sentences about the right length, words about the right size and possibly the right feel when read aloud.
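A minimal sketch of such a generator, combining the one-for-one swap of level 3 with the frequency-based lengths of 3b. The sound inventory, the rank cut-offs and the function names are all arbitrary assumptions chosen for illustration; the article mentioned above may do this quite differently.

```python
import random
from collections import Counter

# Assumed toy inventory: CV syllables only.
ONSETS = ["m", "n", "p", "t", "k", "s", "l", "r"]
VOWELS = ["a", "e", "i", "o", "u"]


def make_word(rng: random.Random, syllables: int) -> str:
    """Build a pseudo-word out of random consonant-vowel syllables."""
    return "".join(rng.choice(ONSETS) + rng.choice(VOWELS) for _ in range(syllables))


def build_relex(text: str, seed: int = 0) -> dict:
    """Map every source word to a unique fake word; frequent words get short forms."""
    rng = random.Random(seed)
    counts = Counter(text.lower().split())
    mapping, used = {}, set()
    for rank, (word, _) in enumerate(counts.most_common()):
        # Arbitrary cut-offs: top 10 words get 1 syllable, next 190 get 2, rest 3.
        syllables = 1 if rank < 10 else 2 if rank < 200 else 3
        candidate = make_word(rng, syllables)
        while candidate in used:  # keep the swap one-for-one
            candidate = make_word(rng, syllables)
        used.add(candidate)
        mapping[word] = candidate
    return mapping


def relex(text: str, mapping: dict) -> str:
    """Replace each word of the text with its fake counterpart."""
    return " ".join(mapping[w] for w in text.lower().split())


source = "the cat saw the dog and the dog saw the cat"
table = build_relex(source, seed=1)
print(relex(source, table))
```

Because the source text keeps its own sentence lengths and word frequencies, the output inherits roughly the right rhythm for free; only the sound inventory needs tuning for the "feel when read aloud".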

Notice that for all forms of relexes, there is no particular level of skill required of the consumer. When it is nonsense, it is just pleasant-sounding nonsense. When it is meaningful, it is easy to produce or understand (although sometimes not both). If the audience is children, or people who you know aren’t going to invest the 2000-3000 hours to gain fluency in a natural-like language, or if you’re just trying to illustrate the possibilities of a phonetic inventory, then a relex can be the right thing to do.

Posted in 30DayConlang, conlang design

30 day Conlang “things”

So I’m 18/30th of the way through a conlang in 30 days and I’m realizing that, unlike a novel, there are many ways to engage with fake languages for a month.

– Things that appeal to people creating fake languages for art and recreation
A write a novel using a conlang in 30 days, i.e. NaNoWriMo
B write a novel written in a conlang in 30 days, i.e. LoCoWriMo
C create 30 days worth of fictional anthropological reports and histories (conworlds)
– Things that appeal to people creating fake languages as a social activity
D create a conlang in 30 days, i.e. 30DayConlang
E learn a conlang in 30 days.
F promote/teach a conlang for 30 days

Also, you can do all of the same but substitute “conlang” with “Dead language”. It’s kind of the same exercise when you are working with a language without a living community.

A and B have a natural end to them. Eventually you will finish writing your book. C, F don’t have a natural end to them, you can promote and write fictional histories, build fictional scale models forever. Learning a conlang can have an end to it– someday you’ll learn all there is or acquire near “native” fluency, but it sure won’t take less than 30 days, unless it is a very, very small language and your competency goals are modest. A conlang might have a natural end to it, but there are many different design agendas for creating conlangs and not all of them have a natural end to them.

And A,B,C have a natural start. In a contest it would seem fair to start from scratch, as one does in NaNoWriMo. In D,E,F– it would just be a waste of effort to keep switching horses midstream.

For the very long and infinite tasks, the goal is to start a habit. There is a rule of thumb that it takes about a month to establish a habit, so a month of promoting, studying or consistent scribbling of some sort could kick off a multidecade project.

Posted in 30DayConlang

30 Day Conlang- Time to stop f*ing around and start screwing around

So, back to the fundamental idea: writing a language in thirty days. All you have to do is imagine what a complete language definition is and then divide by thirty. So I solicited some advice on the Conlang mailing list and got lots of good advice.

The language is done when it has its first native speaker.
A native speaker is someone who speaks that language from birth. So, if you noticed, one of the key elements of the language definition here is a living, human child. So if we aren’t making babies, then we’re just fucking around with tangential materials. It is time to get down to the hard work of screwing around.

A Modest Proposal for a Conlang Design Agenda
This isn’t entirely without precedent. When previously dead languages are brought back to life, a couple will need to resolve to learn a language as a second language and use it at home extensively enough that the child will begin to use it. Eliezer Ben-Yehuda and his wife got the language going again by speaking only Hebrew at home around their son, who did eventually figure out the language, despite his parents’ Hebrew-as-a-second-language skills. Later Hebrew really took off because it was taught in the public schools. Unless you have your own nation state, you won’t be able to jump-start your conlang by the second route, but the first route isn’t in the realm of science fiction, merely difficult.

It will take more than 30 days. At least nine months just to prepare the initial materials. Before you get that far, you will also need to find a suitable boyfriend or girlfriend. Since this language doesn’t exist in the outside community, it will become the “bonding language” or language that a kid uses with his parent(s). Kids vary and it’s hard to say if they will keep using it with their parents. People also have a habit of switching to the language that facilitates communication. So if you speak your own conlang poorly, but speak English, your child may be tempted to speak only English to you as soon as they realize how much more effective that is.

The optimal scenario is:

Marry a monolingual foreigner-someone who doesn’t speak your mother tongue
Move to a community where your mother tongue isn’t spoken–so that your child doesn’t learn your mother tongue.
Speak only your conlang at home–but of course! How else will this become a living language?

Since your child never learns your mother tongue and you speak the language of the community as a second tongue, then the easiest and most effective language for communication will be the conlang.

Next Generation
A language is dead as soon as it is no longer being taught to the rising generation of children. So to keep the language alive, you really need not just a spouse that is willing to speak a conlang at home, but you need to recruit people to the “club” so to speak.

From research in anthropology and space colonization, the rule of thumb is that you need to have 50 to 500 members of your conlang club to have a viable and genetically healthy community.

So the recipe for a conlang
Find a girlfriend/boyfriend who is also a conlang enthusiast or is willing to speak an unusual language at home. This is the moral equivalent of speaking a really, really rare living language at home, since opportunities for using it out of the house are pretty low.

There currently are tens if not dozens (as in more than a single dozen) of people on OkCupid world wide with an expressed interest in conlangs.

Step two. Make a baby. There are instructions, both textual and video, elsewhere on the internet.

Step three. Speak your conlang exclusively at home.

Step four. Document the form spoken by your child or children.

Piece of cake. You now have completed your conlang.

Posted in 30DayConlang

Day 14 of 30 day conlang

No progress, but I have an idea. A nonconfigurational language (a bag of unordered words) based on existing data structures as found in C#. The language would have an instant parser. The text could be fed into a special page that would randomize the order of words in each sentence for each new view of the page, which would help people break their use of syntax and rely on everything else.
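A quick sketch of the randomizing page idea, written in Python here rather than C# purely for brevity. The function name and the naive sentence splitting on `.`, `!` and `?` are my own simplifications; a real page would need a proper tokenizer for the language.

```python
import random
import re


def scramble_sentences(text: str, rng: random.Random = None) -> str:
    """Shuffle the words inside each sentence, keeping sentence boundaries."""
    rng = rng or random.Random()
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scrambled = []
    for sentence in sentences:
        # Remember the closing punctuation so it stays at the end.
        end = sentence[-1] if sentence[-1] in ".!?" else ""
        words = sentence.rstrip(".!?").split()
        rng.shuffle(words)
        scrambled.append(" ".join(words) + end)
    return " ".join(scrambled)


# Each call (each "page view") gives a fresh ordering of the same bag of words.
print(scramble_sentences("mi moku e kili. sina lukin e soweli."))
```

If the language really is a bag of unordered words, every scrambled view should mean the same thing as the original, which makes this a cheap test of whether any syntax has crept back in.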

I made a trivial amount of progress on writing a community website. The stumbling block is that I need to write the smallest thing that could possibly work, but as soon as I start I think, “Man, if I do it that way, it’s going to create more work in the future” and then I do nothing.

And I have an idea for a new conlang methodology, but it will take more space than I have time to write.

Posted in 30DayConlang

Pictish- Missed opportunities for movie conlangs

Pictish is a dead language. It was probably yet another Celtic language, but we don’t have enough information to rule out anyone’s pet theory. It is a remnant language that lives on in a few possible given names, place names and that’s about it.

This is at least the second time I’ve seen a movie with Picts that weren’t speaking Pictish. First was King Arthur, where the Picts spoke English. Now in The Eagle, the Picts speak Gaelic, which is good for Gaelic to get some screen time, but this was also an opportunity for a director to hire a conlang creator.

What options are there?

Embrace Gaelic, Scots
Gaelic and Scots are rather endangered languages, so anything that can create some use for them would be a good thing. Maybe we can do some good for a living language by using it, instead of attempting the impossible task of reviving a nearly completely dead language.

On the down side, there isn’t much in the way of a technical or design challenge here for the conlanger.

Create a Pict inspired Conlang
We know about the Picts’ physical culture and a handful of names that were possibly Pictish. From that we could, with meticulous effort, maybe construct a phonetic inventory and plausible phonotactics. Next, we’d have to pick a language family. Since the Picts’ physical culture was almost identical to that of the surrounding people, it follows they may have spoken a language in the same family: something Indo-European, and of the options, probably something Celtic. If it wasn’t, then one would have to create either an isolate, without any recognizable link to any living language, or, if you like Ruhlen’s and Greenberg’s work, something that looks like it could be in the Indo-Uralic family. (Or a minor variation on the idea, the Nostratic superfamily.)

So a conlang Pict would be a known Celtic language with enough changes to make it sound like a Celtic language without really being mutually intelligible.

As a movie language, it would be constrained by how pronounceable it was for the actors and if it sounded pleasant or appropriate to the audience.

Also, unlike some remnant languages, like Virginia Algonquian, there isn’t any pre-existing corpus with which one needs to maintain compatibility. Meticulously extracting all the information about a language from a series of word lists and a few sentences is a huge task that takes scholars years. Movie conlangs need to be created on time and on budget, so there isn’t room for decades of decipherment. In the Picts’ case, the conlang creator is very rapidly at the point where they have no choice but to make things up, and will not be tempted to write a conlang that is a laborious reconstruction in the same way that PIE has been reconstructed.

Posted in ghostlang

30DayConlang – Day 8

Since I’m making remarkably little progress, I decided to work on defining my goal, the thing that tells me, “Yes, you wrote a language and you wrote it in 30 days.” And if I could make that goal quantifiable, so that someone could visit a webpage and upload documents or fill out a survey, then one could do the next step of saying, “These people completed a language, these people didn’t quite finish in time.”

I submitted this question to the Conlang mailing list and will incorporate some of the ideas from there.

Done as in perfect vs done as in objectively along the line starting at [no effort] and ending at ["like a natural language"]
I very much have the philosophy of NaNoWriMo in mind. In NaNoWriMo, contestants write a 50,000 word novel in 30 days. At the end of that, they only really know for sure that they have 50,000 words. It could be 50,000 words of “Ni!” or it could be a rather readable story. The point isn’t that 50,000 words is magic, the point is that a quantifiable goal will motivate some people to do something rather than nothing.

While it feels good to say that language development will never be done, it’s done when it is perfect, must take an infinite amount of time, etc, it isn’t really motivating to contemplate that. In fact, even if conlanging really is like trying to drain the sea with a teaspoon, let’s just ignore that and work on something quantifiable. People need new languages, they need to be written and shipped. And language learners need to be able to decide which languages are ready to learn and which aren’t ready. If some conlangers don’t plan to ever reach that point, that’s fine– follow your joy, a 30 day conlang project is probably not going to fit your design agenda anyhow.

Morally better vs more complete. I think I’ve made it clear here that quantification isn’t about writing a better language. A paragraph describing a really great idea for a conlang might be morally better than a language that has a dictionary larger than the OED, a corpus larger than the combined published works of all human languages to date, etc. But at the end of a month, I’d feel more pleased by accomplishing the latter, or something close to it, than accomplishing the former. And I think numbers can be put on that.

Multidimensional measures.
Whatever measure turns out to be pragmatic, I expect it to be multidimensional. CALS is already chock-full of conlang descriptions that exist only as a checklist of features, and there are plenty of conlangs that exist only as a long list of words and single-word English translations. With a little of each, it is possible to write something in a conlang; with only one of the two, almost nothing can be written. So these two measures are multidimensional and sort of multiplicative.

Some measures are additive, or at least sometimes additive, i.e. a signed language will be complete without anything being written about the phonetic inventory or phonotactics.

Dictionary Size
Off the top of my head I thought of dictionary counting, which has some technical problems, especially with how you lemmatize your words. In a language with lots of morphology, it can be tricky to say what is a word and what is a part of a word. Word counting works best for isolating languages, worst for polysynthetic languages. Word count can’t take into consideration the quality of words; for example, a dictionary of commonly used words represents a more complete description of a language than a dictionary with a similar number of scientific and technical words.

A potential fix for the lemmatization problem, suggested by And Rosta
Count listemes. These are words with meanings that can’t be inferred from the parts. So write, do, work, rewrite, redo and rework would be about four listemes: re-, do, work, write (or maybe three, if derivational morphology belongs in the grammar, not the dictionary).

Corpus Size
Words here can be counted, with some of the same caveats as with dictionaries. The word counts will be most comparable between similar languages (i.e. analytic compares to analytic, agglutinating compares to other agglutinating). To do interlanguage comparisons, you’d need similar texts. Unfortunately, there are few texts that are commonly translated, e.g. the Lord’s Prayer, the Babel story, the UN Declaration of Human Rights. These are short and cover pretty specific semantic domains. Even a large text, like the entire Bible, doesn’t cover enough scenarios to say that the language is usable in modern contexts, as demonstrated by Modern Hebrew, which had to invent numerous neologisms to cope with modern situations.

Grammar Size
There are templates and lists of questions that attempt to cover a broad area of grammatical topics that typically occur in languages. WALS/CALS is one, the outline of “Describing Morphosyntax” is another. If you stick to these too closely, they will influence the design of your language. Imagine if you used your high school French textbook as a model: you’d end up with a French-like conlang. Many of these question sets can yield “Does not apply” if the language in question is too different from what the question set was expecting. For example, toki pona uses almost no morphology at all, so any morphology sections would be “skipped”. Should these skipped sections count as complete or incomplete?

Merely counting the words in the supporting documents works only if the general quality and depth are similar, and those criteria are hard to measure. On the other hand, this is the exact same problem that NaNoWriMo has when comparing a 50,000 word good novel and 50,000 words of “Ni!”

Maybe a compromise could be found by requiring that a description follow a series of sections and that a certain minimum word count be in each section, or in some percent of sections (to take into account that some languages have more going on in morphology than syntax or vice versa).

(Some clarifications on the potentials of this strategy from Jim Henry, phrasing is all my own)
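The section-with-minimum-word-count compromise is easy to sketch. The section names, the 200-word minimum and the 60 percent threshold below are placeholders I made up, not numbers anyone proposed on the list.

```python
def description_complete(sections: dict, min_words: int = 200,
                         required_fraction: float = 0.6) -> bool:
    """True if enough sections of a grammar description meet the word minimum.

    `sections` maps a section title to its text. Both thresholds are
    arbitrary assumptions; a contest would have to agree on real values.
    """
    if not sections:
        return False
    passing = sum(1 for text in sections.values() if len(text.split()) >= min_words)
    return passing / len(sections) >= required_fraction


# A toki pona-like case: the morphology section is legitimately empty.
grammar = {
    "phonology": "word " * 250,
    "morphology": "",
    "syntax": "word " * 300,
}
print(description_complete(grammar))  # True: 2 of 3 sections pass
```

Requiring only a fraction of the sections is what lets a language with almost no morphology still count as described.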

Highest Assessable Competency
This idea gets its inspiration from the competency exams people take to assess their skills in English, French, etc., such as the TOEFL and many others like it.

Let’s take the Black Speech of Tolkien. There is one sentence written in it. If someone were to write a competency test, at best someone could test out as incompetent. On the other hand, Esperanto is so complete in this sense that you could test out as level 1, 2, 3, etc. up to “passes as a native,” since there are native speakers of Esperanto.

I like this idea, and I suppose in the 30DayConlang context, this could be implemented as writing a series of exams similar to the real-life exams, starting at the easiest and working up to the most advanced. The acid test of course would be to take these tests oneself and pass. But that certainly would take more than a month. It’s within the range of imagination to write a language in a month, but learning a language in a month is really a pipe dream, unless there isn’t much to it.

(General idea suggested by Sam Stutter, rephrasing all my own.)

Achievements
Achievements are a common videogame system, where you can rack up points in a variety of domains. In the conlang sense, you might gain achievement points for translating certain texts, for completing a 1000 word dictionary, for writing a lesson. These achievements can be earned multiple times, so you can earn 20 lesson achievements, earn the 1000-words-in-the-dictionary achievement five times, etc. Mathematically, this allows for certain interlanguage comparisons.

Language A: 5 achievements of type Q, 6 of R, 10 of S
Language B: 12 of Q, 18 of R, 22 of S
Language C: 1 of Q, 44 of R, 15 of S

B is clearly more complete than A, but C can’t be ranked without making some potentially arbitrary decisions about the weighting of scores. The nice thing about a website is that if I feel ambitious enough to do the programming, I could allow people to set their own weighting and sort the languages themselves.

An advantage of achievement scores is that people can compare languages based on criteria that they believe in. For example, if you think the grammar is a bunch of nonsense, then you could sort a list by quantity of corpus to decide who has a complete language and who hasn’t really been trying.

(General idea suggested by David Peterson, rephrasing all my own.)
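To make the A/B/C comparison above concrete, here is a small sketch using the same numbers. The `dominates` and `rank` helpers and the example weights are my own invention; the only thing taken from the text is the table of achievement counts.

```python
LANGS = {
    "A": {"Q": 5, "R": 6, "S": 10},
    "B": {"Q": 12, "R": 18, "S": 22},
    "C": {"Q": 1, "R": 44, "S": 15},
}


def dominates(x: dict, y: dict) -> bool:
    """x is at least as good as y on every achievement and better on one."""
    return all(x[k] >= y[k] for k in x) and any(x[k] > y[k] for k in x)


def rank(langs: dict, weights: dict) -> list:
    """Sort language names by weighted achievement score, best first."""
    def score(name: str) -> float:
        return sum(weights[k] * v for k, v in langs[name].items())
    return sorted(langs, key=score, reverse=True)


print(dominates(LANGS["B"], LANGS["A"]))        # True: no weighting needed
print(dominates(LANGS["B"], LANGS["C"]))        # False: C wins on R
print(rank(LANGS, {"Q": 1, "R": 1, "S": 1}))    # ['C', 'B', 'A']
print(rank(LANGS, {"Q": 3, "R": 1, "S": 1}))    # ['B', 'C', 'A']
```

The two `rank` calls show why the weighting has to be left to the reader: B and C swap places depending on how much a type Q achievement is worth.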

Completeness by Translation Challenge
So the idea here is to take a large corpus in, say, English, and pick sentences at random. If the maximally competent speaker/user of the language at the moment can translate that sentence, and many more, then it is more complete than another language where the best speaker says, “There just isn’t a way to say that yet.”

Now this gets tricky with extendable languages and deciding how convenient a translation has to be before one accepts it as a translation. English can be translated into toki pona, but whilst doing it, one feels like they are inventing a lot and the result is sometimes clumsy and wordy. If one can find a way to say most things using toki pona, a language impoverished in most mechanisms lexical and syntactical, then I suspect many other languages that a man in the street might call “incomplete” might have some clumsy way to say just about anything.

All that said, this certainly could be done by survey. A reasonably honest and cooperative conlanger could answer a survey, saying yes/no to “This sentence was translatable” and “I didn’t have to innovate to translate this sentence” for each sentence, and at the end count up the yeses. In a long enough survey and across enough different people, the individual peculiarities of how people answer should disappear, leaving a rough statistic of completeness.

(General idea suggested by And Rosta, rephrasing all my own.)
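A sketch of how the survey could be tallied. The `Answer` record and the two returned ratios are assumptions of mine about how one might score it; drawing the sentences themselves could be as simple as `random.sample(corpus, k)`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Answer:
    translatable: bool    # "This sentence was translatable"
    no_innovation: bool   # "I didn't have to innovate to translate this sentence"


def completeness(answers: List[Answer]) -> Tuple[float, float]:
    """Return (share translatable, share translatable without innovating)."""
    if not answers:
        return 0.0, 0.0
    translated = sum(a.translatable for a in answers)
    clean = sum(a.translatable and a.no_innovation for a in answers)
    return translated / len(answers), clean / len(answers)


survey = [
    Answer(True, True),
    Answer(True, False),   # translatable, but needed a new coinage
    Answer(False, False),
    Answer(True, True),
]
print(completeness(survey))  # (0.75, 0.5)
```

Keeping the two ratios separate matters for extendable languages like toki pona: the first number will be high almost by construction, while the second shows how much was being invented on the spot.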

Posted in 30DayConlang

30DayConlang – Day 1

I spent the day reading about monkeys because I was too tired to do much else. But I said I was going to do some conlanging every day, so here I am. Despite being lazy, I’m like two weeks ahead of where I was last year because I have a suitable word generator going already.

Okay, as planned, the language will use a subset of US English phonotactics. The root words will be machine generated.

Ideas I’m kicking around for some guiding principles
- It will be a priori, except the phonotactics
- It will be small and most bound and unbound morpheme categories will be closed
- This could be the language for hunter-gatherers stuck on a multigeneration spaceship (there are several novels and movies on the theme)
- OR this could be another Arlington, VA “home-lang”
- This can’t be a catlang. The catlang will have to wait for another year.
- This could use a data structure other than trees, like maybe unordered sets. Think non-configurational.

Paste the following into Wordo to see some samples. At the moment diphthongs and consonant clusters are generated too frequently. I used Wikipedia’s list of valid onsets, vowels and codas. Currently, it’s showing IPA, but I’ll pick a practical orthography soon.

StartingRule word 500

Tokens Onsets m n ŋ p b t d k ɡ tʃ dʒ f v θ ð s z ʃ ʒ h ɹ j w l pl bl kl ɡl pr br tr dr kr ɡr tw dw ɡw kw pw fl sl θl fr θr ʃr hw sw θw vw pj bj tj dj kj ɡj mj nj fj vj θj sj zj hj lj sp st sk sm sn sf sθ spl skl spr str skr skw smj spj stj skj sfr

Tokens Vowels a e i o u oʊ aʊ aɪ eɪ ɔɪ ɪɚ ɛɚ
Tokens Codas m n ŋ p b t d k ɡ tʃ dʒ f v θ ð s z ʃ ʒ h ɹ j w l lp lb lt ld ltʃ ldʒ lk rp rb rt rd rtʃ rdʒ rk rɡ lf lv lθ ls lʃ rf rv rθ rs rz rʃ lm ln rm rn rl mp nt nd ntʃ ndʒ ŋk mf mθ nθ ns nz ŋθ ft sp st sk fθ pt kt pθ ps tθ ts dθ dz ks lpt lfθ lts lst lkt lks rmθ rpt rps rts rst rkt mpt mps ndθ ŋkt ŋks ŋkθ ksθ kst

Rule word {
Token Onsets
Token Vowels
Loop 75[1] 25[0] {
Token Codas
}
}
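
For anyone without Wordo handy, the same rule can be approximated in a few lines of Python. This is a sketch under assumptions: it reads `Loop 75[1] 25[0]` as "75% of the time emit one coda, 25% of the time none", the inventories are cut down to a small sample of the lists above, and `make_word` / `make_words` are made-up names, not part of Wordo.

```python
import random

# Small samples of the token lists above, shortened for readability.
ONSETS = "m n p b t d k s l pl bl kl pr tr st sk str".split()
VOWELS = "a e i o u aɪ eɪ".split()
CODAS = "m n p t k s l lp lt nt nd st ks".split()

def make_word(rng, coda_chance=0.75):
    """Onset + vowel, plus a coda with probability `coda_chance` (assumed reading of the Loop)."""
    word = rng.choice(ONSETS) + rng.choice(VOWELS)
    if rng.random() < coda_chance:
        word += rng.choice(CODAS)
    return word

def make_words(count, seed=None):
    """Generate `count` words; a fixed seed makes the list reproducible."""
    rng = random.Random(seed)
    return [make_word(rng) for _ in range(count)]

words = make_words(500, seed=1)
print(" ".join(words[:20]))
```

Weighting the token lists (for example with `rng.choices` and a `weights` argument) would be one way to tone down the diphthongs and clusters.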

Posted in 30DayConlang | Leave a comment

30 Day Conlang Month Starts Tomorrow

Follow the #30DayConlang hashtag on twitter.

This obviously is inspired by NaNoWriMo, which is the “write-a-50,000-word-novel-in-November” event. This is not to be confused with LoCoWriMo, which is “write a novel in your already complete conlang”. LoCoWriMo, for me, is putting the cart in front of the horse because I haven’t a reasonably complete conlang, and if I did, it would take me many months to learn to use it.

Last year, Gary Shannon wrote a conlang in 30 days. I also tried, and made more progress than usual. Also, last year the 30 Day Conlang, for me, was in November. I think it makes sense to do this in October because the most common reason to create a conlang nowadays is not for international communication (that project is done), but to support a fictional novel. Even when people don’t follow through with the novel, they still often write the conlang as if it were for a novel and are generally following a Tolkien design agenda. So I imagine some people will want to follow this up with a NaNoWriMo novel that uses the conlang they just wrote.

From NaNoWriMo, we learn that to create something cool and sort of large, we should give ourselves a feasible but tight deadline and to not worry about quality. Instead, worry about getting it done. Because if we worry too much about quality, we get bogged down in our creative project and don’t finish. Ever. That said, once people do finish their rushed, slipshod work, they often look back and say, “Gee, that really isn’t that bad” and they fix it up. Or it is a train wreck and it goes into the trash. Either way, you’re better off than if you’d skipped the exercise entirely.

The best thing I got out of last year’s 30DayConlang was a lot of thought about methodology, machine assisted conlanging and rapid second language acquisition. I wrote a manifesto (a sort of set of design guidelines), I machine generated vocabulary, and I sat down one evening with MS-Access and filled in meanings. But I failed to finish my morphology and syntax, so no corpus text was created.

Gary Shannon “spoke (his language) into existence”, by babbling and assigning meaning and syntax to the babbling that came out. (See the Brown Conlang mailing list for Nov 2010)

I’m sure more methodologies exist, but unlike a novel, it’s not so much “just sit in front of the computer and pound the keyboard.” Any one of the language creation tasks could burn up 30*24 hours– dictionary making is especially laborious if you let it take you over. So you have to explicitly slice up your time and spread it across the various tasks of conlang making– lexicon building, grammar building, grammar testing, language learning (creating flashcards, creating test questions, etc.)

It takes 500 hours to learn a language (that may actually be the statistic for becoming an expert at any subject), so I figure if I finish a conlang this year, I could learn it over the upcoming year and then participate in LoCoWriMo.

Rules
Start October 1st, End one minute before November 1st.
Ignore your inner critic and editor, just write it!
Publish your results November 1st.
You have to define “done”, in the sense of what a “complete conlang” is, and it should be easily measurable, to keep with the NaNoWriMo sense of done. For example: 2000 words or lexemes, 100 syntax rules, 75 bound morphemes and 1000 words of corpus or example sentence text. Unlike NaNoWriMo, there isn’t a simple metric like the 50,000-word novel metric.
Optional. If you wrote a conlang to support a novel, write that novel in November!
Optional. If you wrote a conlang for use, go forth and build a community.

30DayConlang Discussions
Fraith Wiki

Posted in 30DayConlang | Leave a comment

2011 30 Day Conlang month soon, t-minus days.

Every year I remember that I should be writing and completing a conlang about November because of NaNoWriMo. I wouldn’t write a novel in a conlang, and I’m not really a competent writer, so I wouldn’t write a novel that uses a conlang as an element. I just want to write a conlang. I got a reasonable distance last year, but didn’t finish.

My plan for this year is to actually work on two conlangs, last year’s and a new one. It sounds stupid to divide effort, but I figure if I can keep from adding a language feature to a language that doesn’t need it, that is a good thing. The other language will get the leftover feature(s).

Also, this year I’ll be doing flash cards.

Day one- phonotactics. The phonetic inventory will probably be an English subset. My audience is people who live in the same city as me, so I know they won’t do a good job of producing vowels and consonants that they can’t hear.

Day two- generate words.

Day three- Assign meanings using a core + branch technique. The idea is that you pick a generated word, assign a meaning, and then figure out how a language with that core word would work. If you have a verb for the time the sun crosses the horizon, then by metaphor you have (part of) a word for death, the end, and other associated concepts. If you add a bound morpheme, try to bind it to all the existing words and see what it generates. The key here is that early assigned meanings will block future word building strategies. After you have “dusk+man” to mean a human’s death, you don’t need a basic word for death.
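
The "bind it to everything and see what falls out" step is mechanical enough to sketch in Python. Everything here is invented for illustration: the root forms (`tam`, `kel`, `sno`), the morpheme `ip`, and the helper names `bind_to_all` and `still_needs_root` are hypothetical, and deciding what each candidate actually means stays a human job.

```python
def bind_to_all(roots, morpheme, gloss):
    """Pair one bound morpheme with every existing root.

    `roots` maps a word form to its meaning; the result maps each candidate
    compound to a literal gloss for the conlanger to interpret.
    """
    candidates = {}
    for form, meaning in roots.items():
        candidates[f"{form}+{morpheme}"] = f"{meaning}+{gloss}"
    return candidates

def still_needs_root(concept, accepted):
    """A concept already covered by an accepted compound needs no basic word."""
    return concept not in accepted.values()

# Invented roots and an invented bound morpheme meaning "man".
roots = {"tam": "dusk", "kel": "water", "sno": "fire"}
candidates = bind_to_all(roots, "ip", "man")
for form, literal in candidates.items():
    print(form, "=", literal)

# The conlanger reviews the candidates and keeps the ones that suggest something.
accepted = {"tam+ip": "death"}
print(still_needs_root("death", accepted))  # already covered by dusk+man
print(still_needs_root("birth", accepted))  # still open
```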

End of week 1 – Flash cards loaded into anki and being reviewed on a daily basis.
End of week 2- Pidgin corpus. Write some provisional sentences. Imagine one is communicating to a fluent speaker and you need to rotate through all the possible variations on how to say something in order to make yourself understood. The specific meanings available will
End of week 3- Create verbal paradigms, syntax and what not to formalize and make efficient patterns from the pidgin corpus. If a feature was planned for but isn’t realized yet, make it happen here. For example, if you don’t have a cool pronoun system that varies by speaker height, implement it now.
End of week 4- Write up drafts of community documents: dictionary, grammar, license

Day 30, postmortem.

Posted in 30DayConlang | 1 Comment

Uses for sound change applications, for those who don’t care about diachronics.

I generally am not interested in conlangs that come with diachronic versions– i.e. hypothetical versions where changes, plausible or otherwise, mutate the words from one pronunciation to another. A crude version of a plausible change is a rule like “The sounds change to corresponding sounds a row lower on the IPA chart.” This simplified rule was mentioned in a McWhorter lecture I watched. Your mileage may vary.

Why not? First, I’m not all that interested in Tolkien-style conlangs, nor their development agenda, i.e. the language supports a fictional story or fake world, where the story might never be written, the language isn’t written for fan usage, it often has “poison pills” that make fan usage unusually problematic, etc. Second, I prefer conlangs that a fan could actually learn. Not necessarily as an Esperanto style auxlang, but as a general purpose usable system of communication. A language that comes in several versions is going to be harder to use than one that doesn’t. If you publish archaic, late, modern and future Fakese, then some fans will eventually try to write in all of them, and the corpus is going to get more and more difficult for people to learn.

Why? These tools exist, and I think there is a use for them. My favorite conlang, toki pona, appears to be unstable. I mean it looks like if a real speaking community were to use it, the words would almost instantly collapse into shorter words and certain minimal pairs would mutate to increase the distance between them.

So if we took these sound change applications, applied a set of plausible rules and the words didn’t change much, then we could say that the language is stable. If a small likely change causes the vocabulary to mutate beyond recognition, then there are problems. Similarly, if a small likely change results in excessive homophones or causes grammatical markers to become unusable, then that is useful information for the design of a single language.
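
A stability check of this kind could look roughly like the Python below. It is a toy, not a replacement for a real sound change applier: the four regex rules are merely assumed to be "plausible", the names `apply_rules` and `stability_report` are made up, and the test lexicon mixes a few real toki pona words with two invented forms (`mude`, `tog`) planted to show a collision.

```python
import re
from collections import Counter

# Assumed rule set: voicing between vowels, then loss of final high vowels.
RULES = [
    (r"(?<=[aeiou])p(?=[aeiou])", "b"),
    (r"(?<=[aeiou])t(?=[aeiou])", "d"),
    (r"(?<=[aeiou])k(?=[aeiou])", "g"),
    (r"[iu]$", ""),
]

def apply_rules(word, rules=RULES):
    """Apply each rule once, in order, to a single word."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

def stability_report(lexicon, rules=RULES):
    """Report how much of the lexicon changed and which outputs now collide."""
    changed = {word: apply_rules(word, rules) for word in lexicon}
    n_changed = sum(1 for old, new in changed.items() if old != new)
    counts = Counter(changed.values())
    homophones = sorted(form for form, n in counts.items() if n > 1)
    return {
        "changed_fraction": n_changed / len(lexicon),
        "homophones": homophones,
    }

lexicon = ["toki", "pona", "sina", "mute", "mude", "tog"]
report = stability_report(lexicon)
print(report)
```

A low changed fraction and an empty homophone list would count as "stable" under these particular rules; a different rule set could easily give a different verdict.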

Another possible methodology is to apply a variety of different plausible changes and keep applying them over and over until the language reaches an equilibrium and doesn’t change anymore. Obviously some sound change rules will eventually result in all letters eroding to nothing. So this will only work if there is some offsetting force in the sound change rules to prevent degenerate states, such as all of the sounds disappearing from all words.
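
The run-until-nothing-changes loop can be sketched in Python as well. The two changes and the "offsetting force" here are assumptions chosen to keep the example small: erosion drops the last sound, and a minimum word length stands in for whatever would really stop a word from vanishing. The names `voice`, `make_eroder` and `evolve` are made up.

```python
import re

VOICED = {"p": "b", "t": "d", "k": "g"}

def voice(word):
    """Voice p, t, k between vowels."""
    return re.sub(r"(?<=[aeiou])[ptk](?=[aeiou])", lambda m: VOICED[m.group(0)], word)

def make_eroder(min_length):
    """Build an erosion rule that drops the final sound of words longer than min_length."""
    def erode(word):
        if len(word) > min_length:
            return word[:-1]
        return word
    return erode

def evolve(word, changes, max_rounds=50):
    """Apply all changes repeatedly until a round leaves the word untouched.

    Returns (final_form, number_of_rounds_that_changed_something); gives up
    after max_rounds so a rule set that never settles cannot loop forever.
    """
    for rounds in range(max_rounds):
        new = word
        for change in changes:
            new = change(new)
        if new == word:
            return word, rounds
        word = new
    return word, max_rounds

guarded = evolve("tokipona", [voice, make_eroder(3)])
unguarded = evolve("tokipona", [voice, make_eroder(0)])
print(guarded)    # settles on a short but non-empty form
print(unguarded)  # with no offsetting force the word erodes away entirely
```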

Here is one that I hope to try out someday.

http://code.google.com/p/phonix/

Posted in machine assisted conlanging | Leave a comment

Singletons- A Category of Conlang in Search of a Better Name

A singleton is a conlang that may have been created for whatever purpose, but happens to be suitable for someone else, maybe for the intended purpose, maybe for something else, and it hasn’t been discovered by anyone yet. Hence the name “singleton”– the language didn’t go beyond its first creator, either for lack of motivation or for lack of successful marketing. But it’s a bad name because it doesn’t really cover all the parts of the idea I have in mind. Sigh.

What I imagine could happen with a singleton is that a random person could discover the language and independent of what the original creator had in mind, they might think, hey, this is a good language for machine communication, communicating for the disabled, communication with alien life forms, for writing a fictional story, or who knows what– I’m skeptical that conlang creators really know the strengths and weaknesses of their languages.

Some features of a singleton:

You can say most anything.
They are reasonably complete (i.e. have 2000-4000 words, a reasonably comprehensive grammar, maybe 20-100 pages long, they have a corpus of exemplary texts)

You can become competent
It’s reasonably easy or has learning materials that make it reasonably easy to learn.
The language needs to be pretty darn stable. In the real world, communication systems are incredibly conservative. A conlang that is continually changing on a fundamental level is going to really discourage use.

The language’s fate isn’t tied to a fad, a political opinion, or other unstable situation
The culture is superficial– by this I mean, you don’t get culture shock from trying to implement the system.
The culture is in the corpus– by this I mean, you could write about Elves, but you could just as easily write an autobiography or a calculus text book.
The corpus covers both real and fictional scenarios. This is a strong version of what I just said. A language that proves that it can communicate things as disparate as Shakespeare, the Bible and Star Trek has passed a good test of expressiveness in different cultural contexts. That said, Klingon texts, if I understand correctly, do their Shakespeare translations by pretending that the events took place on Qo’noS, which isn’t as impressive a feat of cultural flexibility.

The language has no significant competition
I think the huge rafts of Elvish and Esperanto look-alikes are failing to get adoption because if you like Elvish style languages (half completed drafts of a communication system that has several versions with plausible but fictional diachronic relationships) then you are going to study Sindarin/Quenya. If you like languages that are an average of European languages, then you already can use Esperanto.

The better communication system doesn’t win, the one that is already established wins, unless! Unless the communication system is so different that it doesn’t really compete on the same grounds.

The language’s fate isn’t tied up with the original creator
They are extendable. By this I mean extendable in the way that English speakers extend English, by being able to derive new words and more rarely, new wrinkles in the existing grammar, when necessary.
There are no legal barriers to use in any form.

Conlangs that are based on a book or movie where a single author or movie studio has a lot of rights to the various elements are a big question mark. Klingon and Na’vi don’t appear to be extendable, and there are significant legal questions about what legally can be done with these languages. But at least there is some law to guide us. Typically for a singleton, there might not be any specific licensing information from the conlang writer.

Example Singletons
GZB, Kelen. I’d list more, but it takes so much time to determine if a posted conlang fits the above criteria. Someday.

Famous Conlangs. Famous conlangs are singletons only in the sense that I suppose they are complete and could be adopted for some radically different purpose, sort of the way Esperanto was used as an interlanguage for a multilanguage dictionary project.

Posted in conlang design, conlang use | Leave a comment