By the way, it looks like Dothraki has a published spec. Now on to other topics.
As someone with the facial expressions of a robot, I’ve always been partial to robots, and some of my earliest attempts at programming were chat bots and AI. I failed, of course. But now I have some ideas on how to make it work.
Our human brains have some sort of knowledge representation system. It turns our stage, the world around us, into facts represented by neurons linked by axons and dendrites, which chatter using neurochemicals. We lack the technology to use a neurological model to represent reality accurately and usefully. But, hey, we’ve got other ways to represent reality. For example, documents and relational databases keep track of the inventory and business activities of large businesses and governments the world over.
Normally, when this needs to be communicated, we use protocols like HTTP to send (often technology-independent) serializations of database records across a wire. We then use UIs and data binding to turn this into human-consumable material.
But let’s get back to robots. Robots are machines that would want to be like people, and thus use a natural language. That means they could possibly deal with people directly. But English is hard, so maybe a conlang or a restricted version of English would be better.
Representations of reality:
Name – Phone Number
Joe – 555-1234
Jane – 444-5678
If this was toki pona, we could serialize this as:
nanpa pi jan Joe li 555-1234.
By some complicated system of equalities, we could work out that this is the same as:
jan Joe li jo e nanpa ni: 555-1234
If the robot heard a sentence, it would attempt to use deserialization & equality checks to transform the utterance into a known data type:
jan Mato li jo e nanpa ni: 111-8989 ==> Mato – 111-8989
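The round trip above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two regular expressions stand in for the "complicated system of equalities", they cover just the two sentence shapes shown here, and the function names are made up for this example.

```python
import re

# Two sentence shapes treated as equivalent (an assumption of this sketch):
#   nanpa pi jan NAME li NUMBER.
#   jan NAME li jo e nanpa ni: NUMBER
PATTERNS = [
    re.compile(r"^nanpa pi jan (?P<name>\w+) li (?P<number>[\d\s\-–]+?)\.?$"),
    re.compile(r"^jan (?P<name>\w+) li jo e nanpa ni: (?P<number>[\d\s\-–]+?)\.?$"),
]


def normalize_number(raw):
    """Collapse spacing variants such as '111 -8989' into '111-8989'."""
    return re.sub(r"\s*[-–]\s*", "-", raw.strip())


def serialize(name, number):
    """Turn one phone book row into a toki pona sentence."""
    return f"nanpa pi jan {name} li {number}."


def deserialize(sentence):
    """Turn a sentence back into a (name, number) row, or None if unknown."""
    for pattern in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            return match["name"], normalize_number(match["number"])
    return None


print(serialize("Joe", "555-1234"))
print(deserialize("jan Mato li jo e nanpa ni: 111 -8989"))
```

Both sentence shapes land on the same row, which is the whole point: the equality check happens on the data type, not on the surface wording.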
A Lojban-style processor could also answer utility questions like:
nanpa pi jan Jane li 444-5678 la ona li toki tawa mi.
If Jane’s number is 444-5678, then she’s talking to me.
And the robot would respond, after binding & processing pronouns:
jan Jane li toki tawa sina.
Indeed, Jane is talking to you.
Or utility questions might involve common computer tricks like, “How many digits are in Jane’s phone number? What is the sum of the digits in Jane’s phone number?” A human actually excels at this kind of arbitrary discussion, whereas a robot has to be programmed for each exchange of that sort.
Pronouns seem like something that would be really, really hard for a computer. If my computer only had a knowledge representation system for the phone book, it would need to know who is a person, who is capable of having a phone number, and so on. People excel at common sense; modern code doesn’t. Databases rely on unique, nonce names, and variables that might be bound to anything are used only in limited scopes, to make sure that they bind to only one thing at a time.
Next is the chat bot problem.
Chat bots respond to whatever you ask. Usually the input is modeled as a command. But human languages only sometimes use commands.
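To make the "programmed for each exchange" point concrete, here is a small Python sketch of the two digit questions. The function names and the lookup table of question labels are invented for this example; each new kind of question would need its own hand-written entry.

```python
def digits(number):
    """Pull the digits out of a phone number, ignoring dashes and spaces."""
    return [int(ch) for ch in number if ch.isdigit()]


def count_digits(number):
    return len(digits(number))


def sum_digits(number):
    return sum(digits(number))


# One hand-written handler per kind of question (hypothetical labels).
UTILITIES = {
    "how many digits": count_digits,
    "sum of digits": sum_digits,
}

jane = "444-5678"
for question, handler in UTILITIES.items():
    print(question, "->", handler(jane))
```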
If Jane’s number is X, then she’s talking to me. (Implied, asking for confirmation)
I know Jane. (Implied, asking for additional information about Jane, e.g. Oh, you do? I know her too, her number is X)
Another thing a chat bot should be able to do is serialize things into something that is suitable for saying over the phone. Most code dumps text to the screen, often in a grid format. A good robot would be able to tell a story in a way that takes attention span into account. A bad robot would read all 5000 phone numbers. A smart robot would say, after reading two, “and so on” or “do you want me to keep going, or are you looking for someone in particular?”
State: some of the best chat bots are sadly stateless. They don’t incorporate anything you say into their base of knowledge. Some do, but it’s kind of wonky: they just remember that after saying “Good day” people usually just repeat “Good day”.
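A rough Python sketch of the "smart robot" behavior follows. The function name, the limit of two entries, and the exact wording of the follow-up question are all assumptions made for this example.

```python
def speak_phone_book(entries, limit=2):
    """Return a short spoken summary instead of reading every entry aloud."""
    sentences = [
        f"nanpa pi jan {name} li {number}." for name, number in entries[:limit]
    ]
    remaining = len(entries) - limit
    if remaining > 0:
        # Stop early and hand the conversation back to the listener.
        sentences.append(
            f"And so on, for {remaining} more. Do you want me to keep going, "
            "or are you looking for someone in particular?"
        )
    return " ".join(sentences)


book = [("Joe", "555-1234"), ("Jane", "444-5678"), ("Mato", "111-8989")]
print(speak_phone_book(book))
```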
A good robot takes all utterances and converts them into a system of knowledge.
My phone book robot, if I said:
mi jo e soweli.
Would interpret that as asking the database to create a new table like so:
who – inventory
jan Mato – soweli
And if two minutes later I asked:
mi jo e seme?
The robot should be able to look it up even though a few minutes earlier, this robot only knew phone numbers.
This is the flip side of serialization: turning language back into the knowledge representation system.
Anyhow, this has been done before. MS SQL had a natural English processor. It was probably similar to what I have described, although I bet it only dealt with turning English into SELECT statements and maybe turning the resulting tables of data into English sentences. Turning English into tables that can be queried again is probably hard.
A toki pona fact database would rely heavily on equality tests:
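Here is a minimal Python sketch of that exchange, using the standard library's sqlite3 with an in-memory database. It is not a real parser: it understands only the one sentence shape "mi jo e X", and the class name, the table name `possession`, and the robot's replies are assumptions made for this example.

```python
import re
import sqlite3

# The only sentence shape this sketch understands: "mi jo e X"
OWNERSHIP = re.compile(r"^mi jo e (\w+)[.?]?$")


class PhoneBookRobot:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")

    def hear(self, speaker, utterance):
        match = OWNERSHIP.match(utterance.strip())
        if match is None:
            return "mi sona ala."  # "I don't know."
        thing = match.group(1)
        # The table is created the first time an ownership sentence is heard.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS possession (who TEXT, inventory TEXT)"
        )
        if thing == "seme":  # question word: look the answer up
            rows = self.db.execute(
                "SELECT inventory FROM possession WHERE who = ? ORDER BY rowid",
                (speaker,),
            ).fetchall()
            if not rows:
                return "mi sona ala."
            return "sina jo e " + " e ".join(item for (item,) in rows) + "."
        # Otherwise treat the sentence as a new fact and store it.
        self.db.execute("INSERT INTO possession VALUES (?, ?)", (speaker, thing))
        return "mi sona."  # "I know."


robot = PhoneBookRobot()
print(robot.hear("jan Mato", "mi jo e soweli."))
print(robot.hear("jan Mato", "mi jo e seme?"))
```

The pronoun problem is dodged here by passing the speaker in explicitly; a real robot would have to work out who "mi" is on its own.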
mi jo e soweli lon tomo mi.
Does this factually contain the following?
mi jo e soweli. Yes.
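One crude way to sketch this containment test in Python: split each sentence into a core clause plus a set of prepositional phrases, and say a fact contains a claim when the cores match and the claim adds no phrases of its own. The preposition list and function names are assumptions of this example, and it wrongly treats every "tawa" as a preposition even where it is the verb.

```python
# Words treated as starting a prepositional phrase (a simplification).
PREPOSITIONS = {"lon", "tawa", "tan", "kepeken", "sama"}


def parse(sentence):
    """Split a sentence into (core words, set of prepositional phrases)."""
    words = sentence.strip().rstrip(".").split()
    core, phrases, current = [], [], None
    for word in words:
        if word in PREPOSITIONS and core:
            current = [word]          # start a new prepositional phrase
            phrases.append(current)
        elif current is not None:
            current.append(word)      # continue the current phrase
        else:
            core.append(word)         # still inside the core clause
    return tuple(core), {tuple(phrase) for phrase in phrases}


def contains(fact, claim):
    """True if the stored fact says at least as much as the claim."""
    fact_core, fact_phrases = parse(fact)
    claim_core, claim_phrases = parse(claim)
    return fact_core == claim_core and claim_phrases <= fact_phrases


print(contains("mi jo e soweli lon tomo mi.", "mi jo e soweli."))
```

The test only runs one way: knowing "mi jo e soweli." does not establish where the animal is kept.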
Anyhow, hopefully my personal life will allow the free time to write such a thing. So to recap:
Knowledge representation system: E.g. relational tables.
Serialization system: E.g. turns rows and tables into sentences.
Deserialization system: Creates tables and binds utterances to a table, then inserts 1 or more rows.
Persistence: All commands, factual or otherwise, become part of the system of knowledge.
Query language: Questions, or statements that prompt retrieving information and serializing it back to the interlocutor.
Utility: Processing tasks that are not really related to retrieving and updating a representation of knowledge. For example, answering if at least 3 people in the phone book have names starting with “G”.
Equality and Transformations: Natural languages can serialize into many equivalent forms.
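As a closing illustration of the utility item in the recap, the “G” question comes down to a few lines of Python once the names are out of the knowledge store. The function name and the sample names are made up for this sketch.

```python
def at_least_n_names_start_with(names, letter, n=3):
    """True if at least n of the names start with the given letter."""
    matches = [name for name in names if name.upper().startswith(letter.upper())]
    return len(matches) >= n


# Hypothetical phone book names, for illustration only.
names = ["Gina", "Greg", "Gus", "Joe", "Jane"]
print(at_least_n_names_start_with(names, "G"))
```

The hard part for the robot is not this check but recognizing that an utterance is asking for it.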