So I can open a novel and, as a human raised in an English-speaking community, understand pretty much everything. I can open a textbook on calculus or logic, and while I can read the whole thing aloud in English (there are even awkward but grammatically correct ways to read off the formulae), I’m not going to understand it just because I know English. I think this is fairly conservative evidence that math and logic are not really natural languages; they are more like a foreign language embedded in a natural language.
So I was trying to deal with conjunctions in toki pona. Sometimes they are made unnecessary by the “chain pattern”: one similar structure after another implies “and”. Sometimes they act as discourse connectors, tagging a sentence with “or” or “but”. Those two forms of logic are effortless to parse (except when people ignore the chain pattern and try to explicitly add “and” words). Finally we get these monsters:
1) jan li suli. (simple, no “and”)
2) jan li suli li laso. (chain pattern, one right after another implies “and”)
3) jan li suli en laso. (different structures imply different meaning, maybe the qualities are mixed, like blue and red can be mixed)
4) jan li suli taso mute.
5) jan li suli anu mute.
6) jan li jan suli en mute.
7) jan li jan suli anu mute.
8) jan li jan pi suli en mute.
9) jan li suli en mute anu soweli. (mixed and, or, but)
3, 4, and 5 imply that you can “and”/“or”/“but” qualities without a head. To parse all of the above, 1-8, you need a data structure for modifier phrases that looks something like this, and it will lead to some monstrous maximal forms:
Head modifier (optional)
(Maybe a pi, depends on if you are predicate or modifier of a headed phrase)
Ands: en + modifiers — repeated
Ors: anu + modifiers — repeated
Buts: taso + modifiers — repeated
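As a rough sketch, the structure above could be written down as a Python dataclass. The class name, the field names, and the separate `first` slot for modifiers that come before any conjunction are my own assumptions for illustration, not part of any existing toki pona parser.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModifierChain:
    """Hypothetical container for one modifier chain; it stores, it does not interpret."""
    head: Optional[str] = None   # optional head word, e.g. "jan"
    uses_pi: bool = False        # True when "pi" separates head and modifiers
    # Assumption: modifiers seen before any en/anu/taso get their own slot.
    first: List[str] = field(default_factory=list)
    ands: List[List[str]] = field(default_factory=list)  # one entry per "en" group
    ors: List[List[str]] = field(default_factory=list)   # one entry per "anu" group
    buts: List[List[str]] = field(default_factory=list)  # one entry per "taso" group


# Predicate of sentence 8: "jan pi suli en mute"
example = ModifierChain(head="jan", uses_pi=True, first=["suli"], ands=[["mute"]])
```

Nothing here says what the "and" means; it only records which words followed which conjunction.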
And maximally something like:
jan li suli en mute taso lili taso laso anu soweli anu waso. (Grouped)
jan li taso lili en suli anu soweli en mute taso laso anu waso. (Jumbled up.)
How to parse this? I have no idea, it reads like a logic puzzle and you’d have to introduce a foreign logic system to do something with it. It looks syntactically valid. So I’m thinking my parser should represent a modifier chain as above, but make no claims about what it means. So it parses one way, and if someone (ha! unlikely) ever decided to implement a logic subsystem, they could take this parse and then transform it into all the possible meanings, truth tables and so on.
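A minimal sketch of "parse it one way, claim nothing about meaning" might look like the following. The function name `parse_chain` and the dictionary keys are hypothetical, and it assumes the words after li have already been split out.

```python
# Map each conjunction to the bucket its following modifiers go into.
CONNECTORS = {"en": "ands", "anu": "ors", "taso": "buts"}


def parse_chain(words):
    """Bucket modifiers by the conjunction in front of them, nothing more."""
    chain = {"first": [], "ands": [], "ors": [], "buts": []}
    current = chain["first"]  # words seen before any conjunction
    for word in words:
        if word in CONNECTORS:
            current = []  # start a new group under this conjunction
            chain[CONNECTORS[word]].append(current)
        else:
            current.append(word)
    return chain


grouped = parse_chain("suli en mute taso lili taso laso anu soweli anu waso".split())
jumbled = parse_chain("taso lili en suli anu soweli en mute taso laso anu waso".split())
```

With this bucketing, the grouped and jumbled sentences end up with the same "or" and "but" groups and differ only in where suli lands. The sketch deliberately throws away the relative order of the conjunctions, so a logic subsystem that cared about order would need to keep more than this.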
But for these applications, we don’t care:
grammar check: it’s valid syntax.
glossing: it glosses to English, and is equally ambiguous and unintelligible in English.
syntax highlighting: you only need to recognize an “and”/“or”/“but” sequence to color the text; you don’t need to know what it means or parse it as just one parse tree.
chat bot: a chat bot would never explore these corners of the universe of meanings that toki pona can represent.
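For the syntax highlighting case, something as small as this would probably do. It is a sketch only: the `highlight` function and the square-bracket markup are made up for illustration, and a real highlighter would emit color codes or HTML instead.

```python
import re

# Match en/anu/taso only as whole words, so the "en" inside "kepeken" is left alone.
CONJUNCTION = re.compile(r"\b(en|anu|taso)\b")


def highlight(sentence):
    """Wrap each conjunction in brackets as a stand-in for coloring it."""
    return CONJUNCTION.sub(lambda m: "[" + m.group(1) + "]", sentence)


marked = highlight("jan li suli en mute anu soweli.")
```

No parse tree is built at all; the conjunctions are recognized and marked where they stand.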
1) * jan li kepeken ilo en kepeken soweli. (Don’t use en to combine prep phrases)
2) */? jan li tawa en kama. (Don’t use en when you can use li, but if this were a modifier chain and a predicate sentence, then it’s probably okay)
3) * jan li kepeken ilo anu kepeken soweli. (Don’t “or” prep phrases)
4) * jan li moku e ilo anu e soweli. Don’t use both anu and e; don’t use both taso and e. [Update: changed to moku because kepeken has had some recent POS confusion from toki pona version pu.]
5) */? ante jan li kepeken e ilo. Don’t use anything but anu or taso as a tag-conjunction.
6) * en jan li kepeken ilo. Don’t start sentence with en. (En is implied, although it would have made for a nice audible sentence demarcation)
7) ? waso pi laso en pimeja li pona tawa mi. This is really hard to parse. “and”ing modifiers in the subject slot is only sometimes distinguishable from mistakes and “and”ing subjects.
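A grammar checker could flag the more mechanical of these rules (1, 3, 4 and 6) with plain word-pair checks, roughly as below. This is a sketch under heavy assumptions: `lint` and its messages are hypothetical, only kepeken is treated as a preposition because that is the one used in the examples, and rules 2, 5 and 7 are left out because they need more than adjacent words to decide.

```python
def lint(sentence):
    """Return a list of complaints for the simple conjunction rules."""
    words = sentence.lower().rstrip(".").split()
    problems = []
    # Rule 6: a sentence should not start with en.
    if words and words[0] == "en":
        problems.append("sentence starts with en")
    for prev, nxt in zip(words, words[1:]):
        # Rules 1 and 3: en/anu directly in front of a preposition.
        # Assumption: only "kepeken" is checked here.
        if prev in ("en", "anu") and nxt == "kepeken":
            problems.append(prev + " joins prepositional phrases")
        # Rule 4: anu/taso directly in front of the object marker e.
        if prev in ("anu", "taso") and nxt == "e":
            problems.append(prev + " used together with e")
    return problems


complaints = lint("jan li kepeken ilo en kepeken soweli.")
```

Sentence 2 of the first list (jan li suli li laso.) passes with no complaints, since the chain pattern never trips any of these checks.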