Friday, January 5, 2018

Semantic Primes

I've been thinking about how you would build up a dictionary if you were trapped on a desert island with someone who didn't speak your language and you wanted to teach it to them from scratch. I got a very interesting book called Semantics: Primes and Universals by Anna Wierzbicka that talks about the very first words you would use to start such a project. The "semantic primes" are around 40 words that she treats as undefinable, but from which a much larger number of words can be defined. She says that these words are special because every human language has a word (or a sense of a word) that is a direct translation of each of them, and because they are among the first concepts that children learn to express. She also talks a little about a simplified grammar that lets you combine these words.

learnthesewordsfirst.com is an online dictionary/lesson plan that builds up English in this way. It starts with

61 semantic primes, defined mainly by pictures and examples. Using only these 61 words, it defines

300 "semantic molecules." Using only these semantic molecules, it defines

2000 words used in the Longman Defining Vocabulary. These words are used to define

230,000 words in the Longman Dictionary of Contemporary English.



Now imagine that you found a way to program the meaning of these 61 words into a robot, along with the ability to read a sentence written with those words and derive the meaning of a new word from it. You could build up to the meaning of all the words in the dictionary this way. Programming those first 61 words and the grammar would be challenging, but I don't think it would be impossible.
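
A minimal sketch of that bootstrapping loop, in Python, might look like the following. It only checks the bookkeeping: a word counts as "learnable" once every word in its definition is already known. It does not attempt to represent meaning at all. The primes and definitions are toy examples made up for illustration, not the actual word lists from learnthesewordsfirst.com, and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Split a definition into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def learn_in_layers(primes, definitions):
    """Assign each word the layer at which it becomes definable.

    Primes are layer 0. A word joins layer n when every word in its
    definition was already known by the end of layer n - 1.
    Returns (layers, unresolved), where unresolved holds the words
    whose definitions never became fully readable.
    """
    layers = {word: 0 for word in primes}
    pending = dict(definitions)
    layer = 0
    while pending:
        layer += 1
        # Collect everything readable with the vocabulary known so far.
        learnable = [
            word
            for word, definition in pending.items()
            if all(token in layers for token in tokenize(definition))
        ]
        if not learnable:
            break  # nothing new can be read; the rest stays unresolved
        for word in learnable:
            layers[word] = layer
            del pending[word]
    return layers, pending


# Toy data (assumed for illustration only).
toy_primes = {"someone", "something", "big", "small", "bad", "good", "very", "not"}
toy_definitions = {
    "ogre": "bad giant",
    "giant": "someone huge",
    "huge": "very very big",
    "unicorn": "something with a horn",
}

layers, unresolved = learn_in_layers(toy_primes, toy_definitions)
for word in ("huge", "giant", "ogre"):
    print(word, "-> layer", layers[word])
print("unresolved:", sorted(unresolved))
```

With this toy data, "huge" lands in layer 1, "giant" in layer 2, and "ogre" in layer 3, no matter what order the definitions are listed in. "unicorn" stays unresolved because "with", "a", and "horn" are never defined. The real lesson plan has the same shape, just with far more words per layer.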



I don't think this would be sufficient to understand everything about those concepts. Suppose I gave the definition "an arc-shaped fruit, around 6-12 inches long, with soft white flesh and a skin that is green when unripe, yellow when ripe, and soft and brown when overripe, that grows in bunches." This would be enough to pick out a banana from any other food in the supermarket, but it wouldn't tell you much about what a banana really looks like, and you wouldn't be able to recognize a banana split from it. A good definition generally tells you just enough to distinguish the item from any potential confusers. Still, a dictionary built up this way would be an excellent start that you could flesh out with other capabilities.