Punctilious and Parsimonious Primitives

Some programming languages have operators. I borrowed the description "an operator-oriented language" for Perl a long time ago; the origin of that phrase probably goes through Larry. Other languages focus almost exclusively on nouns. Some languages derive practicality in programming from a small set of carefully chosen axioms.

One recurring debate in programming language design is whether the mathematicians should have any say. (I phrase this deliberately.) Human factors such as convenience, discoverability, learnability, and efficiency all govern the ultimate design of any programming language intended for people to use. Yes, The Little Schemer implements multiplication in terms of addition and recursion in order to teach symbolic reasoning and recursion, but a practical Scheme compiler will ultimately take advantage of the fact that multiplication at the transistor level doesn't use recursion.
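To make the contrast concrete, here's a minimal sketch of multiplication built from nothing but addition and recursion. It's in Python rather than Scheme, the name `times` is my own for illustration, and it's written in the spirit of the book's exercise rather than copied from it. It assumes non-negative integers.

```python
def times(n: int, m: int) -> int:
    """Multiply two non-negative integers using only addition and recursion."""
    if m == 0:
        # Base case: anything multiplied by zero is zero.
        return 0
    # n * m is n added to n * (m - 1).
    return n + times(n, m - 1)


# The axiomatic version and the built-in operator agree on the answer...
recursive_result = times(6, 7)
native_result = 6 * 7
```

Both give the same number. The difference is that the first takes one function call per unit of `m`, while the second is a single primitive the hardware already knows how to perform.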

... but all of that gets into silly technical debates where people throw around insulting phrases such as "Turing Tarpits" and "Periodic tables of operators" and "Using every symbol on the keyboard" and that's how some people miss the really interesting point.

The interesting point is that the representation of a program in source code is, at every level, a data representation—and compression—problem.

Consider spoken language. I have a delicious baked pastry, cooked in a round tin, with strawberries and rhubarb and a bit of sugar and gelatin in the middle. It tastes delicious with a frozen dessert made of vanilla beans, cream, some sugar, and ice. I enjoy eating this combination very much. In other words I like pie.

That's easy to read and that's easy to understand and it's easy to type, but it turns out I don't like mincemeat pie and I don't care much for shepherd's pie and I'm ambivalent about chocolate mousse pie but think that key lime pie is just as delicious. You get the picture, though.

Granted, you can specify the type of a particular pie, or you can express a generic "Any pie will do" with the word "pie" (or perhaps you take advantage of allomorphism) and you get a good understanding of my interests in the abstract. Sometimes you need to be more specific, as when I'm baking in the kitchen or when I'm at a bakery grunting and pointing at something that looks delicious surrounded by two pies that have raisins. Yuck.

Even so, the three-word phrase "I like pie" communicates volumes of information... except that I can speak it sarcastically, or as a question, or with emphasis, and the tonality of spoken communication adds information that a series of twenty-six letters and the space character can't convey. I like pie (but she does not). I like pie (contrary to what you say). I like pie (but that's technically a torte, you Philistine)! I like pie? (Find a good way to mark a rising intonation at the end of a sentence and you'll convey irony online.) Perhaps I'm admitting an addiction to the dessert arts (I... like pie).

You gain some advantages in expressivity by adding ways to convey information. Even though I can explain the difference between a torte and a slice of pie with a couple of sentences, it's still easier to say "I like pie" and give the little cake a dubious expression, as if waiting for it to change form or for you to assure me that it's equally delicious.

Even though you can reduce the set of primitives of the English language further (you probably need to keep a few punctuation symbols, but if you require that everything use declarative sentences or simple questions, you can get rid of exclamation points, apostrophes, colons, semicolons, and dashes) by using an encoding such as Morse code, you don't gain in expressivity. Try explaining my feelings about muffins versus cupcakes in telegraph style.
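Here's a rough sketch of that trade in Python. The table is deliberately partial (only the letters this one phrase needs), the function name `to_morse` is made up for the example, and the space-between-letters, slash-between-words layout is just one common plain-text convention that I'm assuming here.

```python
# Partial Morse table: only the letters needed for this example phrase.
MORSE = {"E": ".", "I": "..", "K": "-.-", "L": ".-..", "P": ".--."}


def to_morse(text: str) -> str:
    """Encode text as Morse, with spaces between letters and ' / ' between words."""
    words = text.upper().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)


plain = "I like pie"
encoded = to_morse(plain)

print(encoded)
# Compare how many distinct symbols each form uses and how long each one is.
print(len(set(plain.lower())), "distinct symbols in", len(plain), "characters")
print(len(set(encoded)), "distinct symbols in", len(encoded), "characters")
```

The encoded form gets by with fewer distinct symbols, but it takes more characters to say exactly the same thing, and it still can't tell you how I feel about tortes.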

Then again, the mathematicians have a point. Is a pictorial language without a semi-phonetic writing system necessarily more expressive than English? By no means. That's because another axis of information compression and representation in natural language governs the paucity or abundance of available words. You see this when one language borrows words from another, or where a new word (especially a verb) enters the popular lexicon.

It's easy enough to describe how you performed a search online for details about a good bakery, but it's shorter to say that you Googled it. The latter might even be more expressive; you can take advantage of synecdoche and metaphor to gloss over specific details about how a search engine works and the use of a web browser. Sometimes a convenient fiction improves understanding.

This is not to say that such language techniques as metaphor and irony and sarcasm are always immediately apparent in written language. (The lack of tonality and body language presents difficulties, but it's *possible* to convey meaning effectively with a little cleverness.) You can argue that adding in features such as the comma or the dash—in this context they're infix operators used, in part, to separate independent clauses—gives you possibilities that you don't have without them. Otherwise you must write with some degree of awkward precision and arrange your sentences and points with as much directness and left-to-right, straight-through readability as possible to convey the same meaning that a tiny bit of punctuation would render with much less ceremony.

Natural language does have its ambiguities, and it's not clear that adding features to a language (whether words, tonality, punctuation, or idioms) always clarifies by providing more opportunities for abstraction and encapsulation of thought. Certainly jargons exist for the stated purpose of clarity but the effective purpose of obfuscation to discourage non-professional practitioners (law is particularly bad about this, but so is programming, and so even is math, which has its own history of adding often conflicting notations).

Yet the question isn't "Is adding primitives bad?" but "What's the right balance between expressivity and the possibility of extendible expressivity?" You see this in discussions about the value of operator overloading (and synonyms and homophones). It's not even a matter of mathematical purity (tell a math programmer that he needs to use the method form for performing the cross product of two matrices he's defined with a nice declarative syntax). It's a matter of human factors and usability and convenience for people to communicate with each other.
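A small Python sketch shows what that complaint looks like in practice. The `Matrix` class here is hypothetical, and I'm using ordinary matrix multiplication as a stand-in for whichever product your math programmer actually wants. The same operation is exposed both as a named method and through the `@` operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Matrix:
    """A hypothetical, minimal matrix type: a tuple of row tuples."""

    rows: tuple

    def multiply(self, other: "Matrix") -> "Matrix":
        """Method form: ordinary row-by-column matrix multiplication."""
        cols = list(zip(*other.rows))  # columns of the right-hand matrix
        return Matrix(tuple(
            tuple(sum(a * b for a, b in zip(row, col)) for col in cols)
            for row in self.rows
        ))

    def __matmul__(self, other: "Matrix") -> "Matrix":
        """Operator form: lets callers write a @ b instead of a.multiply(b)."""
        return self.multiply(other)


a = Matrix(((1, 2), (3, 4)))
b = Matrix(((0, 1), (1, 0)))
identity = Matrix(((1, 0), (0, 1)))

method_form = a.multiply(b).multiply(identity)
operator_form = a @ b @ identity
```

Both lines compute the same result. Which one reads better to the person who has to maintain the code is exactly the human-factors question.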

Yes, humans destroy rigorous formalisms by being so unpredictable. Yes, people can make messes. Yes, people can be confusing. Yet somehow we manage to communicate effectively. Why not explore some of those principles when designing mechanisms of communication?

1 Comment

zby | June 8, 2010 1:28 AM

You are asking here "What's the right balance between expressivity and the possibility of extendible expressivity?" - but yet it seems that you already know the answer, or at least know that the right balance is more on the side of expressivity than it is in most current programming languages. There is no doubt that human factors are most important here - but there are human-factor-based arguments against too much expressivity. For example, programs become difficult to read, because to read you need to know the whole language while to write you need to know only a specific part - so you can expect that people will write programs without reading what is already there. And then there is also the argument about expressivity disasters like indirect method calls in Perl 5 - it is later nearly impossible to remove them from the language. I know you are for systematic language evolution and purging - but still you have to admit that it is painful and you have to take that pain into account when thinking about the right balance.
