Lisp, the Universe and Everything: 2011-11

I gave a rather messy lightning talk on this topic at the recent ECLM (see below). I think the messiness can be attributed mostly to my reluctance to criticize anything built with good intentions, including Clojure. Yet software development clearly needs thorough evaluation of different approaches, languages and technologies, because, we must admit, lots and lots of decisions on such things as architecture or language/platform choice are made on purely subjective and even emotional grounds (see "hype"). So below is a more detailed account of the (excessive) complexities I've encountered working with Clojure in a real-world environment: a half-year project to develop a part of a rather big Java-based server-side system. I should also note that I have been following Clojure almost since its initial public release, participated in the early flames on c.l.l., and even edited a Clojure introduction article in the Russian functional programming journal fprog.ru. But this was my first chance to make a reality check...

But first of all, I'd like to refer you to Rich Hickey's talk "Simple Made Easy" at the Strange Loop conference, the principles of which really resonate with me. Yet it's so often the case that abstract principles are hard to follow when you're faced with reality (I'm also guilty of that). Another point is that it's not really beneficial to push principles to the extreme, because there's always the other side, and engineering is the art of making trade-offs: if you don't find room for them, the other side will bite you. So the points below basically boil down to these two things: examples of "complecting", and "things should be made as simple as possible, but not simpler" (attributed to Einstein).

Interactive development

Lisp is considered a so-called platform language, in the sense that it implies the existence of certain facilities and runtime environment constraints that form a distinct and whole environment. This is unlike some other languages, usually called scripting languages, which rely on a pre-existing environment, like the OS (e.g. POSIX), a web server, a web browser etc. Other platform languages are, for example, Java, C# or Erlang, while scripting languages are JavaScript, Perl or PHP. Clojure is a language on a pre-existing platform, the JVM, and so doesn't define its own platform. This is the source of probably the biggest complecting in the language, as it tries to be a stand-alone dynamic, functional language that explicitly discourages imperative object-oriented style, while the JVM platform is oriented at static, imperative, object-oriented languages.

From the Lisp dynamic point-of-view a lot of JVM's facilities are inferior:

mostly static runtime image and object system with only partial opportunities for redefining things on-the-fly instead of a fully dynamic one
static namespaces (tied to a file system) instead of dynamic packages
static exception system instead of a dynamic (restartable) condition system
limited and flawed number representation system instead of a full numeric tower
more limited calling convention (only positional parameters and absence of multiple return values)
more limited naming scheme
XML-based build facilities instead of eDSL-based ones; although here Clojure can provide its own option, the currently existing one, Leiningen, is a weird beast of its own (for example, you can get CLASSPATH out of it, but can't put it in except through project.clj, which has a lot of limitations)
absence of tail call optimization
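
To make the numeric tower item in the list above a bit more concrete, here is a minimal sketch in Python, used purely as a stand-in for a language with bignums and exact ratios. The helper wrap_int32 is hypothetical and only emulates what fixed-width 32-bit two's-complement arithmetic (such as a JVM int) does on overflow; it is not JVM or Clojure code.

```python
from fractions import Fraction


def wrap_int32(n: int) -> int:
    """Reduce n to a signed 32-bit value, the way fixed-width
    two's-complement arithmetic wraps around (hypothetical helper)."""
    return (n + 2**31) % 2**32 - 2**31


INT32_MAX = 2**31 - 1

# Fixed-width arithmetic silently wraps around on overflow.
wrapped = wrap_int32(INT32_MAX + 1)

# A numeric tower promotes to a bignum instead of overflowing...
promoted = INT32_MAX + 1

# ...and keeps ratios exact instead of truncating them.
exact_sum = Fraction(1, 3) + Fraction(1, 6)
truncated = 1 // 3  # integer division drops the fractional part
```

The point is only the contrast: the same expression either wraps or truncates, or it stays exact, depending on what the platform's number system provides.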

Surely, there are also some current advantages of the Java platform:

availability of a concurrent GC (although it's not the default one)
good JIT-optimizing compiler
and, most importantly, a larger number of available tools and libraries

Yet if we return to the top of the list of shortfalls, we can see why interactive development in Clojure is much less productive than in Lisp. Adding to this, Clojure uses a one-pass compiler (not very modern).

Going into more detail on this would take a whole separate post, so I'll just sum up: interactive development in Clojure is hampered both by the JVM sticking out in different places (especially if you work on projects combining Clojure and Java code) and by Clojure's own misfeatures.

Syntax

From its early days Clojure was positioned as a "modern" Lisp. And what this "modernity" actually implied is:

more accessible syntax and broader support for vectors and maps as opposed to Common Lisp, in which only lists, allegedly, were first-class citizens
built-in concurrency primitives
being Lisp-1 instead of Lisp-2, which makes heavy functional style programming more concise
cleaning up some minor "annoyances": 4 equality operators, interning reader, etc.

Starting from the bottom, the minor issues do make it harder to start, but they are actually conceptually simple and useful once you get accustomed to them. Lisp-1 vs Lisp-2 is a matter of taste: for example, I prefer #', because it's an annotation, while others perceive it as noise. And there's no objective advantage of one over the other: yes, Lisp-1 makes something like (<map> <key>) instead of (gethash <key> <map>) possible, yet it makes macros more complicated. And concurrency I'll discuss separately.

What's left is broader support for vectors and maps, including destructuring. I agree that declarative syntax for common data structures is crucial for productive use of any language, up to the point of defining the same literal syntax ({}) for hash-tables in CL. Thankfully, that is supported by the language, so this syntax is as first-class in CL as in Clojure, and, as in many other respects, nothing prevents "modernizing" Lisp here without creating a whole separate language... This doesn't hold for Clojure, as it doesn't have facilities to control the reader in the way CL does. Actually, in this regard Clojure is very different from Lisp: it hardly provides facilities for controlling any aspect of the language, and this control is a critical part of Lisp's DNA.

And pushing syntax to the extreme has its shortcomings in Clojure as well. Rich argues that defining both lists and vectors with parens (in Lisp a list is '() and a vector is #()) is complecting. But I'd say that a much harder case of complecting is this:

)))))))])))))])]) — the joy of debugging ASTs

And it's not even specific to Clojure, although here it's even worse, because for some reason that escapes me, let (and defn, and many others) uses vectors for argument lists. Aren't they called lists for a reason? So this once again brings back the problem of counting closing parens, effectively solved for Lisp long ago with Emacs.

At the same time, "modern" Clojure poorly supports such things as keyword arguments in functions or multiple return values, and many other not so "modern" but very effective facilities that I personally would expect to see in a modern language...

There's only one true way: functional

In my talk I referred to one of the ugliest pieces of code I've ever written, which was a very complicated Clojure loop, i.e. a loop/recur thing (and I honestly tried to factor it out).

Basically, there are two opposite approaches to iteration: imperative looping and functional recursion. Many languages have a strong bias towards one or the other, like Python discouraging recursion and Clojure discouraging imperative loops by imposing immutability. But the thing is that there are problems for which one of the approaches yields far more concise and understandable code. If you want to traverse a tree, recursion is the way to go. But if you are accumulating several sequences at once, which may reference results obtained in previous computations, and each iteration has not one but several outcomes, recursion often becomes too messy. Yet in Clojure there's not even good support for recursion (which has the advantage of factoring different pieces of code into functions), but a special construct, loop/recur, which shares the downsides of both approaches and provides hardly any of the benefits. That's a pity, as iteration is a basic programming construct and no code file can do without it. And here we see a case of detrimental over-simplification.
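
The contrast described above can be sketched in a few lines. This is an illustrative Python example, not the loop from my talk and not Clojure code: depth shows a tree problem where recursion reads naturally, while the two running_stats functions (hypothetical names) compute the same two output sequences, once with an imperative loop and once with all the state threaded through a fold.

```python
from functools import reduce


def depth(tree):
    """Recursion reads naturally on trees: a node is (value, [children])."""
    _, children = tree
    return 1 + max((depth(child) for child in children), default=0)


def running_stats_loop(xs):
    """Imperative loop: two output sequences, each step uses the previous state."""
    totals, new_maxima = [], []
    total, best = 0, None
    for i, x in enumerate(xs):
        total += x
        totals.append(total)
        if best is None or x > best:
            best = x
            new_maxima.append(i)  # index where a new maximum appears
    return totals, new_maxima


def running_stats_fold(xs):
    """Same computation as a fold: every piece of state travels in one tuple."""
    def step(state, item):
        totals, new_maxima, total, best = state
        i, x = item
        total += x
        if best is None or x > best:
            return totals + [total], new_maxima + [i], total, x
        return totals + [total], new_maxima, total, best

    totals, new_maxima, _, _ = reduce(step, enumerate(xs), ([], [], 0, None))
    return totals, new_maxima


sample = [3, 1, 4, 1, 5]
loop_result = running_stats_loop(sample)
fold_result = running_stats_fold(sample)
tree_depth = depth((1, [(2, []), (3, [(4, [])])]))
```

Both versions give the same answer, but in the fold the four pieces of state have to be packed and unpacked on every step, and that bookkeeping grows with each additional accumulator.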

And there are also lazy sequences, which complect sequences and streams. In theory, those are the same thing, but, as the saying goes, in theory, theory and practice are the same, but in practice... Surely, this makes writing a compiler easier, at the cost of complicating reasoning about sequences in day-to-day programming.
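
A small sketch of why laziness complicates reasoning, using Python's lazy map as an analogy (the exact evaluation granularity of Clojure's lazy sequences differs, so treat this only as an illustration of the general effect). The function record and the log list are hypothetical names introduced for the example.

```python
log = []


def record(x):
    """Append x to the log (a visible side effect) and return its square."""
    log.append(x)
    return x * x


lazy = map(record, [1, 2, 3])  # builds the stream; nothing has run yet
log_before = list(log)         # snapshot of the log: still empty

first = next(lazy)             # forces exactly one element
log_after_first = list(log)

rest = list(lazy)              # forces the remainder
log_after_all = list(log)
```

The side effects happen when the sequence is consumed, not where it is defined, so the order of effects in the program no longer matches the order of the code.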

EDIT: as the commenters correctly pointed out, you can have a sort of mutable local variables with transients, which allows expressing some imperative loops in a more concise manner.

Concurrency

In the early days of Clojure, in one of the discussions on c.l.l., S. Madhu (if I remember correctly) called Clojure's concurrency primitives "snake oil". I thought that there might be some personal issues behind such an attitude, but having tried them for myself and learned about the alternatives in the concurrency space, I don't think it was too far from reality. First of all, Clojure addresses only shared-state concurrency on a single computer, while most hard concurrency problems appear in the context of distributed systems and are currently solved with approaches like the Actor model or MapReduce. And on a single machine I've seen very few problems that can't be solved with a combination of thread pools (with thread-local storage), Lisp-style special variables and databases. In fact, Clojure provides its own (often limited) variants of all the above-mentioned approaches and nothing more:
- agents instead of actors
- vars (analogue of CL's special variables)
- STM instead of databases
- map/reduce, pmap

To be able to provide them, Clojure imposes some restrictions, most notably immutability. Also, their implementation doesn't follow the Lisp principle of giving control to the programmer: you can't swap one STM variant/strategy for another. Heck, you can't even control the number of threads that pmap uses!
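
For comparison, this is the kind of control I mean, sketched with Python's standard concurrent.futures module rather than Clojure. The function parallel_map and its workers parameter are hypothetical names for this example; the only point is that the caller, not the library, decides the pool size.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, items, workers):
    """Map fn over items on a thread pool whose size the caller chooses."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))  # results keep the input order


seen_threads = set()
seen_lock = threading.Lock()


def square(x):
    """Square x and remember which worker thread did the job."""
    with seen_lock:
        seen_threads.add(threading.get_ident())
    return x * x


results = parallel_map(square, range(20), workers=3)
```

Here at most three worker threads ever touch the data, because that is what was asked for.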

Among all these approaches, STM is the most notable one (as the others are just copies of established technologies). And it faces tough competition from databases and other transactional datastores. Its advantages are no impedance mismatch and, possibly, better efficiency. Yet a database is language-agnostic, which is often useful, and it's more accessible and understandable. And, most importantly, there's durability, and there's a choice that you can make depending on your needs. The best use case for STM I've found so far was holding statistical counters accessed simultaneously from many threads, yet this problem is easily solvable with Redis, for example. And the same applies to the other uses I can think of. So the paradox of Clojure is that, although it was ushered in with the idea of solving concurrency problems, it still has a lot to prove in this space: and not with toy examples of ant colonies, but with real-world applications made possible by its approach (like RabbitMQ or Ejabberd showcasing Erlang's aptitude for building massively parallel systems).
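
To show the shape of the counters problem mentioned above, here is a minimal Python sketch. It uses neither STM nor Redis: a plain lock stands in for whichever mechanism you pick, and the class StatCounters and its methods are hypothetical names made up for this example.

```python
import threading
from collections import Counter


class StatCounters:
    """Named counters that many threads may increment concurrently."""

    def __init__(self):
        self._counts = Counter()
        self._lock = threading.Lock()

    def incr(self, name, by=1):
        # The lock makes the read-modify-write step atomic.
        with self._lock:
            self._counts[name] += by

    def snapshot(self):
        # Copy under the lock so the caller sees a consistent view.
        with self._lock:
            return dict(self._counts)


stats = StatCounters()


def worker():
    for _ in range(1000):
        stats.incr("requests")
        stats.incr("bytes", by=10)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = stats.snapshot()
```

The whole problem fits in one small class, which is why it is hard to count it as a strong argument for any heavier machinery.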

Final thoughts

I've written a long article, but many more details are left out (just adding code snippets would make it twice as big). There are also a lot of good things about Clojure that I didn't mention: its seamless integration with Java, for example, which makes it a good scripting language for Java applications. Macros, basic data structures, vars: they all work great. Ring was also an enlightenment.

Yet overall the language doesn't live up to its promise of being a modern Lisp: actually, it's not a Lisp at all. Well, how do you define Lisp? S-expressions, macros, closures etc.? Surely, all (or at least most) of those features may be present, although Dylan, for example, was a Lisp without s-expressions. But, in my opinion as a Lisp programmer, what makes a true Lisp is dynamicity, simplicity and putting control in the developer's hands (flexibility and extensibility). The mentioned features are mostly derivatives of these principles, and they should all have a backend for extension: the reader, compiler, condition system, object system, special variables etc. all do in Lisp. And Clojure gives up on all these principles, if not altogether, then substantially: many of the features are there and new ones have arrived, but the backend is missing.

Common Lisp is well optimized for the common use-cases in the design space of dynamic flexible languages. And it's probably very close to a local maximum in it. At least it's good enough. Clojure is more of a hybrid, and hybrids are built by complecting: in Clojure's case, complecting Lisp and Java, Lisp and Haskell. The result is interesting, but, in my opinion, it's not optimal. Surely it has its use-cases.

Still, Clojure sees decent adoption, which is good, because it proves that parens aren't a problem after all :) I think it's a combination of several things. The language still has a lot of good stuff from Lisp (as well as some good things from Haskell, although they don't match very well), and those who start using it are mostly Java and Ruby programmers, for whom it breaks the psychological barriers to using Lisp (we all know the FUD). The other thing is the lack of a decent native (not like Jython or JRuby) dynamic language on the JVM. And finally, there's marketing, i.e. Clojure's concurrency primitives. And the challenge for the Clojure community is to produce a "killer" application that will utilize them, or the hype will wane, and pretty soon...

2011-11-30

Clojure & Complexity

2011-11-27

Videos from ECLM 2011
