2011-11-30

Clojure & Complexity

I gave a rather messy lightning talk at the recent ECLM on this topic (see below). I think the messiness can be attributed mostly to my reluctance to criticize anything built with good intentions, including Clojure. Yet in software development there's clearly a need for thorough evaluation of different approaches, languages and technologies, because, we must admit, lots and lots of decisions on such things as architecture or language/platform choice are made on purely subjective and even emotional grounds (see "hype"). So below is a more detailed account of the (excessive) complexities I've encountered working with Clojure in a real-world environment: a half-year project to develop a part of a rather big Java-based server-side system. I should also note that I have been following Clojure almost since its initial public release, participated in the early flames on c.l.l. and even edited a Clojure introduction article in the Russian functional programming journal fprog.ru. But this was the first chance to make a reality check...

But, first of all, I'd like to refer you to Rich Hickey's talk "Simple Made Easy" at the Strange Loop conference, the principles of which really resonate with me. Yet it's so often the case that it's hard to follow your abstract principles when you're faced with reality (I'm also guilty of that). Another point is that it's not really beneficial to push the principles to the extreme, because there's always the other side, and engineering is the art of making trade-offs: if you don't find room for them, the other side will bite you. So the points below basically boil down to these 2 things: examples of "complecting", and "things should be made as simple as possible, but not simpler" (attributed to Einstein).

Interactive development


Lisp is considered a so-called platform language, in the sense that it implies the existence of certain facilities and runtime environment constraints that form a distinct and whole environment. This is unlike some other languages, usually called scripting languages, which rely on a pre-existing environment, like the OS (e.g. POSIX), a web server, a web browser etc. Other platform languages are, for example, Java, C# or Erlang, while scripting languages are JavaScript, Perl or PHP. Clojure is a language on a pre-existing platform, the JVM, and so doesn't define its own platform. This is the source of, probably, the biggest complecting in the language, as it tries to be a stand-alone dynamic, functional language, explicitly discouraging imperative object-oriented style, while the JVM platform is oriented at static imperative object-oriented languages.

From the Lisp dynamic point-of-view a lot of JVM's facilities are inferior:
  • mostly static runtime image and object system with only partial opportunities for redefining things on-the-fly instead of a fully dynamic one
  • static namespaces (tied to a file system) instead of dynamic packages
  • static exception system instead of a dynamic (restartable) condition system
  • limited and flawed number representation system instead of a full numeric tower
  • more limited calling convention (only positional parameters and absence of multiple return values)
  • more limited naming scheme
  • XML-based build facilities instead of eDSL-based ones - although here Clojure can provide its own option, the currently existing one, Leiningen, is a weird beast of its own (for example, you can get CLASSPATH out of it, but can't put it in except through project.clj, which has a lot of limitations)
  • absence of tail call optimization

Surely, there are also some current advantages of the Java platform:
  • availability of a concurrent GC (although it's not the default one)
  • good JIT-optimizing compiler
  • and what's most important, larger amount of available tools and libraries

Yet, if we return to the top of the list of shortfalls, we can see why interactive development in Clojure is much less productive than in Lisp. What adds to it is that Clojure uses a one-pass compiler (not very modern).

Going into more detail on this would take a whole separate post, so I'll just sum up that interactive development in Clojure is hampered both by the JVM sticking out in different places (especially if you work on projects combining Clojure and Java code) and by Clojure's own misfeatures.

Syntax


From its early days Clojure was positioned as a "modern" Lisp. And what this "modernity" actually implied is:
  • more accessible syntax and broader support for vectors and maps as opposed to Common Lisp, in which only lists, allegedly, were first-class citizens
  • built-in concurrency primitives
  • being Lisp-1 instead of Lisp-2, which makes heavy functional style programming more concise
  • cleaning up some minor "annoyances": 4 equality operators, interning reader, etc.

Starting from the bottom, the minor issues make it not easy to start, but they are actually conceptually simple and useful, once you get accustomed. Lisp-1 vs Lisp-2 is a matter of taste: for example, I prefer #', because it's an annotation, while others perceive it as noise. And there's no objective advantage of one over another: yes, Lisp-1 makes something like (<map> <key>) instead of (gethash <key> <map>) possible, yet it makes macros more complicated. And concurrency I'll discuss separately.
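To make the (<map> <key>) point concrete, here is a minimal sketch of what the call-position lookup looks like in Clojure. The map `ages` and its keys are made up for illustration; the lookups themselves use only core Clojure behavior (maps and keywords are callable).

```clojure
;; Illustrative data, not from any real project.
(def ages {:alice 30 :bob 25})

;; A map in function position looks up its argument.
(def alice-age (ages :alice))

;; A keyword in function position looks itself up in the map.
(def bob-age (:bob ages))

;; The explicit form, closest in spirit to CL's gethash, with a default.
(def carol-age (get ages :carol 0))

;; Functions are ordinary values in the single namespace: no #' needed.
(def incremented (map inc (vals ages)))
```

Whether this brevity is worth the single namespace is exactly the matter of taste discussed above.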

What's left is broader support for vectors and maps, including destructuring. I agree that declarative syntax for common data structures is crucial for productive use of any language, up to the point of defining the same literal syntax ({}) for hash-tables in CL. Thankfully, that is supported by the language, so this syntax is as first-class in CL as in Clojure, and, as in many aspects, nothing prevents "modernizing" Lisp in this aspect without creating a whole separate language... This doesn't hold for Clojure, as it doesn't have facilities to control the reader in the same way CL does. Actually, in this regard Clojure is very different from Lisp: it hardly provides facilities for controlling any aspect of the language, and this control is a critical part of Lisp's DNA.

And pushing syntax to the extreme has its shortcomings in Clojure as well. Rich argues that defining both lists and vectors with parens (in Lisp a list is '() and a vector is #()) is complecting. But, I'd say, a much harder case of complecting is this:
)))))))])))))])]) — the joy of debugging ASTs

And it's not even about Clojure: although here it's even worse, because for some reason that escapes me, let (and defn, and many others) uses vectors for argument lists. Aren't they called lists for a reason? So this once again brings back the problem of counting closing parens, effectively solved for Lisp long ago with Emacs.

At the same time "modern" Clojure poorly supports such things as keyword arguments in functions or multiple return values and many other not so "modern", but very effective facilities, that I personally would expect to see in a modern language...
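For reference, this is roughly how both facilities are usually emulated in Clojure with destructuring rather than with dedicated language support. It is only a sketch: `connect` and `div-mod` are hypothetical functions invented for this example.

```clojure
;; Keyword-style arguments: the rest arguments are destructured as a map.
;; `connect` is a made-up function; the defaults come from :or.
(defn connect
  [host & {:keys [port timeout] :or {port 80 timeout 1000}}]
  {:host host :port port :timeout timeout})

;; "Multiple return values": return a vector and destructure it at the call site.
(defn div-mod
  [a b]
  [(quot a b) (rem a b)])

(def conn (connect "example.org" :port 8080))

(def summary
  (let [[q r] (div-mod 17 5)]
    (str "quotient " q ", remainder " r)))
```

Unlike CL's multiple values, the caller of `div-mod` always receives the whole vector, so there is no way to ignore the secondary values for free.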

There's only one true way: functional


In my talk I referred to one of the ugliest pieces of code I've ever written, which was a very complicated Clojure loop, i.e. a loop/recur thing (and I honestly tried to factor it out).

Basically there are two opposite approaches to iteration: imperative looping and functional recursion. Many languages have a strong bias towards one or the other, like Python discouraging recursion and Clojure discouraging imperative loops by imposing immutability. But the thing is that there are problems for which one of the approaches yields by far more concise and understandable code. If you want to traverse a tree, recursion is the way to go. But if you are accumulating several sequences at once, which may reference results obtained in previous computations, and each iteration has not one but several outcomes, recursion often becomes too messy. Yet in Clojure there's not even good support for recursion (which has the advantage of factoring different pieces of code into functions), but a special construct, loop/recur, which shares the downsides of both approaches and hardly provides any of the benefits. That's a pity, as iteration is the basic programming construct and no code file can do without it. And here we see a case of detrimental over-simplification.
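A small sketch of the kind of loop described above: two result sequences are accumulated at once, and each step refers to the previous results. The function `running-stats` is invented for illustration and is far simpler than the loop from my project, but it shows how every accumulator has to be threaded through recur by position.

```clojure
;; Hypothetical example: build running sums and running maxima in one pass.
(defn running-stats
  [xs]
  (loop [remaining (seq xs)   ; nil when the input is exhausted
         sums      []         ; accumulator 1
         maxes     []]        ; accumulator 2
    (if remaining
      (let [x        (first remaining)
            prev-sum (or (peek sums) 0)  ; last running sum, 0 at the start
            prev-max (peek maxes)]       ; last running max, nil at the start
        ;; Every accumulator must be passed to recur, in order.
        (recur (next remaining)
               (conj sums (+ prev-sum x))
               (conj maxes (if prev-max (max prev-max x) x))))
      {:sums sums :maxes maxes})))
```

With two accumulators this is still readable; with five or six, plus conditional updates, the positional recur call becomes the hard part to follow.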

And there are also lazy sequences, which complect sequences and streams. In theory, those are the same things, but, as the saying goes, in theory, theory and practice are the same, but in practice... Surely this makes writing a compiler easier at the cost of complicating reasoning about sequences in day-to-day programming.
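A minimal sketch of what this means day to day: the same map call either does nothing yet or does all the work, depending on whether something forces it. The counter `calls` and the helper `noisy-inc` are made up for the example.

```clojure
;; Counts how many times the mapped function has actually run.
(def calls (atom 0))

(defn noisy-inc
  [x]
  (swap! calls inc)
  (inc x))

;; map is lazy: constructing the sequence does not call noisy-inc.
(def lazy-result (map noisy-inc [1 2 3]))
(def calls-after-lazy @calls)

;; doall forces the whole sequence, so the calls happen here.
(def eager-result (doall (map noisy-inc [1 2 3])))
(def calls-after-eager @calls)
```

Here `calls-after-lazy` is 0 and `calls-after-eager` is 3; the work for `lazy-result` happens only when something later consumes it, which is what makes side effects and error locations harder to reason about.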

EDIT: as was correctly pointed out by the commenters, you can have a sort of mutable local variable with transients, and that allows expressing some of the imperative loops in a more concise manner.
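A minimal sketch of the transient variant, assuming a made-up function `squares-upto`. Note that the result of conj! still has to be captured and passed on, so the code keeps the same shape as the persistent version.

```clojure
;; Hypothetical example: collect squares of 0..n-1 using a transient vector.
(defn squares-upto
  [n]
  (loop [i   0
         acc (transient [])]
    (if (< i n)
      ;; The return value of conj! must be used; acc is not updated in place
      ;; from the caller's point of view.
      (recur (inc i) (conj! acc (* i i)))
      ;; Convert back to an ordinary persistent vector before returning.
      (persistent! acc))))
```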

Concurrency


In the early days of Clojure in one of the discussions on c.l.l. S.Madhu (if I remember correctly) called Clojure's concurrency primitives "snake oil". I thought, that there may be some personal issues in such attitude, but having tried it for myself and learned about all the alternatives in the concurrency space, I don't think it was too far from reality. First of all, Clojure addresses only shared-state concurrency on a single computer, while most hard concurrency problems appear in the context of distributed systems and are currently solved with approaches, like the Actor model or MapReduce. And on the single machine I've seen very few problems, that can't be solved with a combination of thread pools (with thread-local storage), Lisp-style special variables and databases. In fact Clojure provides its own (often limited) variants of all the above mentioned approaches and nothing more:
- agents instead of actors
- vars (analogue of CL's special variables)
- STM for databases
- map/reduce, pmap

To be able to provide them Clojure imposes some restrictions, most notably the one of immutability. Also in their implementation it doesn't follow the Lisp principle of giving control to the programmer: you can't swap one STM variant/strategy for the other. Heck, you can't even control the number of threads, that pmap uses!

Among all these approaches, STM is the most notable one (as the others are just copies of established technologies). And it has tough competition from databases and other transactional datastores. The advantages are no impedance mismatch and, possibly, better efficiency. Yet the database is language-agnostic, which is often useful, and it's more accessible and understandable. And, what's most important: there's durability and there's a choice that you can utilize, depending on your needs. The best use case for STM I've found so far was to hold statistical counters, accessed simultaneously from many threads, yet this problem is easily solvable with Redis, for example. And the same applies to the other uses I can think of. So, the paradox of Clojure is that, although it was ushered in with the idea of solving concurrency problems, it still has a lot to prove in this space: and not with toy examples of ant colonies, but with real-world applications made possible by its approach (like RabbitMQ or Ejabberd showcasing Erlang's aptitude for building massively parallel systems).
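For the record, the counters use case looks roughly like this. It is a sketch under assumptions, not the code from my project: the names `hits`, `misses`, `record!` and `run-workers` are made up, and the worker and iteration counts are arbitrary.

```clojure
;; Two related counters held in refs, updated inside STM transactions.
(def hits (ref 0))
(def misses (ref 0))

(defn record!
  [hit?]
  (dosync
    (if hit?
      (alter hits inc)
      (alter misses inc))))

;; Starts 8 threads, each recording 1000 events (500 hits, 500 misses),
;; waits for all of them and returns a consistent snapshot of both counters.
(defn run-workers
  []
  (let [workers (doall
                  (for [_ (range 8)]
                    (future
                      (dotimes [i 1000]
                        (record! (even? i))))))]
    (doseq [w workers] @w)
    (dosync [@hits @misses])))
```

When run as a script, a call to (shutdown-agents) at the end lets the JVM exit promptly, since futures run on a non-daemon thread pool.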

Final thoughts


I've written a long article, but there are many more details left out (just adding code snippets, will make it twice as big). There's also a lot of good things about Clojure, which I didn't mention: its seamless integration with Java, for example, which makes it a good scripting language for Java applications. Macros, basic datastructures, vars, they all work great. Ring was also an enlightenment.

Yet, overall the language doesn't live up to its promise of being a modern Lisp: actually, it's not a Lisp at all. Well, how do you define Lisp? S-expressions, macros, closures etc? Surely, all (or at least most) of those features may be present, although Dylan was a Lisp without s-expressions, for example. But, in my opinion as a Lisp programmer, what makes a true Lisp is dynamicity, simplicity and putting control in the developer's hands (flexibility and extensibility). The mentioned features are mostly a derivative of these principles, and they should all have a backend for extension: reader, compiler, condition system, object system, special variables etc. all do in Lisp. And Clojure gives up on all these principles, if not altogether, then substantially: many of the features are there and new ones have arrived, but the backend is missing.

Common Lisp is well optimized for the common use-cases in the design space of dynamic flexible languages. And it's probably very close to a local maximum in it. At least it's good enough. Clojure is more of a hybrid, and hybrids are built by complecting: in Clojure's case complecting Lisp and Java, Lisp and Haskell. The result is interesting, but, in my opinion, it's not optimal. Surely it has its use-cases.

Still, Clojure sees decent adoption (which is good, because it proves, that parens aren't a problem after all :) I think it's a combination of several things. The language still has a lot of good stuff from Lisp (as well as some good things from Haskell, although they don't match very well), and those who start using it are mostly Java and Ruby programmers, for whom it breaks the psychological barriers to using Lisp (we all know the FUD). The other thing is lack of a decent native (not like Jython or JRuby) dynamic language on the JVM. And finally, there's marketing, i.e. Clojure concurrency primitives. And the challenge for Clojure community is to produce a "killer" application, that will utilize them, or the hype will wane, and pretty soon...

68 comments:

gtrak said...

"hybrids are built by complecting: in Clojure's case complecting Lisp and Java, Lisp and Haskell."

In the 'Simple Made Easy' talk, Rich made an analogy of a woven castle vs a lego castle. I think clojure is much closer to a lego castle than you imply in the quote. It's composing, not complecting, some basic principles from Java, Lisp and Haskell, and this gives you choices and flexibility. For example, if you don't want laziness, there is no need to use it, for instance, you can use a vector, run it through 'seq' and all the functions that rely on seq abstractions will work on it.

Vsevolod Dyomkin said...

If it was composing, I wouldn't see Java stacktrace interleaved with Clojure one. For example.

Regarding vector and seq: yes, it works that way. I don't mean, that it's completely unusable, no. But the default is lazy, so you have to apply some extra effort to get non-lazy behavior (with seq, doall etc). While, I'd argue, that the default use case is eager. At least for me it was always that way. You also get lazy sequences in Lisp or Python, for example. And it is, usually, harder to work with them, than with eager ones. But that's fine, if the default use case is eager...

Michael Campbell said...

You are talking about a multitude of different things. gtrak is talking about composing at the language level, you're talking about composing at the debug output layer (I honestly can't even begin to make the leap you're grasping at, there.)

Then you jump to default lazy vs. default eager. While it might be easier *FOR YOU*, that says more about you than the language, (and not necessarily in a bad way.)

Haskell is almost violently default lazy also, and yet the Haskell guys seem to have the complete opposite opinion of what it "should" be; who's right? Perhaps no one, but again, that's a "what you're used to" issue, not a language one.

gtrak said...

Yes, the platform integration isn't what I was talking about though :-). It makes some tradeoffs but if you're using a seq and you care about the details of its realization, you're no longer in a realm of purely functional programming. Haskell takes it further but I think it's the same idea. I think in general it's not always clear what's complex and what's simple, but I pointed out an obvious issue. You were talking as if clojure is a hybrid of multiple systems weaved together in a complex way, but that wasn't its intent. It's instead a language created to help people get stuff done, manage state and do it with performance in mind.

Vsevolod Dyomkin said...

Yes, the Haskell approach is opposite to Lisp's. That's why they match poorly :)

Regarding stacktrace, I'm talking about the level of interleaving. SBCL is implemented in C, but you don't see the C stack when you encounter an error in your Lisp code. In Clojure you actually deal with the Java stacktrace not only as an implementation, but as a concept. You deal with Java exceptions (up until very recently Clojure even used checked exceptions, which is a very good example of interleaving: a very static check in a dynamic language). As I've said, you have to deal with Java calling conventions and many other Java things. Clojure more often just embraces them, rather than building on top of them. And they don't play together very well with dynamic interactive Lisp concepts, which is why using the Clojure REPL is at times more daunting than it could be.

Like sometimes you :use a symbol in ns definition from some Clojure library, load the code, and then decide, that you want to define a function in your ns with this symbol. You reload the code and get an exception. Why should that happen? And there are lots and lots of such small inconsistencies, which arise from poor matching of Lisp's dynamic and Java's static features...

And if you look at the stacktrace, do you see the line of code, where the error happened? Or at least in which file? This is not Clojure stacktrace, it's purely a Java one.

trimr already refers to: #'clojure.string/trimr in namespace: test.corpus
[Thrown class java.lang.IllegalStateException]

Restarts:
0: [QUIT] Quit to the SLIME top level

Backtrace:
0: clojure.lang.Namespace.warnOrFailOnReplace(Namespace.java:88)
1: clojure.lang.Namespace.intern(Namespace.java:72)
2: clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:427)
3: clojure.lang.Compiler.analyzeSeq(Compiler.java:5369)
4: clojure.lang.Compiler.analyze(Compiler.java:5190)
5: clojure.lang.Compiler.analyze(Compiler.java:5151)
6: clojure.lang.Compiler.eval(Compiler.java:5428)
7: clojure.lang.Compiler.load(Compiler.java:5857)
8: clojure.lang.RT.loadResourceScript(RT.java:340)
9: clojure.lang.RT.loadResourceScript(RT.java:331)
10: clojure.lang.RT.load(RT.java:409)
11: clojure.lang.RT.load(RT.java:381)
12: clojure.core$load$fn__4511.invoke(core.clj:4905)
13: clojure.core$load.doInvoke(core.clj:4904)
14: clojure.lang.RestFn.invoke(RestFn.java:409)
15: clojure.core$load_one.invoke(core.clj:4729)
16: clojure.core$load_lib.doInvoke(core.clj:4766)
17: clojure.lang.RestFn.applyTo(RestFn.java:143)
18: clojure.core$apply.invoke(core.clj:542)
19: clojure.core$load_libs.doInvoke(core.clj:4800)
20: clojure.lang.RestFn.applyTo(RestFn.java:138)
21: clojure.core$apply.invoke(core.clj:542)
...

> It's instead a language created to help people get stuff done, manage state and do it with performance in mind.
This sounds awesome, but can be neither verified, nor refuted. ;)

gtrak said...

This is kind of a tiresome argument we're having, but I just want to clarify, whether it achieves its goals is debatable, but the intent is also important and informative to the discussion. Your expectations do not line up with the design intent, so they're somewhat silly. The rationale for clojure is important for many applications, and judging it against what it sets out to do is more useful, IMO. Otherwise the argument is a straw man.

http://clojure.org/rationale

Vsevolod Dyomkin said...

> Your expectations do not line up with the design intent, so they're somewhat silly. The rationale for clojure is important for many applications, and judging it against what it sets out to do is more useful, IMO.

I agree with you and, actually, have mentioned that in the article in some way. But you may also say, that I was evaluating, how the points from the rationale play together (from my perspective, surely).

skuro said...

I might be terribly wrong, but asserting that there's no real concurrency issue in the single machine scenario sounds like a gross mistake. My roots are in the Java world, and I can assure you multithreading *is* an issue, for which the Java answer has been "try to avoid it as much as you can" for really long time.

You can sort out shared mutable state with an RDBMS or a NoSQL store if you want, the same way you can open a door with a bazooka and close it by replacing it: is it convenient? I honestly find Clojure STM quite a handy library, and not for databases: to simply ease concurrent programming.

It might be because of I have no real background in other Lisps, but most of your arguments are more of a personal opinion rather than real language issues (I personally find arguments vectors as a better visual hint for parameters, destructuring seems fair enough when it comes to handle multiple return values, ...).

The real issue I completely agree with you is about TCO. It's indeed due to the decision of being hosted on the JVM, and it's imposing a limit on the language, for which I too find loop/recur and trampoline more of a workaround (while in general I don't mind writing recursive loops).

I also argue the tradeoff is fair: I guess one of the major points of Clojure that helped it reach the popularity it has right now *is* that's hosted on the JVM, giving direct access to all Java libraries on the planet. It seems to me that benefits are proving themselves to be more than the shortcomings.

Vsevolod Dyomkin said...

@skuro I take more of a middle-ground approach here. I like Clojure STM as a library, but I don't like it imposing restrictions on a whole language. The same with loop/recur - it's a great idea, because it implements a very common pattern. But it doesn't fit 100% of cases, not even 50%.

And speaking about the JVM: although many people praise it so much, I don't see anything special about this platform. We run it in production, and it leaks memory like every other thing, it doesn't properly close sockets and so on. Neither does it have some outstanding performance. And considering how it's governed by Oracle, I think in 10 years it will, probably, be irrelevant. And there's nothing special about Java libraries: there are good ones in some areas, in others - not so much (e.g. Python and Ruby libraries for web stuff are much more advanced). And yes, there's no decent language on the JVM right now, so many people are in constant search (some find Clojure and it fits their needs, while it disappoints others).

skuro said...

If you're really saying that immutability is required only because of STM, I honestly see it the other way around: immutability is incredibly desirable by itself, and STM is needed because you require shared mutable state every now and then.

As per the reasons why immutability is desirable, that's the state vs identity topic, about which there's plenty of discussions around the web.

Vsevolod Dyomkin said...

Actually, what's desirable is referential transparency. Also there can be a valid argument, what should be the default: mutable or immutable. But completely disallowing mutable is extreme. I know, that it's a general FP approach, and I've just tried to show in this article how it adds complexity. (Just another opinion point in the general debate, so to say)

skuro said...

About the JVM as a platform, there are some features that make it a good languages host, but it's IMO the humongous community and number of libraries around it that gives it real value.

I'm not saying the JVM is the silver bullet, but if you were to write a new language, if you decide not to be hosted on some other runtime you are most likely going to be incompatible with any other language, and if you instead decide to be hosted, the JVM is probably (one of) the best choice.

skuro said...

Mutability is not really forbidden by the language:

- transients are there for pure clojure mutable state http://clojure.org/transients

- any time you use interoperability with Java you're dealing with a mutable world

You might argue that interop as a mean to achieve mutability is like cheating, and that would be partially true. The thing is, the choice of being hosted on the JVM, and the whole interop syntax, makes it an intimate part of Clojure DNA.

Vsevolod Dyomkin said...

- transients are not for mutability, but for better performance (I've tried them)*
- using interop for mutable state is the example of (extreme) complecting I was pointing to

* From the docs: Note in particular that transients are not designed to be bashed in-place. You must capture and use the return value in the next call. In this way, they support the same code structure as the functional persistent code they replace. (and I'd add: only the same code structure)

skuro said...

The fact that they're not designed for mutable state comes from the original philosophy of Clojure, that mutable state is evil. Given this postulate, performance is the only valid concern for transients. This is not changing the fact that transients implement pure clojure mutable state.

For interop being complect, you're probably right, point taken.

oap said...

The following statement is valid only for small companies and small projects: "on the single machine I've seen very few problems". Concurrency issues exist in the enterprise apps for years and are extremely hard to reproduce or spot.

Managing synchronization in Java is similar to managing memory in C. Hard, but possible.

Consider the simplest real-world case - a new mid-level programmer is joining your team. Are you confident that she will hold and release locks in the right order. Will she avoid data races when adding a new small feature?

oap said...

> yet this problem is easily solvable with Redis, for example

when you are accessing an external system, that implies the context switching. What is the speed of the same operation (conditional modification a couple of related counters) done through Clojure STM and Java+Redis?

oap said...

Just curious - would you advocate using "ABCL 1.0" on the JVM platform instead of Clojure?

Vsevolod Dyomkin said...

@oap Locks are hard, no question here. I'd say they are appropriate only for very simple cases. Clojure refs/agents are also not very simple, I should add. But, surely, I'd recommend using them over locks when you are faced with the appropriate problem. But also don't forget to look at lock-free data structures and data parallelism, which are more appropriate for some shared-state concurrency problems.

Regarding ABCL. In general, I wouldn't recommend it over Clojure, because it's not as mature (not to say that Clojure is very mature itself). Yet, if you are constrained to use the JVM, I would still use Java now, not Clojure, unless your team has a strong FP background or your project needs the full power of Clojure concurrency support. And I'd also use ABCL for some specific supporting tasks and experimentation. But there are other stable Lisp implementations that I'd use instead of Clojure, if I'm starting a new project not constrained by the need to be on the JVM: SBCL, CCL or LispWorks depending on your main target platform (Linux, OSX or Windows).

Regarding speed: if it's an issue, there are also embeddable stores (e.g. tokyo). The positive things about using some separate datastore are, that it's not NIH and it's just a library, that doesn't influence how you do other things. And it can be easily accessed from other language/environment. But surely, there can be cases, in which you have to resort to some internal solution.

oap said...

Thanks for the post! A lot of valid points that arise when Clojure is used in production environment!

oap said...

To me, compared to Java, concurrency in Clojure is made easier by two things:

- all of the Clojure collections are immutable and persistent. So that you are free from choosing and implementing such a collection in Java on your own.

- just as you mentioned, Clojure facilitates the usage of FP. Having less states in your system, makes it harder to share a state in the wrong way


You are definitely right that some flexibility has been sacrificed to make it work this way.

gtrak said...

Regarding using interop and transients for mutable state, in the 'Simple Made Easy' talk, variables are a complection of value and time. I think the real purpose of clojure is to make the simpler(better) choice proportionally easier, so I would hate to see mutability become easier :-). Bad things should look ugly.

gtrak said...

Having sensible defaults becomes more valuable when you're bringing on a new programmer, for instance. But the control is still there if you need it.

ivant said...

I think you're looking at Clojure from Common Lisp perspective and judge the language without actually knowing it. For example your argument regarding macros in Lisp-1 is more applicable to Scheme than to Clojure (read about syntax quote here[1]).

Also "modern" can mean "contemporary", not necessarily "fashionable". Clojure is a modern dialect of Lisp in the sense that it's not an implementation of CL or Scheme or something. This is for a good reason: better integration with the JVM (and not just to be different).

Also emacs handles brackets and curly braces just as well as parentheses. Your example may be good for tweeting, but that doesn't make it right.

Frankly, most of your arguments aren't very good. Ask somebody to read your blog post to you and you'll see what I mean.

[1] http://clojure.org/reader

Vsevolod Dyomkin said...

@ivant I've actually used the language in practice for nearly half a year on a real-world project alongside a big Java application (I've mentioned that at the beginning of an article). And believe me, I was very interested, that it worked well and I could continue using it.

Common Lisp and Scheme are also quite contemporary ;)

> Also emacs handles brackets and curly braces just as well as parentheses.
In CL if I have unbalanced parens somewhere, I just add/remove the last one. In Clojure this doesn't work anymore, because of interleaving of block-closing parens and let/binding/... argument list-closing brackets. This problem I encounter almost every day. Maybe, you could tell me a solution?

Vsevolod Dyomkin said...

@gtrak Once again, transients are not for mutable state (read the docs). And actually local mutable state is not a bad thing at all. Variables are a complection of value and time for a reason: actually, all algorithms (not pure functions, of course) complect values and time and variables are a tool to do that in the simplest manner. If you remove variables, you don't remove the need to solve the problem, you just make it more complicated.

skuro said...

Transients *are* local mutable state, no matter what the documentation says. As long as assoc! changes the very same data structure instance I pass to it, I call that data structure mutable.

The documentation states a strong opinion in Clojure: mutable state is evil, and you should refrain from using it as much as you can.

Then, my personal solution to balancing parens is Emacs+Paredit. I would add that without proper highlighting and/or IDE support I would feel lost and abandoned in parens, which might support your point of view.

Then, having a hundred closed parens and brackets might be a first hint of good refactoring chances.

Vsevolod Dyomkin said...

@skuro If you can't do in-place assignment, you can't use transients any other way, than you can use Clojure's persistent data-structures. So they don't support the same programming approaches, that are supported by ordinary mutable variables.

gtrak said...

In my compilers class we talked about translating code into an intermediate Static Single Assignment form, as this simplified many optimizations. I think this is an argument for immutability. If you have to jump through a hoop to mutate things, you're less likely to create problems for yourself. Just use an atom if it's what you really want. Use something else if performance is an issue. It's an intentional, single hoop whose benefits outweigh the costs.

Robert said...

I have looked for a scripting language on top of the JVM, on and off. One alternative I didn't see mentioned (as a point of comparison) in this discussion is Per Bothner's Kawa Scheme on top of the JVM. It seems more highly optimized for scripting the JVM than either ABCL or Clojure (this is not to slam Clojure; just to say it's aimed more at generating full programs than scripting).

Any opinions about that contrast? I don't know how the Scheme bias towards recursion fits onto the Java bias against recursion.

Pascal Costanza said...

What I'm a bit worried about is that, once people figure out that Clojure is not that great after all, they draw wrong conclusions about other Lisp dialects and start spreading prejudices about them. (You have that kind of interaction to some extent already between Scheme and Common Lisp, that some people who know one language already falsely assume they know a lot about the other one as well.)

It's good that you wrote this blog posting, because it helps prevent that effect.

eugu said...

for some reason, that escapes me, let (and defn, and many others) uses vectors for argument lists

AFAIR rationale for this is that parens are mostly used for what is function/macro call and square brackets for data.

ivant said...

I'm using paredit mode for emacs. Also I don't have more than 5 or 6 closing parentheses in a row so it's normally not such a big deal anyway.

eugu said...

i meant when representing code as data structures

demoss said...

This is really beside the point, but in the interest of accuracy: "SBCL is implemented in C" is untrue. SBCL's runtime is in C, which basically consists of the garbage collector and signal handling stuff. Everything else is in Common Lisp, compiled to native code.

(Back to topic.)

eugu said...

For Vim users surround.vim is helpful.

Vsevolod Dyomkin said...

@gtrak > In my compilers class we talked about translating code into an intermediate Static Single Assignment form, as this simplified many optimizations. I think this is an argument for immutability.
This is an optimization, performed by compilers. Maybe programmers shouldn't be forced to do machine's job?

@Robert Unfortunately, I can't say anything about Kawa.

@Pascal that was one of the reasons I took to writing this article

@nikodemus I know, that was a little bit of an over-simplification. Yet, I don't think, that it changes anything, does it?

gtrak said...

Actually, SSA helps programmers write new optimization passes more simply. Similarly, you can write code that interacts with immutable code more simply. The transformation from mutable code to immutable can be automated (only locally), but it's really not that big of a deal. If there existed a mutable let, I don't think I would use it.

gtrak said...

In java, I re-use variables for multiple values in cases of aggregation, counting, and iterating. In clojure, I have map/reduce. It's much simpler. Is there another use-case?

Dmitry said...

Clojure addresses only shared-state concurrency on a single computer

For distributed computing there is clojure-hadoop, and something called Avout has also appeared recently.

Deepak Surti said...

No MOP [CLOS] in clojure sucks.

Anonymous said...

You brought up some correct points, however, I mostly have to disagree. Lots of things that you mention are of theoretical nature, which have no practical limitations. Over three years I am now doing professional Clojure development (probably was one of the first who professionally used Clojure, at least in Germany), and prior to this I did six years of CL (over one of it was professional development). In these three years I never missed tail call optimization. This sounds purely theoretical, at least in the domains where I work. Instead of writing (my-fn x y z) I write (recur x y z). Where is the practical problem with that? How does this reduce the productivity of developers? What project deadlines can you not hold?
Computers themselves have limits which are far more relevant than this TCO thing. I would like you to provide factual evidence of how the lack of this feature in Clojure reduces the productivity of dev teams.
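
A minimal sketch of the substitution described above: the self-call in tail position is written as recur, which rebinds the arguments instead of making a new call. The function name sum-to is hypothetical, chosen only for illustration.

```clojure
;; Hypothetical example: sum the integers 1..n using an accumulator.
(defn sum-to
  [n acc]
  (if (zero? n)
    acc
    ;; Writing (sum-to (dec n) (+ acc n)) here would consume a stack frame
    ;; per step on the JVM; recur rebinds n and acc and jumps back to the
    ;; top of the function instead.
    (recur (dec n) (+ acc n))))

(sum-to 100000 0)
```

Note that recur only covers self-calls in tail position; a tail call to a separate function still consumes stack.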

Your )))))))])))))])]) example is another such fabrication. How often per year do you stumble upon such a construct in the real world? Two times per year it will cost you 2 minutes to dive through Emacs’ highlighting to solve this. But in exchange you gain every day increased readability, because the eye can stick to () and [] and {}.

Lisp-1 is more readable than Lisp-2, and even heavy use of functional programming is not required.

Your (Map Key) vs. (gethash Map Key) example is flawed, because it unfortunately is (gethash Key Map). And there are tons of other inconsistencies in CL that Clojure finally threw away, which cleaned up the language and makes code more readable.

I agree with you that it would be nicer to be able to have more control over pmap. But good that there is such a construct, while in CL there isn't. And lots of things in CL can also not be controlled so nicely in general.

About killer apps: CL did not deliver too many of those, although it had decades of time, while Clojure showed very nice developments in only three years. The trend is obvious.

The STM: it is not so deeply integrated that you couldn't use some few other STMs when you do Clojure development. Just provide your own version, put them online at Bitbucket and others can use it. It is just one STM that ships directly with Clojure. CL does not even ship one.

For distributed problems there already are loads of helpful libs, you listed some. Those are lib addons, and don't belong into the language itself. The CL world lacks things such as Terracotta & Co.

You criticized Leiningen and the build tools, while Clojure is way ahead of all what CL can offer. Quicklisp is nice, but it still lacks versions, without which one can nearly not develop professionally. You will have to use subrepos (in hg for example) and do the work yourself.

So, for professional development I would always go with Clojure. And in private too, because it feels much nicer and is decades more modern than CL.

Anonymous said...

@skuro: transients are mutable state, but one that can only be mutated in one thread. So the concurrency problems can not apply here, because you are the only user of a resource.

skuro said...

@kury I know, that's why I called them *local* mutable state. And it's quite nice that they remain local, as all the pitfalls of concurrent access are avoided at once. I'm happy for the OP that he's not struggling with it, but I personally do care for multithreaded concurrency issues.

I brought transients into the discussion to counter the statement that mutability was completely banned from Clojure, which I find simply untrue because of transient, interop and STM.

Vsevolod Dyomkin said...

@kury

> You brought up some correct points, however, I mostly have to disagree. Lots of things that you mention are of theoretical nature, which have no practical limitations.

But I encountered them in my daily practice of using Clojure. I've honestly tried to become friends with it and help my co-workers start using it.

> In these three years I never missed tail call optimization.

My problem isn't with lack of TCO. Actually, loop/recur pattern is quite good in itself (although, sometimes you'd prefer to have a separate function, which isn't possible without TCO). What I very much miss are imperative loops, which amount to 30-50 percent of my loops (and loops are essential). I mean imperative loops, that manipulate some local state, that isn't possible in Clojure.

> Your )))))))])))))])]) example is another such fabrication. How often per year do you stumble upon such a construct in the real world?

Actually, it's not my example - it's from the Python world. I'd say every week at least I had to deal with it. If you do some involved binding in let using mapping and so on, you would encounter that. But I agree, that it's a very minor issue. But the whole point was, actually, a counter-point to Rich Hickey's mention in his talk of '() and () being interleaving (which makes no more sense).

> Lisp-1 is more readable than Lisp-2, and even heavy use of functional programming is not required.

This is purely subjective. For me - the opposite. I'm sometimes confused, when I see in function position something, that resembles a (data) variable. I'm never confused in CL with this.

> Your (Map Key) vs. (gethash Map Key) example is flawed, because it unfortunately is (gethash Key Map). And there are tons of other inconsistencies in CL that Clojure finally threw away, which cleaned up the language and makes code more readable.

Yes, Clojure cleaned up some bits of CL, but also introduced a lot of its own inconsistencies. And it just omitted tons of useful functionality from CL.

> I agree with you that it would be nicer to be able to have more control over pmap. But good that there is such a construct, while in CL there isn't. And lots of things in CL can also not be controlled so nicely in general.

So how long does it take to create pmap in CL? (Hint: here you can find one of the variants - http://marijnhaverbeke.nl/pcall/). It's totally a library function. And being a library function you get the benefit, that when you don't like an implementation, you can use any other. In Clojure you also can, but it's not very good to redefine the language core, don't you agree?

Vsevolod Dyomkin said...

> About killer apps: CL did not deliver too many of those, although it had decades of time, while Clojure showed very nice developments in only three years. The trend is obvious.

It's a pity, that you view this post as antagonistic between Lisp and Clojure. Lisp may or may not have its killer apps (I would say, Emacs is one such: and you should know, that elisp is CL's mostly incomplete copy), but that doesn't affect Clojure. What affects Clojure's adoption is that people, who are looking for solutions to concurrency problems, will see RabbitMQ as an example of effective solution in Erlang, but won't find such examples in Clojure...

> The STM: it is not so deeply integrated that you couldn't use some few other STMs when you do Clojure development. Just provide your own version, put them online at Bitbucket and others can use it. It is just one STM that ships directly with Clojure. CL does not even ship one.

The point was, that you can't control Clojure STM. This is unlispy, in my view. That's all.

> For distributed problems there already are loads of helpful libs, you listed some. Those are lib addons, and don't belong into the language itself. The CL world lacks things such as Terracotta & Co.

Surely. Was I saying, that CL is positioned as the best language to solve hard concurrency problems? The question was about Clojure.

> You criticized Leiningen and the build tools, while Clojure is way ahead of all what CL can offer. Quicklisp is nice, but it still lacks versions, without which one can nearly not develop professionally. You will have to use subrepos (in hg for example) and do the work yourself.

Actually, versions are not an issue of quicklisp, but of ASDF. And it supports versions, although not to the desired extent (only equal comparisons of versions, not greater/lesser comparisons). But that is quite enough to develop professionally, because, as you know, two versions of the same system can't be loaded into one image anyway. Neither in CL, nor in Java (and Clojure consequently). So if you specify exact version dependencies in ASD files, quicklisp will do the right thing for you.

What I didn't like about leiningen is that it works like a black box. And you can't trust it: a couple of times it wiped the contents of my lib/ folder, which had some java libraries in it, that appeared there before leiningen was even installed. The other issue was with the ability to specify multiple java source paths, which was fixed some time ago, but too late for me to find out about that (and there's still no mention of that in the docs).

> So, for professional development I would always go with Clojure. And in private too, because it feels much nicer and is decades more modern than CL.

I've expressed my view here. You have your own, and you are free to boast about it in your blog. To make an informed decision people need to consider different views.

In the project I've mentioned (which was not heavy on concurrency, but heavy on algorithms) Clojure didn't work for me. I had to complete it in Java, because Clojure gave no outstanding benefits, while it was very hard to work with for the other members of the team, who, unlike me, had neither a Lisp background nor the desire to overcome the various hurdles posed by the language.

Vsevolod Dyomkin said...

And, once again, to all, who call transients mutable local state, please translate into Clojure code, using transients, the following simplest example from CL:

CL-USER> (let ((a '()))
           (dolist (b '(1 2 3))
             (setf a (cons b a)))
           a)
(3 2 1)

Jochen H. Schmidt said...

It's as always with hypes - it's not the fault of the hyped thing. Clojure is a cool project and a versatile tool. The "Next Lisp", as the hypers have seen it, it ain't. It will perhaps be just yet another Lispy JVM language; a good one even if it is a relatively unlispy one.

Clojure has a difficult stand:
Only time will show if Non-Lispers really go beyond its still Lispy Syntax and its own self grown oddities, or if another language has to come which delivers the Clojure cool-aid in a more traditional package.

Lispers, particularly Common Lispers and Schemers, will not find a complete replacement in Clojure. It misses important bits of those languages.

My gut tells me, that Clojure is here to stay, but its influence will be more used in other languages than within itself. But hey! That's just my guts talking.

Sam Aaron said...

"...actually, [Clojure is] not a Lisp at all"

Yet another article that declares Clojure to not be a "Lisp" without providing a strong argument as to why. In this case the argument starts:

"But, in my opinion as a Lisp programmer..."

So it seems that not only is this argument clearly driven by opinion, it's self-referential - you have to already be a Lisp programmer to be able to differentiate Lisps from non-Lisps.

I'd have much preferred to read an argument about which features of Clojure have merit and where the author believes Clojure to be lacking - purely from a language standpoint without having to resort to "Lisp variant A is not a True Lisp therefore not as good as Lisp" nonsense.

Vsevolod Dyomkin said...

@Sam I have clearly stated the principles, that a Lisp variant should follow: dynamicity, simplicity and putting control in developer's hands (flexibility and extensibility). And tried to show with concrete examples, how they aren't followed by Clojure in critical parts of the language. And, as it turned out, I've examined mostly the same areas, that are covered in Clojure rationale (http://clojure.org/rationale), so, indeed, those are critical parts of the language.

Surely, the whole article is opinionated and grounded in my experience. And, I think, we should admit, that this is how every argument is in software development. For me "purely language standpoint" is just a spherical cow, because, probably, any feature can have merit from some specific use-case or point-of-view, so if you don't declare your principles first, everything else is just word jiggery.

gtrak said...

@Vsevolod Dyomkin

per your example:
"CL-USER> (let ((a '()))
(dolist (b '(1 2 3))
(setf a (cons b a)))
a)
(3 2 1)
"


Here's how you do it in clojure:
=> (reduce conj '() '(1 2 3))
(3 2 1)

Show me another :-)

Vsevolod Dyomkin said...

@gtrak Oh I see, you like trolling. Show me some of your public code (have you heard of github?), or I won't waste any more time answering you.

gtrak said...

There is stuff in the works to support mutability more flexibly and safely. Rich Hickey spent some time talking about "Pods" at the last clojure-conj, they basically let you safely take advantage of the performance gains of transients from multiple threads.

http://kotka.de/blog/2010/12/What_are_Pods.html

But, I personally think the use-case for mutability is very rare. I'd like to see some examples where a mutable, imperative example cannot be made more concise and simpler with a functional approach, or cannot be handled by clojure's iteration functions (doseq, for)

gtrak said...

I do enjoy trolling, but in this case I just didn't see you asking for an implementation with transients until I had submitted already. My goal is to learn new, better approaches, but if the conversation's not going anywhere I'll take my efforts elsewhere, too.

Vsevolod Dyomkin said...

@gtrak ok. Actually, there can be no implementation of this exact pattern in Clojure, neither with transients, nor with anything else.

> But, I personally think the use-case for mutability is very rare. I'd like to see some examples where a mutable, imperative example cannot be made more concise and simpler with a functional approach, or cannot be handled by clojure's iteration functions (doseq, for)

This is purely hypothetical. If you engage in some projects, that require implementation of complex algorithms, you'll soon see the need for it...

gtrak said...

Ah, I just tried to do it with transients, turns out they're not implemented for lists, which makes sense, since they're just cons cells and there wouldn't be a performance benefit.

From the clojure site: "Note that not all Clojure data structures can support this feature, but most will. Lists will not, as there is no benefit to be had."

Conj'ing onto a vector appends to the tail, since that's the end that grows.

The closest we can get for this example is loop/recur, which really isn't that bad.

(loop [in '(1 2 3) out '()]
  (if (empty? in)
    out
    (recur (rest in) (conj out (first in)))))

But this example in general goes against what clojure's optimized to do, lazy sequences/streams and the relevant functions.

Vsevolod Dyomkin said...

@gtrak yes, even if you take vectors, you won't be able to do the same with them. Your only option is loop/recur (and that was what I mentioned in the article). And there are cases, when it's far from the best pattern: like, when you have to accumulate data in maps, for example. Actually, it's pretty hard to show it on toy examples, but when you encounter real-world problems, you can have massive loop/recur loops, that can span more than a page of code (actually, a couple of nested loops). This is when it gets really nasty...
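
To make the pattern under discussion concrete, here is a toy sketch of accumulating into two maps with loop/recur. The data and the names (index-words, counts, by-length) are invented for illustration, and, as said above, a toy example cannot show how this scales to page-long nested loops.

```clojure
;; Hypothetical example: build a word-count map and a length index in one pass.
(defn index-words
  [words]
  (loop [remaining words
         counts    {}
         by-length {}]
    (if (empty? remaining)
      {:counts counts :by-length by-length}
      (let [w   (first remaining)
            len (count w)]
        ;; Every piece of "state" has to be passed explicitly to the next
        ;; iteration; nothing is updated in place.
        (recur (rest remaining)
               (assoc counts w (inc (get counts w 0)))
               (assoc by-length len (conj (get by-length len []) w)))))))

(index-words ["a" "bb" "a"])
```

Each additional accumulator adds another binding to the loop head and another argument to every recur, which is where the growth in size comes from.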

Felix Breuer said...

Thanks for the post! I am still on the fence deciding whether using Clojure has made my programming life easier or not (and whether to use it for my next project). So your perspective as a Lisper was very valuable!

skuro said...

@Vsevolod Dyomkin your CL code wasn't translatable into Clojure + transients for the following reasons:

- transients in Clojure are indeed limited (e.g.: there's no transient for lists)
- vectors append to tail (and I assumed order matters)

Maps are different beasts, thus:

user=> (let [a (transient {})]
         (doseq [b {:a 1 :b 2 :c 3}]
           (apply assoc! a b))
         (persistent! a))
{:a 1, :b 2, :c 3}

Or would your point be really about order and reversal even in case of maps?

Vsevolod Dyomkin said...

@skuro Thanks a lot, that was what I was looking for. Strange, that I couldn't accomplish the same with transients myself - as it's quite simple... This makes Clojure a little bit more usable, than before.

gtrak said...

I think you should be careful there, it's noted that transients aren't designed for in-place bashing, you're still expected to use the result of the function in the next call, not the initial reference. In my minimal testing, I haven't found an issue but some coworkers have had problems trying to use it that way. It's definitely not guaranteed to work, though.

gtrak said...

a source: http://osdir.com/ml/clojure/2010-11/msg00399.html
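
A minimal sketch of the usage described in the warning above, in which the value returned by each assoc! is passed on to the next call instead of re-using the initial reference. The function name pairs->map is hypothetical.

```clojure
;; Hypothetical helper: build a map from [key value] pairs via a transient.
(defn pairs->map
  [pairs]
  (persistent!
    ;; reduce threads the transient returned by assoc! into the next step,
    ;; so the code never relies on the original reference being updated.
    (reduce (fn [t [k v]] (assoc! t k v))
            (transient {})
            pairs)))

(pairs->map [[:a 1] [:b 2] [:c 3]])
```

Written this way, the code keeps the same shape as the persistent version with assoc, which is how transients are documented to be used.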

gtrak said...

Sorry for so many comments, last one I promise :-). Atoms are quite fast in single-threaded code, since they only have to try once to swap. I think a combination of atoms and transients might be the best we can do to imitate mutability until pods come along. About 100x slower than doing nothing and only ~20x slower than a simple increment.

user=> (time (let [_count (atom 0)]
(dotimes [a 10000000] (swap! _count inc))))
"Elapsed time: 1292.590795 msecs"
nil

user=> (time (let []
(dotimes [a 10000000] (inc a))))
"Elapsed time: 81.46009 msecs"
nil

user=> (time (let []
(dotimes [a 10000000] nil)))
"Elapsed time: 20.368163 msecs"
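
For comparison, the CL accumulation example from earlier in the thread can be imitated with an atom holding the list. This is only a sketch, not a claim about the idiomatic solution: it keeps the imperative shape of the original at the cost of an extra level of indirection. The function name collect-reversed is hypothetical.

```clojure
;; Hypothetical example: imperative-style accumulation through an atom.
(defn collect-reversed
  [xs]
  (let [a (atom '())]
    (doseq [b xs]
      ;; conj on a list adds at the front, like (cons b a) in the CL version.
      (swap! a conj b))
    @a))

(collect-reversed '(1 2 3))
```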

Alex Yakushev said...

I want to thank you, Vsevolod, for sharing this piece since there is not much detailed analysis (with disadvantages for the rest) on Clojure around the web. Yet I find most of arguments prejudiced towards CL for just the fact that you are more used to the latter.
The argument that has struck me the most is one about the lack of assignment and therefore inability to write proper imperative code. That is no wonder to me because Clojure is - wait for it - a functional language! I don't see any space for further discussion here, Alonzo Church left us without it long time before. If the idea you are trying to bring forward is that a language should be imperative/multiparadigm to be called a Lisp, then this point of yours is just irrelevant. There is no definition of "Lisp" anyway.
Your attack on the concurrency model in Clojure is purely superficial. You complain about the features STM lacks/doesn't match with your expectations? Well, CL hasn't any at all. Well, you can write them in CL rather quickly, and so can you in Clojure. I don't see this point being a candidate for an argument either.
The only major Clojure drawback I agree with you on is that it allows no custom reader macros. Though this drawback is thought-out and has its reasons. Lisp code is not extremely easy to read and custom reader macros only add to this. Leaving users only with a set of predefined macros to learn eliminates the abuse of alien custom syntax which appeared in every other CL project.
Well, if you tried using Clojure in a big project and that didn't work out, then it's just not for you. Anyway it is good to know that someone makes efforts to introduce Clojure as an industry language. I'm sure your experience will be taken into consideration the next time someone thinks about pros and cons they can get when choosing the language for the next project.

Vsevolod Dyomkin said...

@Alex, the presence of similar opinions and delusions, that you'd expressed, prompted me to write this post. Thank you for sharing them - below are my comments.

> Yet I find most of arguments prejudiced towards CL...

Surely, they are grounded in my CL experience. Yet I wouldn't call them prejudiced, as there's no one true opinion on these topics. I just express an alternative opinion.

> The argument that has struck me the most ... Clojure is a functional language!

This is confusion number 1. A functional language isn't defined as a language that disallows assignment. You confuse it with the term "purely functional language" (of which the only widespread example is Haskell). Most functional languages: Scheme, OCaml and, also, Clojure, support some form of assignment, and thus allow violating referential transparency. So what defines a functional language, as for me, is treating the program as consisting of expressions and not statements, as it is in imperative languages.

> I don't see any space for further discussion here, Alonzo Church left us without it long time before.

Alonzo Church doesn't have anything to do with this discussion, as we aren't talking about lambda calculus theory, but about practice of programming. None of the languages at question implement Church's lambda calculus, although it is possible to reason about some of them in such terms.

> If the idea you are trying to bring forward is that a language should be imperative/multiparadigm to be called a Lisp, then this point of yours is just irrelevant. There is no definition of "Lisp" anyway.

This is confusion number 2. First of all, there is a definition of Lisp, which is even an ANSI standard. And even if you disregard the standard, I was basing my judgement on the history of the language and the views of its community, expressed in numerous books and other materials. Following your train of thought, you could say just the same about any language. Is there a definition of JavaScript? What if someone says, that CoffeeScript is the modern JavaScript or, say, Dart is. And when you'd say, "No, Dart doesn't support prototype inheritance" your opponent would answer: "Is it essential to JavaScript? No, for me the essential features of JavaScript are that it runs in the browser and has automatic memory management". This way you can call anything by any name...

> Your attack on the concurrency model in Clojure is purely superficial...

The argument was, that these features are advertised as very important and essential to the language. Moreover, they are built-in and influence the design of the whole language. Yet I couldn't find a significant benefit in them. Although I admit, that there may be many specific use-cases, that this model will suit very well. But they don't seem general enough for me to justify such a core dependency. This is in contrast with Erlang, which implements a model fundamentally different from the currently existing one. Yet, surely I don't object to having Clojure concurrency primitives as a library and using it.

> The only major Clojure drawback I agree with you on is that it allows no custom reader macros. Though this drawback is thought-out and has its reasons behind.

Surely. But it goes against the established Lisp principles.

> Lisp code is not extremely easy to read and custom reader macros only added to this fact. Leaving users only with a set of predefined macros to learn eliminate the abuse of alien custom syntax which appeared in every other CL project.

I don't know about your experience with CL, but in my almost 5-year experience I always found special reader-syntaxes beneficial for the project. I also don't agree with the notion of Lisp code being hard to read, but it's almost impossible to argue here, because it depends on very basic human perception qualities, so is very subjective.

Vsevolod Dyomkin said...

> Well, if you tried using Clojure in a big project and that didn't work out, then it's just not for you...

I wouldn't have written this article, if not for the fact, that Clojure is marketed as "a modern Lisp". If Clojure was more properly called a new language, borrowing CL syntax and a couple of features, I would've been better informed, and might not have spent my time on it. This article is the result of my experience, written so that other lispers, approaching the language, are able to make a more informed choice.

I'm sure, that many people will find Clojure useful for them, including me, possibly. But I don't like all the prejudices around Lisp and don't want new misconceptions to appear, because people will think, that Clojure is a Lisp. As it is not.