Lisp, the Universe and Everything

2013-01-03

Lisp Hackers: John Fremlin

John Fremlin has created a couple of very performant Common Lisp programs that, on some microbenchmarks, beat the fastest similar software written in any other language, including C: the teepeedee2 dynamic webserver, which managed to break the c10k record on a single-core machine, and the cl-irregsexp regex library. While working at MSI in Japan, he also wrote manardb, an object persistence database for CL. Besides, he writes interesting blog posts on software optimization, programming languages and technology in general.

Tell us something interesting about yourself.: I've been to more than eighty countries; I want to go everywhere!
What's your job? Tell us about your company.: I work at Facebook on the growth team, on data-driven improvements to the sign-up flows.
Do you use Lisp at work? If yes, how have you made it happen? If not, why?: I used to at msi.co.jp. It is a Japanese consultancy based in Tokyo, originally called Mathematical Systems Institute. Mr Kuroda leads the Lisp group there and I think it has hovered around five or six people over many years. He's done a great many very interesting projects for a range of companies over the years: for example a crash-test data inspection tool for a big Japanese car company, text mining, graph visualisation and so on. I worked primarily on building up a visualisation and mapping of a very large set of routers for the world's biggest telecoms, which led to the creation of manardb.

I really enjoyed working for Mr Kuroda and I'm sorry I had to leave for personal reasons. There were always many very fascinating problems around — with great people to discuss them and find solutions. It was a very stimulating workplace!
At Facebook, I use PHP, Python, C++, Java and miscellaneous things. I think we would all be better off if we hadn't balkanised the different systems that we program for — and Lisp is one of the few programming languages with the flexibility to serve in all these roles.
What brought you to Lisp? What holds you?: My initial programming was following Michael Abrash's graphics books and building on his ideas, by doing things like runtime native code generation for drawing dynamically generated bitmaps efficiently. This is not so interesting for modern processors as they have good branch prediction, but the idea of code generation stuck with me, and Lisp is one of the few programming languages that makes this easy and efficient.

I appreciate the intellectual coherence of Lisp, and its sensible approach to numeric computations. In terms of using it today, I feel that Common Lisp has an advantage over many other programming languages in that it has multiple mature independent implementations. Running on multiple compilers tends to greatly increase the quality of a program in my opinion, as the code is exposed to different static analyses.
What's the most exciting use of Lisp you had?: I helped someone use Lisp for an automated trading project.
What do you dislike the most about Lisp?: In trying to make efficient code one ends up fighting against the compiler and the runtime system and most of the time is spent in coming up with clever ways to circumvent and outwit both. This is not a good use of resources, and means that it usually makes more sense to start with C++.
Tell us about your approach(es) to optimizing Common Lisp code (and maybe code optimization in general)?: The most important thing is to try to hold in your head an understanding of where the program is going to spend time. Profilers can be misleading and inaccurate, and it is sometimes difficult to get representative workloads to profile. I think their main utility is in confirming that there is no sloppy mistake (in Lisp, typically, consing accidentally) that prevents you from achieving the natural performance of your approach.

Complexity analysis in terms of computation, network usage, disk accesses and memory accesses is a first step as obviously if you can improve the asymptotic usage of a bottlenecked resource, you will very likely do much better than trying to tweak some little detail. The second step is to try to characterize interactions with caches and, in Lisp, garbage collection, which is pretty tricky.
Among the software projects you've participated in what's your favorite?: I think the one I enjoyed most was an embedded H.264 decoder in 2005. This was for the VideoCore, a really wonderful CPU architecture that could deal with parallelizable problems incredibly efficiently if programmed correctly. It would have been awesome to use Lisp for it!
If you had all the time in the world for a Lisp project, what would it be?: I wish there were Lisp bridges to other runtime systems (Java, Android, Objective C, Perl, Python, C++, R, etc.) so that the libraries and tools for each could be leveraged efficiently in Lisp and vice versa. That would mean being able to call Java code and handle Java objects in Lisp, for example -- perhaps initially by spinning up a Java implementation in a separate process running a CL-SWANK style interface.

I really don't think this would be that difficult and it would make a huge difference to the convenience of building programs in Common Lisp!
Describe your workflow, give some productivity tips to fellow programmers.: I use emacs and I have a bunch of elisp code that I keep meaning to publish!

2013-01-02

"Real" List Comprehensions in 24 Lines of Lisp

I've just come across a post on Hacker News titled List Comprehensions in Eight Lines of Clojure. It's definitely a nice little example. But it also feels kind of unreal, even cheating ;) Because who would really use such list comprehensions? It seems to me that the whole purpose of this construct is to make the code concise and resemble the set-theoretic notation for sets. But here we need to use them inside a list-comp macro. This actually looks more like just another iteration construct.

What you really want from a list comprehension syntax is for it to be able to "comprehend" something like this:

{x|x ∊ some set}

or like this:

{x|x ∊ some set|x < 10}

Just like math.

But, surely, it's impossible to implement such syntax in Clojure or any other language without an extensible reader. The only route you can go is what Python does — to implement it as part of the language itself.

But you can do more if you have control of the reader. This is one of the many cases where Lisp's reader macros prove indispensable.

So here's an implementation of "real" (i.e. really resembling mathematical notation, no cheating) list comprehensions in 24 lines of Lisp (if you don't count a utility function group):

Unfortunately, I had to use || instead of |, because on its own the | is used to escape characters in symbols, and also <- instead of ∊ for ease of typing, obviously. As for the filter part, there's an implicit and, so you can write several conditions, and they should all hold. Otherwise, I think, this can be considered a literal implementation of the idea.
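To make the notation concrete, here is a minimal sketch of how such a reader macro could be built. It is not the 24-line implementation the post refers to: it handles only a single `var <- list` generator, and the names split-on-bars, read-comprehension, *comprehension-readtable* and eval-comprehension are made up for this illustration. It relies on the fact that the standard reader turns || into a symbol with an empty name.

```lisp
;; Hypothetical sketch, not the post's original implementation.
(defun split-on-bars (forms)
  "Split FORMS into sublists at every symbol with an empty name (read from ||)."
  (let ((parts '())
        (current '()))
    (dolist (form forms)
      (if (and (symbolp form) (string= "" (symbol-name form)))
          (progn (push (nreverse current) parts)
                 (setf current '()))
          (push form current)))
    (push (nreverse current) parts)
    (nreverse parts)))

(defun read-comprehension (stream char)
  "Reader macro function for {expr || var <- list || test ...}."
  (declare (ignore char))
  (destructuring-bind (result generator &optional filters)
      (split-on-bars (read-delimited-list #\} stream t))
    (destructuring-bind (var arrow source) generator
      (assert (string= "<-" (symbol-name arrow)))
      ;; The filter forms are joined by an implicit AND.
      `(loop :for ,var :in ,source
             :when (and ,@filters)
             :collect ,(first result)))))

;; A private readtable keeps the new syntax out of the global one.
(defparameter *comprehension-readtable*
  (let ((rt (copy-readtable nil)))
    (set-macro-character #\{ #'read-comprehension nil rt)
    ;; Make } a terminating character, like ), so "10)}" reads correctly.
    (set-macro-character #\} (get-macro-character #\) nil) nil rt)
    rt))

(defun eval-comprehension (string)
  "Read STRING with the comprehension syntax enabled and evaluate it."
  (let ((*readtable* *comprehension-readtable*))
    (eval (read-from-string string))))
```

With this in place, (eval-comprehension "{x || x <- '(1 5 12 7 30) || (< x 10)}") reads the braces as a loop form that collects the elements below 10.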

PS. By using named-readtables instead of plain set-macro-character, this syntax can be enabled on a per-file basis, just like in-package forms.

PPS. I won't discuss here the issues of list vs. sets or lists vs. sequences. They are an implementation detail, worth another post.


2012-12-14

ANSI Common Lisp in Russian

The Russian translation of Paul Graham's book "ANSI Common Lisp" has recently been published, and I had a small hand in it as its "scientific" (technical) editor. The lisper.ru forum has already seen posts from happy owners of the paper edition, and the publisher's website even offers an electronic version at a pay-what-you-want price.

Although I was initially sceptical about the choice of this particular book for translation, I'm now glad it turned out this way. Working on the translation, like it or not, I had to read the book practically cover to cover, and I can say that it is probably the shortest, simplest and most accessible introduction to the language. Practical Common Lisp is more eye-opening and still remains the best book on Lisp overall, but it is substantially bigger. All in all, ANSI CL is a very good option for beginners. And although Paul Graham's style is often criticised in the modern Lisp community, this book is fairly balanced and doesn't contain any apocryphal ideas :)

The book consists of two parts, the smaller of which, the reference, is practically useless given the existence of the Hyperspec. But that's a good thing, as it leaves less text to read :) The first part consists of 13 chapters describing different aspects of the language, and 3 chapters that solve practical problems. The language chapters contain many examples of using various data structures and implementing non-trivial algorithms with them, which can help level up this skill for those who don't constantly solve algorithmic puzzles on Codeforces, especially given the beauty and clarity of these algorithms' implementations in Lisp. Several chapters were quite useful even for me, with my five years of practical experience with the language: for example, I came to truly appreciate the elegance of structs and started using them much more, and the chapters on optimization and on structuring programs were also interesting. The last 3 chapters tackle problems that are classic for Lisp: logical inference, creating your own object system (in effect, implementing the internals of JavaScript), and generating HTML from a meta-language. These are the things that show off some of the language's greatest strengths.

Because of the publisher's problems, work on the translation took a very long time, something like two years. More precisely, the work itself took much less, but its separate parts were divided by long gaps. The translation was done by allchemist, and he did it with spirit and fun. I saw my task primarily as correcting departures from the original and working on the terminology. As for the second point, I'd like to finish with an amusing story about the "stog" and the pool.

Stog and Pool

A couple of years ago Ivan Sagalaev, who served in the same role of scientific editor for the book "Coders at Work", wrote the following about that role:

In case you don't know, a scientific editor is a person who is absolutely necessary for specialised literature. He takes the raw translation and brings its specific terminology in line with what is accepted in the real world. As a result, you read a book which says not "процесс синтаксического разбора" ("the process of syntactic analysis") but simply "парсинг" (the transliterated "parsing"), and not "интерфейс прикладной программы" ("application program interface") but "API".

As applied to Coders, which should read like an adventure novel, I agree with Ivan's approach. But as for books like ANSI CL, intended first of all for (relative) newcomers, I believe the choice should favour maximum clarity of the terms rather than their familiarity to people who are already in the know. That is, of course, not "процесс синтаксического разбора" but simply "синтаксический разбор" ("syntactic analysis") and in places "разбор" ("analysis"), but not "парсинг". Why? If only because, for a newcomer, "парсинг" creates a kind of magical aura around the term and sets it apart from the others, which are named in the native language, although there is nothing special about it. Yes, it is often very hard to find an adequate term in the native language, and sometimes such terms even have to be invented, but that is exactly how terminology develops.

On this subject there were 2 very interesting examples in this book. For the first one you may freely throw tomatoes at me, but I will keep insisting on it. Let's list the abstract data structures we meet most often: they are, of course, лист, три, кью, стек, хип, дек (transliterations of list, tree, queue, stack, heap, deque). Oops... I meant to say: список (list), дерево (tree), очередь (queue), куча (heap), колода (deque, literally a deck of cards) and... стек. Somehow all these structures have ordinary names, while стек is special. Why? Probably out of laziness, but never mind. If you look in a dictionary, you can find 2 quite suitable translations for the English word "stack". The first is стог (stog, a haystack) :) In my view, it is a remarkable case of consonance and, in its own way, a very amusing option. This is what I proposed to use as the term for this data structure, and it survived practically until the last revision, but at the last moment it was replaced after all with the less odious variant стопка (stopka, a pile, as of plates or papers). That is also a good translation, and from the standpoint of matching reality even a more adequate one, so I was satisfied. It's surprising that it appears so rarely in the literature!

But there is one more difficulty: what to do with the program's function call stack, which is no longer an abstract data structure but a concrete technological solution, surrounded by other terms such as "stacktrace"? Here it is certainly much harder, and I settled on the view that in this case, to avoid confusion, it's better to use the established term, i.e. стек. Perhaps, once стопка firmly enters common usage, this term can be carried over here too: стопка вызовов (a pile of calls) sounds banal. But then there's no additional accidental complexity :)

The second term that left me dissatisfied was пул (pool). This case is worse, as there is no adequate Russian translation for it at all. Surely not бассейн (a swimming pool). I never came up with anything. But if you have ideas on the subject, do share...

2012-12-12

Utilitarian Lisp

Here is what a "client" (if such a grand name is appropriate for such a simple piece of code) for the increasingly popular log server Graylog2 looks like in modern Lisp:

In my opinion, this piece of code does a good job of dispelling the myth about problems with libraries in the Lisp world: in our pipeline the message is first serialized to JSON by the cl-json library, then encoded into a byte stream by babel, then zipped by salza2, and then sent through a UDP socket by usocket. There's also an excuse to use local-time, a wonderful library for working with time based on Erik Naggum's article. And there's a little syntactic sugar from rutils, including the literal syntax for hash tables (as in Clojure), plugged in modularly with named-readtables. Nothing superfluous.

2012-12-07

Lisp Books

All Lisp books in one place!

PS. If you know more, drop me a line, and I'll add them.

2012-11-04

cl-redis: Separation of Concerns in Library Design

TL;DR This article describes a "lispy" approach to implementing a spec, connection handling, and namespacing in the cl-redis library with special variables, generic functions, macros, packages, and restarts, and compares it to the object-oriented one.

Redis is a simple and powerful tool that can have a lot of different uses: coordination in a distributed system, message queue, cache, static database, dynamic database for ephemeral data, to list just a few. Seeing such potential, I created the cl-redis Lisp client back when Redis hadn't yet reached version 1.0. A couple of weeks ago version 2.6 was released, which, as usual, added a handful of commands (there are now 140 of them, more than a twofold increase since cl-redis was first released with 64), and a couple of small but sometimes incompatible communication protocol changes.

While upgrading the library to support 2.6, I implemented a couple of improvements to the overall design, which made me rethink its original premises and prompted me to write this article summarizing those points, as they may not be the most mainstream manifestation of the very mainstream and basic principle of separation of concerns.

Anticipating a rapid rate of change to Redis, from the beginning I decided to base the library on the following principles:

uniform declarative command definition, separated from the details of protocol implementation
a sort of TDD approach: adding each new command requires adding a test case for it (which can currently be extracted from the official docs)
implementing each command as a regular Lisp function, exported from REDIS package; and prefixing each command's name to avoid potential symbol conflicts (for instance, get is at the same time a core Lisp function and a Redis command, so it gets defined as red-get in cl-redis)

Declarative Command Definition

Such an approach should be very scalable regarding the addition of new commands. The actions required should be: just copying the definition from the spec, putting parentheses around it, copying a test from the spec, recording and running it. And the protocol changes should have no or very little effect on the commands' definitions. Those were the assumptions, and they worked out pretty well, making it relatively easy to handle all the 76 new commands added over time and to go through the transition from the old protocol to the new binary-safe one (which in the end prompted only one change on the interface level: removing an output spec from the command definition).

This is how a command definition looks in Redis spec:

HMGET key field [field ...]

Available since 2.0.0.
Time complexity: O(N) where N is the number of fields being requested.
Returns the values associated with the specified fields in the hash stored at key.
...
Return value
Multi-bulk reply: list of values associated with the given fields, in the same order as they are requested.

And this is its representation in Lisp code:

(def-cmd HMGET (key field &rest fields) :multi
  "Get the values associated with the specified FIELDS in the hash
stored at KEY.")

The difference from defun is that a return "type" is specified (:multi in this case) and that there's no body: the code that handles communication is generated automatically.
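As a rough illustration of how a body-less definition can still produce a working function, here is a minimal sketch of a def-cmd-like macro. It is not the cl-redis implementation: def-cmd-sketch, *wire* and the stub tell and expect methods are assumptions made for this example, with a list standing in for the socket and canned replies standing in for the server.

```lisp
;; Hypothetical stand-in for the socket: a list of the requests sent so far.
(defvar *wire* '())

(defgeneric tell (cmd &rest args)
  (:documentation "Send CMD with ARGS (here they are only recorded).")
  (:method (cmd &rest args)
    (push (cons (string cmd) args) *wire*)))

(defgeneric expect (type)
  (:documentation "Read a reply of the given TYPE (here: canned values).")
  (:method ((type (eql :multi))) (list "reply-1" "reply-2"))
  (:method ((type (eql :integer))) 1))

(defmacro def-cmd-sketch (name (&rest lambda-list) reply-type docstring)
  "Define RED-<NAME>: send NAME with its arguments, then read a REPLY-TYPE reply."
  (let* ((rest-pos (position '&rest lambda-list))
         (required (subseq lambda-list 0 rest-pos))
         (rest-var (and rest-pos (nth (1+ rest-pos) lambda-list))))
    `(defun ,(intern (concatenate 'string "RED-" (symbol-name name)))
         ,lambda-list
       ,docstring
       ;; The generated body is the same for every command.
       (apply #'tell ',name ,@required ,rest-var)
       (expect ,reply-type))))

(def-cmd-sketch HMGET (key field &rest fields) :multi
  "Get the values associated with the specified FIELDS in the hash
stored at KEY.")
```

Calling (red-hmget "h" "a" "b") then records the request on *wire* and returns the canned multi reply. Because the body comes from one template, a protocol change would touch only tell and expect, not the individual command definitions.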

But still there were some quirks. The biggest one was a small impedance mismatch between how Lisp handles function arguments and how Redis does. It should be said that among all programming languages Common Lisp has the richest function arguments protocol, only matched to some extent by Python. And from the first look at Redis commands it seemed that Lisp would be able to accommodate all of them as is. Yet, Redis' version of a protocol turned out to be more ad hoc, and so for some commands additional pre-processing of arguments was required. For instance, the ZRANGE and ZREVRANGE commands have a WITHSCORES argument, which, if present, should be the string "WITHSCORES". This is something in between Lisp's &optional and &key arguments, and either choice required some pre-processing of arguments. My final choice was to go with &optional, but ensure that whatever non-nil value is provided, it's transformed to a proper string. Still, it was relatively easy to implement, because the Redis interaction protocol is implemented as 2 generic functions: tell for sending a request and expect for receiving the response. This provides the ability to decorate the methods with additional processing or override them altogether for some specific command. In this case a slight pre-processing is added to tell:

(defmethod tell :before ((cmd (eql 'ZRANGE)) &rest args)
  (when (and (= 4 (length args))
             (last1 args))
    (setf (car (last args)) :withscores)))

There are some more involved cases, like the ZUNIONSTORE command, that poses some restrictions on its arguments and also requires insertion of special keywords WEIGHTS and AGGREGATE:

(def-cmd ZUNIONSTORE (dstkey n keys &rest args &key weights aggregate) :integer
  "Perform a union in DSTKEY over a number (N) of sorted sets at KEYS
with optional WEIGHTS and AGGREGATE.")

(defmethod tell ((cmd (eql 'ZUNIONSTORE)) &rest args)
  (ds-bind (cmd dstkey n keys &key weights aggregate) args
    (assert (integerp n))
    (assert (= n (length keys)))
    (when weights
      (assert (= (length keys) (length weights)))
      (assert (every #'numberp weights)))
    (when aggregate
      (assert (member aggregate '(:sum :min :max))))
    (apply #'tell (princ-to-string cmd)
           (cl:append (list dstkey n)
                      keys
                      (when weights (cons "WEIGHTS" weights))
                      (when aggregate (list "AGGREGATE" aggregate))))))

Overall, among 140 Redis commands 10 required some special handling.

Proper Encapsulation

The only drawback of the described solution, or rather just a consequence of it being implemented in Common Lisp, is the somewhat ugly format of Redis commands: red-incr definitely looks worse than r.incr. If the commands were named after their Redis equivalents (incr), it would not be possible to import the whole REDIS package into your application, because of name clashes with the COMMON-LISP package and inevitable conflicts with other packages: these names are just too common. This is where the objects-as-namespaces approach seems to be better than Lisp's packages-as-namespaces. But it shouldn't be so, as Lisp's approach implements proper separation of concerns, not "complecting" things, to use Rich Hickey's parlance.

And it isn't: the solution is actually so simple that I am surprised I didn't think of it until the latest version of the library. It is to define two packages: the REDIS one with all the ordinary functions, and a special package just for Redis commands. I called it RED, because it's a purely syntactic-sugar addition, so it should be short. The RED package should never be imported as a whole. This way we basically get the same thing as before, red:incr, but now it's done properly. You can import a single command, you can rename the package as you wish, etc. So this solution is actually more elegant than the object-oriented one (we don't have to entangle the commands with connection state), and we don't have to sacrifice the good parts described previously.
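The idea of a commands-only package can be sketched in a few lines. This is an illustration under assumptions, not the cl-redis source: the package name sketch-red and the hash-table "server" behind it are invented for the example. The key point is that the package uses no other package, so its get and incr are fresh symbols that cannot clash with cl:get.

```lisp
;; A package that holds only command names. It deliberately uses nothing,
;; not even COMMON-LISP, so GET here is a brand new symbol.
(defpackage :sketch-red
  (:use)
  (:export #:get #:incr))

;; Hypothetical in-memory stand-in for the Redis server.
(defvar *counters* (make-hash-table :test 'equal))

(defun sketch-red:incr (key)
  "Increment the counter stored at KEY and return the new value."
  (incf (gethash key *counters* 0)))

(defun sketch-red:get (key)
  "Return the value stored at KEY."
  (gethash key *counters*))
```

Client code always writes the package prefix, as in (sketch-red:incr "id"), which reads almost like the r.incr of the object-oriented clients while the symbols stay ordinary functions.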

Connection Handling

I also worked with a couple of other Redis libraries, written in Java and Python. Generally, they use a classic object-oriented approach: the commands are defined as methods of a Redis class, which also handles the state of the connection to the server. The resulting code looks pretty neat, it is well encapsulated and at first glance has no boilerplate even in Java:

Redis r = Redis.getClient();
Long id = r.incr("WORKER_ID_COUNTER");
r.sadd("WORKERS", id.toString());
r.hset("WORKER" + id, "queue", queueName);

There's the issue of not forgetting to return resources (close a connection), but in more advanced languages like Python it's solvable with a context manager:

with redis(port=3333) as r:
    id = r.incr("WORKER_ID_COUNTER")
    r.sadd("WORKERS", id)
    r.hset("WORKER%s" % id, "queue", queueName)

What can be simpler?

Yet, below the surface it suffers from a serious problem. In my work I've seen a lot of cases (I'd even say a majority) where the Redis connection should be persisted and passed from one function to another, because it's very inefficient to reopen it over and over again. Most of the object-oriented libraries use some form of connection pooling for that. In terms of usability, it's not great, but tolerable. The much greater problem, though, is handling connection errors: long-living connections always break at some unpredictable point (a timeout or a network hiccup), which throws an exception. This exception (or rather two or three different types of exceptions) should be handled, by trying to reconnect, in all functions that use our client. And the context manager wouldn't help here: this is one of the cases where it really is no match for the macro-based counterparts that it tries to mimic. Those connection errors also break the execution of the current function, and it's often not trivial to restart it. With connection pooling it's even worse, because a bad implementation (and I've seen one) will return broken connections to the pool, and they will have an action-at-a-distance effect on other parts of the code that happen to acquire them. So, in practice, connection pooling, which may seem like a neat idea, turns out to be a can of worms. This is one of the cases of excessive coupling that often arises in object-oriented languages. (This is not to say that connection pooling can't be useful in some cases: it can, but it should be restricted only to those.)

In Lisp there are elegant ways to solve all these problems: as usual, it's my favourite Lisp tool, special variables, combined with macros and the condition system. A couple of times I've seen the Lisp condition system criticised on the grounds that its restart facility isn't actually used in practice. Well, it may not be used extensively, but when it's needed, it becomes really indispensable, and this is one of those cases.

The solution is to introduce the notion of a current connection — a *connection* special variable. All Redis commands operate on it, so they don't have to pass the connection around. At the same time, different macros can be created as proper context-managers to alter the state of connection and react to its state changes via condition handlers and restarts.
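A minimal sketch of the "current connection" idea follows. It is an illustration, not the library's code: the conn structure, with-connection-sketch and current-port are hypothetical names, and no socket is opened. It shows how a macro can rebind the special variable for the dynamic extent of its body and clean up afterwards, while the functions called inside never receive the connection as an argument.

```lisp
(defvar *connection* nil
  "The current connection; commands read it implicitly.")

;; Hypothetical stand-in for a real socket-backed connection object.
(defstruct conn host port (open-p t))

(defmacro with-connection-sketch ((&key (host "127.0.0.1") (port 6379))
                                  &body body)
  "Bind *CONNECTION* to a fresh connection around BODY and close it on exit."
  `(let ((*connection* (make-conn :host ,host :port ,port)))
     (unwind-protect (progn ,@body)
       ;; Runs on normal return and on non-local exit alike.
       (setf (conn-open-p *connection*) nil))))

(defun current-port ()
  "A 'command' that uses the current connection without being passed it."
  (conn-port *connection*))
```

Inside (with-connection-sketch (:port 3333) ...) a call to (current-port) sees port 3333; once the form exits, the outer binding of *connection* is visible again, and nested uses simply shadow each other.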

So if you just work with Redis from the console, as I often do (it's more pleasant than working with the native client), you simply (connect) and issue the commands as is. A one-time job that is run inside a function can use with-connection, which is the analogue of a Python context manager. The with-pipelining macro will delay reading of Redis replies until all commands in its body are sent to the server, to save time. Such a trick isn't something special: although a Java library needs to create a special object that handles this strategy, in Ruby it is done simply with blocks (like: node.pipelined{ data.each{ |key| node.incr(key) }}).

But what can't be done gracefully in these languages is handling a connection hiccup. In the latest cl-redis a macro with-persistent-connection is responsible for handling such situations:

(defmacro with-persistent-connection ((&key (host #(127 0 0 1))
                                            (port 6379))
                                      &body body)
  `(with-connection (:host ,host :port ,port)
     (handler-bind ((redis-connection-error
                     (lambda (e)
                       (declare (ignore e))
                       (warn "Reconnecting to Redis.")
                       (invoke-restart :reconnect))))
       ,@body)))

It doesn't work on its own and requires some support from the command-definition code, though very little: just one line, wrapping the code of the command in (with-reconnect-restart ...). This is another macro, which intercepts all the possible failure conditions and adds a :reconnect restart to them; the restart tries to re-establish the connection once and then retries the body of the command. It's somewhat similar to retrying a failed transaction in a database system. So, for instance, all the side effects of the command are performed twice in such a scenario. But it's a necessary evil if we want to support long-running operation on the server side.

The Lisp condition system separates the error-handling code into 3 distinct phases: signalling a condition, handling it, and restarting the control flow. This is its unique difference from the mainstream systems found in Java or Python, where the second and third parts are colocated. It may seem unnecessary until it's necessary and there's no other way to achieve the desired properties. Consider the case of pipelining: unlike an ordinary command, sent in solitude, a pipelined command is actually a part of a larger batch, so restarting just a single command after reconnection will not return the expected results for the whole batch. So the whole body of with-pipelining should be restarted. Thanks to this separation it is possible. The trick is to check in the condition handler code whether we're in a pipeline, and not react in such a case: the reaction will be performed outside of the pipeline.
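The interplay of handler and restart can be tried out without a Redis server. The sketch below is a simplified model under assumptions, not the cl-redis code: connection-error-sketch, with-reconnect-restart-sketch, flaky-incr and call-persistently are hypothetical names, and the "connection" is a counter of simulated failures. The command side only offers a :reconnect restart; the decision to use it is made separately, higher up the call stack.

```lisp
(define-condition connection-error-sketch (error) ())

(defvar *failures-left* 1 "How many times the fake command will still fail.")
(defvar *reconnects* 0 "How many times RECONNECT has been called.")

(defun reconnect ()
  "Stand-in for re-establishing the connection."
  (incf *reconnects*))

(defmacro with-reconnect-restart-sketch (&body body)
  "Run BODY; if a handler invokes the :RECONNECT restart, reconnect once
and run BODY a second time."
  (let ((thunk (gensym "THUNK")))
    `(let ((,thunk (lambda () ,@body)))
       (restart-case (funcall ,thunk)
         (:reconnect ()
           :report "Reconnect once and retry."
           (reconnect)
           (funcall ,thunk))))))

(defun flaky-incr ()
  "A fake command: fails while *FAILURES-LEFT* is positive, then returns :OK."
  (with-reconnect-restart-sketch
    (if (plusp *failures-left*)
        (progn (decf *failures-left*)
               (error 'connection-error-sketch))
        :ok)))

(defun call-persistently (thunk)
  "Call THUNK; on a connection error choose the :RECONNECT restart."
  (handler-bind ((connection-error-sketch
                   (lambda (e)
                     (declare (ignore e))
                     (invoke-restart :reconnect))))
    (funcall thunk)))
```

Here (call-persistently #'flaky-incr) survives one simulated failure: the error is signalled inside the command, the handler established by the caller picks the restart, and the retry happens back inside the command, without the caller re-running anything itself.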

And here's the whole with-pipelining macro implementation. Did I mention, that the pipelined context is also managed with a special variable (even 2 in this case)?.. ;)

(defmacro with-pipelining (&body body)
  `(if *pipelined*
       (progn
         (warn "Already in a pipeline.")
         ,@body)
       (with-reconnect-restart
         (let (*pipeline*)
           (let ((*pipelined* t))
             ,@body)
           (mapcar #'expect (reverse *pipeline*))))))

In total, there's a proper separation of concerns: the package system ensures namespacing, the commands are just functions, which operate on the current connection, and connection-handling logic is completely separate from these functions.

(let ((redis:*echo-p* t))
  (redis:with-persistent-connection (:port 10000)
    (loop (process-job))))

(defun process-job ()
  (red:blpop "queue")  ;; block until job arrives
  (let* ((id (red:incr "id"))
         (results (do-something-involved))
         (result-key (fmt "result~A" id)))
    (redis:with-pipelining
      (dolist (result results)
        (red:lpush result-key result))
      (red:lpush "results" result-key))))

In this code we can independently and dynamically toggle debugging, use a persistent connection, and apply pipelining to some of the commands, while the commands themselves carry the bare minimum of information necessary for their operation.

A Note on Testing

In general I'm not a fan of TDD and similar approaches that put testing at the head of the design process, and I prefer a much less disciplined REPL-driven development. :) Yet this is one of the classical examples where testing really has its benefits: we have a clear spec to implement and there's an external way to test the implementation. Developing a comprehensive test suite covering all commands (except a couple of maintenance ones that just can't be easily tested) really sped up the whole process and made it much less error-prone. Literally every time I updated the library and added new tests, I saw some of the tests failing! In total, the test suite has accumulated around 600 cases for those 140 commands, and yet I didn't come up with a way to test complicated failure conditions on the wire, for which I had to resort to the REPL.

Afterword

In this article I wanted to showcase the benefits of combining some of Common Lisp's unique approaches to managing complexity and state: special variables that have a context attached to them, macros, generic functions and the condition-handling protocol. They provide several powerful non-mainstream ways to achieve the desired level of separation of concerns in the design of complex systems. In Lisp you should really keep them in mind and not constrain your solution to recreating common patterns from other languages.

Finally, I would like to acknowledge Kent Pitman and his essay "Condition Handling in the Lisp Language Family", which is one of the most profound articles on the topic of protocols and separation of concerns. It is highly recommended. I think the example of cl-redis is a clear manifestation of one trait of the CL condition system: you don't need it until one day you do, and when you do, there's actually no other way to deal with the problem.

2012-10-25

Lisp Hackers: Slava Akhmechet

Slava Akhmechet published several enlightening essays at defmacro.org, one of which, The Nature of Lisp, I often recommend to people interested in learning about Lisp. He also created Weblocks, a continuation-based Lisp web framework, backed by the delimited continuations library cl-cont. Other than that, he is a co-founder of the startup company RethinkDB, about which he tells a bit in the interview.

Tell us something interesting about yourself.: For a long time I thought that human achievement is all about science and technology. In the past few years I realized how misled I was. Hamlet is as important an achievement as discovering penicillin. I wish I'd figured out earlier that science, for all its usefulness, is very limiting if one adopts it as an article of faith.
What's your job? Tell us about your company.: I'm a founder at RethinkDB. We spent three years building a distributed database system that we're about to open source and release in the next two weeks. The system allows people to easily create clusters of machines, partition data in a click of a button, and run advanced, massively parallelized, distributed queries using a very comfortable query language we've designed. The product is really delightful to use — we were just playing with it today to analyze census data for the upcoming presidential election in the U.S. and using it to play with the data is a real joy. I'm very proud of what we've done here — I hope it will make lots of people's jobs easier and let them do things they couldn't have done before.

My job here is to do the most important thing at any given time. Sometimes it means fixing bugs, sometimes it means demoing the product to customers, and sometimes it means driving to buy supplies so our developers can get their jobs done.
Do you use Lisp at work? If yes, how have you made it happen? If not, why?: We don't use Lisp, but much of our software is built on ideas borrowed from Lisp. We don't use it because we needed low-level control: most of the code is written in C++, even with some bits of assembly. But we've borrowed an enormous number of ideas from Lisp. In fact, if we weren't Lispers, we would have built a very different (and I think significantly inferior) product.
What brought you to Lisp? What holds you?: A guy named bishop_pass on gamedev.net forums about fifteen years ago. He was a really good advocate and I respected his opinions because of other subjects, so I decided to check Lisp out. I enjoyed it immensely, and spent years hacking in it. Today the only Lisp I still use is Emacs Lisp. I honestly don't know if I'll program in Lisp again (other than for fun, of course), but the ideas behind it will be with me forever.
What's the most exciting use of Lisp you had?: I built cl-cont — a macro that converts Lisp code to continuation passing style. I honestly think I learned more about programming from that experience than from anything else I've done before or after.
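For readers unfamiliar with the term, here is a minimal hand-written sketch of what continuation-passing style looks like, in Python rather than Lisp. It is not cl-cont's output or API, and the function names (`square_cps`, `add_cps`, `pythagoras_cps`) are made up for illustration. The rewriting is done by hand here, whereas a macro like the one described above does it automatically.

```python
# Direct style: each call returns its result to the caller.
def pythagoras(a, b):
    return a * a + b * b

# Continuation-passing style: every function takes an extra argument k
# (the continuation) and hands its result to k instead of returning it.
def square_cps(x, k):
    return k(x * x)

def add_cps(x, y, k):
    return k(x + y)

def pythagoras_cps(a, b, k):
    # "What happens next" is spelled out as nested callbacks.
    return square_cps(a, lambda a2:
           square_cps(b, lambda b2:
           add_cps(a2, b2, k)))

if __name__ == "__main__":
    pythagoras_cps(3, 4, print)  # hands the same value as pythagoras(3, 4) to print
```

Once "the rest of the computation" is an ordinary function like `k`, it can be stored and invoked later, which is the property continuation-based frameworks rely on.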
What do you dislike the most about Lisp?: Probably the arrogance of the community that surrounds it. Knowing Lisp certainly doesn't make one a better person, nor even necessarily a better programmer.
Among the software projects you've participated in what's your favorite?: Definitely RethinkDB. We took a really complex subject (real-time distributed systems) and made them extremely accessible and super-easy to use. I love the product both because we made the user experience a joy, and because of the really advanced technology that goes inside to make that happen (from low-level assembly hacks, all the way up to abstract mathematics).
If you had all the time in the world for a Lisp project, what would it be?: I'd want to build my own Lisp dialect. I know, I know, it's been done to death, there is no need to do it, and it only hurts the community, but in the presence of infinite time, it's just too much fun not to do.
Describe your workflow, give some productivity tips to fellow programmers.: The most important thing I learned on productivity is this Alan Kay quote: "Perspective is worth 80 IQ points." You could be the most productive person in the world, but it won't make the slightest bit of difference if you're pointing your talents in a direction that isn't useful to other people. If you're talented, your gift is precious and your time is limited. Learn how to direct your talents; it will be the most important thing you do.
You're currently a co-founder of a startup company RethinkDB, which went through YCombinator. As an insider of the startup ecosystem, in your opinion, what are the areas for Lisp use in startups nowadays with the biggest potential upside and why?: This isn't a popular stance in the Lisp community, but I think that today Lisp is mostly valuable as an education tool, as a means of thinking, and as an engine of ideas. It's very important for that. But as far as practical use goes, there are better options today.

2012-10-19

Lisp Hackers: François-René (Faré) Rideau

François-René Rideau works at ITA Software, one of the largest employers of Lispers, which was acquired by Google a year ago. While at ITA he stepped up to support and improve ASDF, the system definition facility that is at the core of Lisp package distribution. He's also the co-author of the recently published Google Common Lisp Style Guide, which also originated at ITA.

He's also an active writer, of both code and prose. His thoughts and articles can be found on Twitter, Facebook, Google+, Livejournal, and his site.

Tell us something interesting about yourself.

I like introducing myself as a cybernetician: someone interested in the dynamic structure of human activities in general.

Programming languages and their semantics, operating systems and reflection, persistence of data and evolution of code, the relation between how programmers are organized and what code they produce — these are my topics of immediate professional interest. For what that means, see for instance my slides (improved) from ILC'09: "Better Stories, Better Languages" or my essay "From Creationism to Evolutionism in Computer Programming".

However, I'm also interested in cybernetics as it applies to Civilization in general, past, present and future. See for instance my essay "Identity, Immunity, Law and Aggression on the Rapacious Hardscrapple Frontier" or my writings about Individual Liberty and the basic principles of Economics.

Last but not least, I was recently married to my love Rebecca Kellogg, with whom I have since had a daughter Guinevere Lý "Véra" Kellogg Rideau (born last May). This gives me less free time, yet somehow made me more productive.

What's your job? Tell us about your company.

For the last 7 years or so, I have been working at ITA Software, now part of Google Travel. I have been working on two servers written in Lisp, at first briefly on QPX the low (air)fare search engine behind Orbitz and Google Flights then mostly on QRes, a reservation system now launched with Cape Air. These projects nowadays each count about half a million lines of Common Lisp code (though written in very different styles), and each keep growing with tens of active developers.

I suspect that my login "fare" (at itasoftware) was a pun that played in favor of recruiting me at ITA; however, it wasn't available after the Google acquisition, so now I'm "tunes" (at google), to remind myself of my TUNES project.

At ITA, I have been working mostly on infrastructure:

how to use better compilers (moving from CMUCL to SBCL, CCL),
how to build, run and test our software,
how to maintain the free software libraries we use and sometimes write,
how to connect QRes to QPX and follow the evolution of its service,
how to persist objects to a robust database,
how to migrate data from legacy systems,
how to upgrade our software while it's running, etc.

And debugging all of the above and more, touching many parts of the application itself along the way.

I think of my job at ITA so far as that of a plumber: On good days, I design better piping systems. On bad days, I don gloves and put my hands down the pipes to scrub.

Since you're mentioning me as working at ITA and on ASDF, I suppose it is appropriate for me to tell that story in full.

In building our code at ITA, we had grown weary of ASDF as we had accumulated plenty of overrides and workarounds to its unsatisfactory behavior. Don't get me wrong: ASDF was a massive improvement over what existed before (i.e. mk-defsystem), making it possible to build and share Common Lisp software without massive headaches in configuring each and every library. We have to be grateful to Dan Barlow indeed for creating ASDF. But the Common Lisp ecosystem was dysfunctional in a way that prevented much needed further improvements to ASDF. And so I started working on a replacement, XCVB.

Now, at some point in late 2009, I wrote a rant explaining why ASDF could not be saved: "Software Irresponsibility". The point was that even though newer versions of ASDF were written that slowly addressed some issues, every implementation stuck to its own version with its own compatibility fixes; no vendor was interested in upgrading until their users would demand upgrades, and users wouldn't rely on new features and bug fixes until all vendors upgraded, instead caring a lot about bug-compatibility, in a vicious circle of what I call "Software Irresponsibility", with no one in charge, consensus required for any change, no possible way to reach consensus, and everyone discouraged.

However, I found a small flaw in my condemnation of ASDF as unsalvageable: if, which was not the case then, it were possible to upgrade ASDF from whichever version a vendor had installed to whichever newer version you cared for, then ASDF could be saved. Users would be able to rely on new features and bug fixes even when vendors didn't upgrade, and vendors would have an incentive to upgrade, not to stay behind, even if their users didn't directly demand it. The incentive structure would be reversed. Shortly after I wrote this rant, the current ASDF maintainer stepped down. After what I wrote, I felt like the honest thing to do was to step forward. Thus, I started making ASDF self-upgradable, then massively improved it, notably making it more robust, portable, and easy to configure — yet fully backwards compatible. I published it as ASDF 2 in 2010, with the help of many hackers, most notably Robert Goldman, and it has quickly been adopted by all active Common Lisp vendors.

You can read about ASDF and ASDF 2 in the article I wrote with Robert Goldman for ILC 2010: "Evolving ASDF: More Cooperation, Less Coordination". I'm also preparing a talk at ILC 2012 where I'll discuss recent enhancements. I have to admit I didn't actually understand the fine design of ASDF until I had to explain it in that paper, thanks to the systematic prodding of Robert Goldman. Clearly explaining what you're doing is something I heartily recommend to anyone who's writing software, possibly as a required step before you declare your software complete; it really forces you to get the concepts straight, the API clean, and the tests passing. That also did it for me with my more recent lisp-interface-library, on which I'm presenting a paper at ILC 2012: "LIL: CLOS reaches higher-order, sheds identity, and has a transformative experience".

One double downside of ASDF 2 is that it both took a lot of resources I didn't put in XCVB, and made for a much better system for XCVB to try to disrupt. It isn't as easy anymore to be ten times better than ASDF. I still hope to complete XCVB some day and make it good enough to fully replace ASDF on all Common Lisp platforms; but the goal has been pushed back significantly.

Now one important point that I want to explicitly stress is that the problem with ASDF was not a strictly technical issue (though there were many technical issues to fix), nor was it strictly a social issue; it was an issue at the interface between the social and the technical spheres, one of how our infrastructures and our incentives shape each other, and what kind of change can bring improvement. That's the kind of issues that interest me. That's why I call myself a cybernetician.

Do you use Lisp at work? If yes, how have you made it happen? If not, why?

I've made it happen by selection. I applied at ITA Software precisely because I knew (thanks to the Carl de Marcken article published by Paul Graham), that the company was using Lisp to create real-world software. And that's what I wanted to do: create real-world software with a language I could use without wanting to kill myself every night because it is turning me into a pattern-expanding machine rather than a human involved in thinking and using macros as appropriate.

"I object to doing things that computers can do." — Olin Shivers

Yet, in my tasks as a plumber, I have still spent way too much time writing shell scripts or Makefiles; though these languages possess some reflection including eval, their glaring misdesign only lets you go so far and scale so much until programs become totally unmanageable. That's what pushed me over the years to develop various bits of infrastructure to do as much of these things as possible in Lisp instead: cl-launch, command-line-arguments, philip-jose, xcvb, asdf, inferior-shell.

Interestingly, the first and the last, cl-launch and inferior-shell, are kind of dual: cl-launch abstracts over the many Lisp and shell implementations so you can invoke Lisp code from the Unix shell; it is a polyglot lisp and shell program that can manipulate itself and combine parts of itself with user-specified Lisp code to produce an executable shell script or a dumped binary image; I sometimes think of it as an exercise in "useful quining". inferior-shell abstracts over the many Lisp and shell implementations so you can invoke Unix shell utilities from any Lisp implementation, remotely if needs be (through ssh), and with much nicer string interpolation than any shell can ever provide; it is a classic Lisp library notably available through Quicklisp. With the two of them, I have enough Unix integration that I don't need to write shell scripts anymore. Instead, I interactively develop Lisp code at the SLIME REPL, and have a shell-runnable program in the end. That tremendously improved my quality of life in many situations involving system administration and server maintenance.

What brought you to Lisp? What holds you?

My first introduction to Lisp was in high school, using the HP RPL on my trusty old HP 28C (eventually upgraded to an HP 28S, with 32KB of free RAM instead of 4KB!). When I became a student at Ecole Normale Supérieure, I was taught Caml-light by xleroy himself, I learned to use Emacs, and I met Juliusz Chroboczek who introduced me to Scheme and Common Lisp, continuations and SICP. Finally, during my vain efforts to gather a team to develop an operating system based on a higher-level language as part of the TUNES project, I was introduced to Lisp machines and plenty of other interesting concepts.

I use Lisp because I couldn't bear to program without higher-order functions, syntactic abstraction and runtime reflection. Of all Lisp dialects, I use Common Lisp mainly because that's what we use at work; of course a large reason why we use it at work is because it's a good language for practical work. However, frankly, if I were to leave ITA (by Google), I'd probably stop using Common Lisp and instead use Racket or Maru, or maybe Factor or Slate, and try to bootstrap something to my taste from there.

What's the most exciting use of Lisp you had?

I remember being quite exhilarated when I first ran the philip-jose farmer: it was a server quickly thrown together by building green-threads on top of arnesi's (delimited) continuation library for CL. With it, I could farm out computations over a hundred servers, bringing our data migration process from "way too slow" (weeks) to "within spec" (a few hours). It's impressive how much you can do in Lisp and with how little code!

While I released the code in philip-jose, it was never well-documented or made user-friendly, and I suspect no one ever used it for real. This unhappily includes ITA, for my code never made it to production: I was moved to another team, our customer went bankrupt, and the new team used simpler tools in the end, as our actual launch customer was 1/50 the size of our first prospect.
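The idea of green threads built on continuations can be sketched in a few lines. The following is not the philip-jose code and does not use arnesi; it is a Python analogy in which generators stand in for continuations, and the names `worker` and `run_round_robin` are hypothetical. Each task runs until it yields, at which point a tiny scheduler switches to the next one.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # suspension point; the generator object holds the rest of the computation

def run_round_robin(tasks):
    """Run generator-based tasks on one OS thread, switching at each yield."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; drop it
        queue.append(task)      # otherwise reschedule it at the back

if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

In a real farmer, the suspension points would presumably be network waits rather than bare yields, so that one thread can keep many remote jobs in flight at once.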

What do you most dislike about Lisp?

For Lisp in general, I would say the lack of good ways to express restrictions on code and data. Racket has been doing great work with Typed Racket and Contracts; but I'm still hoping for some dialect with good resource management based on Linear Logic, and some user-extensible mechanism to define types and take advantage of them.

For Common Lisp in particular, though I do miss delimited continuations, I would say that its main issue is its lack of modularity. The package system is at the same time low-level and inexpressive; its defsystem facilities are also lacking, ASDF 2 notwithstanding; up until the recent success of Zach Beane's Quicklisp, there wasn't a good story to find and distribute software, and even now it's still behind what other languages have. This is part of a vicious circle where the language attracts and keeps a community of developers who live happily in a context where sharing and reusing code is relatively expensive (compared to other languages). But things are getting better, and I have to congratulate Zach Beane once again for Quicklisp. I believe I'm doing my small part.

Among software projects you've participated in what's your favorite?

I unhappily do not have a great history of success in software projects that I have actively participated in.

However, I have been impressed by many vastly successful projects in which I had but a modest participation. In the Linux kernel, the Caml community, the Racket community (QPX and QRes at work might also qualify but only to lesser degrees), there were bright people unified by a common language, by which I mean not merely the underlying computer programming language, but a vision of things to come and a common approach to concepts: not just architecture but architectonics. Another important point in these successful projects was Software Responsibility (as contrasted to the previously discussed Software Irresponsibility): there is always someone in charge of accepting or rejecting patches to any part of the system. Patches don't linger forever unapplied yet unrejected, so the software goes forward and the rewarded contributors come back with more and/or better patches. Finally, tests. Lots of them. Automatically run. All the time. Proofs can do, too, though they are usually more expensive (now if you are going to do testing at the impressive scale of sqlite, maybe you should do proofs instead; see CPDT). I discovered, the hard way, that tests (or proofs) are the essential complement to programs, without which your programs WILL break as you modify them.

If you had all the time in the world for a Lisp project, what would it be?

I would resurrect TUNES based on a Linear Lisp, itself bootstrapped from Racket and/or Maru.

Describe your workflow, give some productivity tips to fellow programmers.

First, think hard and build an abstract model of what you're doing. Guided by this understanding of where you're going, code bottom up, write tests as you do, and run them interactively at the SLIME REPL; make sure what you write is working and passing all tests at all times. Update your abstract model as it gets pummeled into shape by experience. Once you've got the code manually written once or twice and detect a pattern, refactor it using macros to automate away the drudge so the third time is a piece of cake. Don't try to write the macro until you've written the code manually and fully debugged it. Never bother with low-level optimization until the very end; but bother about high-level optimization early enough, by making sure you choose proper data structures.

Unhappily, I have to admit I am a serial under-achiever. I enjoy thinking about the big picture, and I like to believe I often see it better and further than most people; but I have the greatest trouble staying on track to bring about solutions: I have so many projects, and only one life to maybe complete a few of them! The only way I can actually get a few things done, is to decompose solutions into small enough steps such that I can keep focused on the next one and get it done before the focus goes away.

A year ago Google bought ITA, which was, probably, the largest Lisp company recently. What were the biggest upsides and drawbacks of using Lisp on the scale of ITA? Does Lisp have a future inside Google?

On the upside, we certainly have been able to write quite advanced software that we might not have otherwise managed. A million lines of Lisp code, including its fair share of macros and DSLs, would be so many more million lines of code without the syntactic abstraction made possible by Lisp. However hard and expensive it was with Lisp, I can only imagine how many times worse it would have been with anything else.

At the top of the tech bubble in 2008, we had over fifty Lisp programmers working just on QRes, gathered at an exponential rate over 3 years. That's a lot. We didn't yet have good common standards (Jeremy Brown started one, later edited by Dan Weinreb; I recently took it over, expanded it, merged it into the existing beginning of a Google Common Lisp Style Guide and published it), and it was sometimes hard to follow what another hacker wrote, particularly if the author was a recently hired three-comma programmer. But with or without standards, our real, major, problem was with lack of appropriate management.

We were too many hackers to run without management, and none of our main programmers were interested in becoming managers; instead managers were parachuted from above, and some of them were pretty bad: the worst amongst them immediately behaved like empire-building bullies. These bad managers were trying to control us with impossibly short, arbitrary deadlines; not only did it cause overall bad quality code and morale burnout, the renewing of such deadlines quarter after quarter was an impediment to any long-term architectural consideration for years. What is even worse, the organization as set up had a lot of inherent bad incentives and created a lot of conflicts, so that even passable managers would create damage, and otherwise good engineers were pitted against each other on two sides of absurd interfaces, each team developing a lot of scar tissue around these interfaces to isolate itself from the other teams. Finally, I could witness how disruptive a single bad apple can be when empowered by bad management rather than promptly fired.

I have had a lot of losing fights with QRes management at a time when, hanging on an H1B visa, I was too much of a coward to quit. Eventually, the bad people left, one by one, leaving behind a dysfunctional organization; and great as the people that manned it may have been, none was able or willing to fix the organization. Then finally, Google acquired us. There's a reason why, of two companies founded at about the same time, one buys the other and not the other way around: one grew faster because it got some essential things right that the other didn't. Google, imperfect as it necessarily is, gets those essential things right. It cares about the long term. It builds things to scale. It has a sensible organization. It has a bottom-up culture. So far, things have only improved within QRes. Launching was also good in many ways. It makes us and keeps us real.

Lisp can be something of a magic tool to solve the hardest technical issues; unhappily it doesn't even start to address the social issues. We wasted a whole lot of talent due to these social issues, and I believe that in an indirect way, this is related to the lack of modularity in Common Lisp, as it fostered a culture of loners unprepared to take on these social issues.

So I'm not telling you this story just to vent my past frustration. There too I have a cybernetic message to pass on: incentives matter, and technical infrastructure as well as social institutions shape those incentives and are shaped by them.

As for the future of Lisp at Google, those million lines of Common Lisp code ain't gonna rewrite themselves into C++, Java, Python, Go, or even DART. I don't think the obvious suggestions that we should rewrite it were ever taken seriously. It probably wouldn't help with turnover either. But maybe, if it keeps growing large enough, that pile of code will eventually achieve sentience and rewrite itself indeed. Either that, or it will commit suicide upon realizing the horror.

Anything else I forgot to ask?

Ponies.

2012-10-08

Lisp Hackers: Daniel Barlow

Daniel Barlow was one of the most active contributors to the open source Lisp ecosystem when its development took off in the early 2000s. Together with Christophe Rhodes he was the first to join SBCL hacking, after the project was started by William Newman. He had also created a lot of early Lisp web tools, like the Araneida HTTP application server, and built on it the first version of cliki.net, which served the Lisp community for almost 10 years (the second version went live earlier in 2012). Studying the Cliki source was a kind of zen experience for me, as it did so much, yet in a very simple way.

But his largest contribution is probably ASDF, about which, as with Cliki, opinions among Lisp programmers are divided. Dan explains his attitude in the interview.

In the mid 2000s his involvement with open-source Lisp gradually diminished, as he stopped working as a consultant and got a full-time job. Yet he remains fondly remembered in the community.

Tell us something interesting about yourself.: I don't do interesting. Um, improvise. Hacker, skater, cyclist, husband, father to a seven-month-old son as demanding as he is adorable, computing retro-grouch who uses Linux on the desktop.

I have a metal plate and some pins in my right forearm where I broke it a couple of months ago, inline skating in the Le Mans 24 hour relay event. I have now regained more or less complete range of motion in that hand and can advise anyone doing the event next year that the carpet in the pit boxes is unexpectedly treacherous when it's been waterlogged by a sudden thunderstorm.

Still, we came fifth in category, which makes me very happy.
What's your job? Tell us about your company.: I started a new job about three months ago, in fact. I'm now working at Simply Business in London, busily disrupting the business insurance market. Which is to say, writing web apps for the online sale of business insurance.

It's not quite as buzzwordy as it sounds, actually. Business insurance is traditionally sold by brokers, who are humans and therefore, although really good at dealing with complex cases and large contracts where the personal touch is required, tend to be a trifle expensive for straightforward policies which could be much more economically sold online. The industry is ripe for disintermediation.
What brought you to Lisp?: A combination of factors around the time I was at university: the UNIX-Haters Handbook, which I bought to sneer at and ended up agreeing with; Caml Light, which I used in my fourth year project; Perl - specifically my horror to learn that it flattens (1,2,3,(4,5,6)) to (1,2,3,4,5,6) - yes, I know about references, but I think it's a bug not a feature that the sensible syntax is reserved for the silly behaviour - and meeting some people from Harlequin (as was) at a careers fair.

It took me another couple of years or so to find CMUCL - in fact, I think it was another couple of years or so before it was ported to the x86 architecture, so it wouldn't have done me much good if I had known about it earlier - but looking back I suppose that was where the rot set in.
Do you use Lisp at work? If yes, how have you made it happen? If not, why?: No, we're primarily a Ruby shop, with a sideline in a large legacy Java app which we're working to replace. I think there've been a couple of uses of Clojure in hack days, but that's as far as it goes.

Why not? Well, apart from the point that I've only been there since July ... It's The Ecosystem, I suppose. Ruby as a language has pretty good support for OO paradigms and a whole bunch of free software libraries for doing web-related things: Ruby as a community is big on Agile and TDD and maintainable design. And it's at least possible (if not exactly easy in the current bubble) to engage a regular recruitment agent and task him with finding competent programmers who know it. I'm not saying Lisp is exactly bad at any of that, but it's at least questionable whether it's as good along all of those axes, and it's certainly not better enough to make the switch sensible.
What's the most exciting use of Lisp you had?: SBCL was probably one of the most fun projects I've ever worked on. Working with people who were mostly smarter than me or had better taste than me or both, on a project whose goal was to make it possible for people less smart than me to hack on a Lisp system. And context switching between the CL with all its high level features and (e.g.) Alpha assembly was a real kick - it's a bit like I imagine building a Lisp machine would be, except that the goal is achievable and the result is generally useful.
What do you dislike the most about Lisp?: I don't really use it enough any more to react to that with the required levels of venom. I should probably say ASDF, everyone else does :-)

I guess if you force me to an answer, it'd have to be its disdain for the platform it lives on - take, for example, CL pathname case conversion rules. Whoever decided that Unix systems could reasonably be said to have a "customary case" had, in my view, not looked very hard at it.
As far as I can tell, you're currently mostly doing work in Ruby. What are the pros and cons of Ruby development compared to Lisp?: The transpose-sexps function in Emacs does nothing useful in Ruby mode. rails console is a poor substitute for a proper toplevel. Backtraces don't show the values of parameters and local variables. (Yes, pry helps a lot). And the garbage collector (in MRI, anyway) is sucky to the point that even 1990s Java GC could probably beat it in a fair fight.

On the other hand, libraries.

Here's an interesting thought experiment, though: there's a clear difference between the Lisp workflow where you change the state of your image interactively to get the code into working shape very quickly (and then later try to remember what it was you did) and the more scripted approach of test-driven development in Ruby where you put everything (code, test setup, assertions) in files that you reload from disk on each run. How would you meld the two to get repeatable and fast iterations? A lot of people are doing things like Spork (which forks your application for each test it runs, throwing the child state away after the test has run) but they never seem to me to be more than 80% solutions. My intuition is that you'd want to stick to a much more functional design and just make state a non-problem.
Among software projects you've participated in what's your favorite?: SBCL was a lot of fun, as I said earlier. ASDF is a candidate too, just because it must so obviously fill a need if people are still cursing it - as they seem to be - ten years later :-)
Describe your workflow, give some productivity tips to fellow programmers.: It's taken me the best part of three months to get this interview back to Vsevolod, I'm the last person anyone should be asking about workflow or productivity. :-)

Um. I've been doing a lot of TDD lately. Given that as recently as two years ago I was castigating it as a religion, this might be seen as a capitulation or as a conversion, or at least as a change of mind. What can I say? Actually, pretty much now what I said then: the value of TDD is in the forces it exerts on your design (towards modularity, functional purity, decoupling, all those good things), not so much in the actual test suite you end up with. Process not product. These days everyone thinks that's obvious, but back then it was either less widely known or less explicitly stated or else I was just reading the wrong blogs.

(Of course, the tendency in Lisp to write code interactively that can be tested ad hoc at the repl probably has a very similar effect on coupling and functional style. My personal experience is that TDD doesn't seem to be nearly as valuable in repl-oriented languages, but YMMV)

More generally: go home, do some exercise, get some sleep. Sleep is way underrated.
Anything else I forgot to ask?: Some day I will write an apology for ASDF.

Pedants will note that the word "apology" not only means "an expression of remorse or regret" but also "a formal justification or defence", and may infer from that and my general unwillingness to ever admit I was wrong that I'm not about to actually say I did a bad thing in writing it. Seriously, go find a copy of MK-DEFSYSTEM and try porting it to a Lisp implementation it doesn't support.

In 2002 I presented a paper at the ILC (about CLiki, not ASDF) that said essentially "worse is better than nothing", and - unless the "worse" has the effect of stifling a potential better solution from coming along later - I still stand by that.

2012-08-20

How to Combine Functional Languages and the Mainstream

About a month ago I was invited to a Kyiv meetup of functional programmers to talk about my practical experience with Clojure. In the end I criticized Clojure a little, but mostly I wanted to talk about something else: where the niche for functional and other non-mainstream languages (Lisp, for example) lies, and how they can be used alongside mainstream ones. As usual, it came out rather muddled, so I will try to lay it out here more clearly and with more structure.

Here is the video:

All I wanted to say comes down to 3 simple things. There is nothing particularly new in them, but in my view that does not make them any less valuable.

1. The Unix philosophy for the web

When it comes to server-side development, the most effective approach to building scalable systems (scalable both in terms of load and in terms of the effort needed to evolve them) has been and remains the Unix way:

small pieces, loosely joined, that do one thing, but do it well

Unlike in classic Unix, though, these pieces now live on separate network nodes and interact not through pipes but through text-based network interfaces. There is a set of simple, reliable, well-scaling and, very importantly, de facto standard and language-independent tools for organizing such interaction. Different tasks call for different tools, and they include:

low-level communication mechanisms: sockets and ZeroMQ

high-level communication protocols: HTTP, SMTP, etc.

serialization formats: JSON and a dozen others

data exchange points: Redis, various MQs

This is neither REST nor SOA in pure form; rather, those schemes are somewhat specialized, and sometimes far-diverged, instances of this approach. It is no more and no less than a toolkit on top of which any network architecture can be built, from centralized to P2P.
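
To make the idea of small pieces talking over text-based network interfaces a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the word-counting "service", the one-JSON-object-per-line framing, and the names serve_once and count_words_remotely are all hypothetical and do not come from the talk. Both sides run in one process purely so the example is self-contained; in a real system they would be separate programs, possibly written in different languages.

```python
import json
import socket
import threading


def serve_once(server_sock):
    """Hypothetical tiny service: accept one connection, read one JSON line,
    answer with one JSON line containing the word count of the given text."""
    conn, _ = server_sock.accept()
    with conn:
        with conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
            request = json.loads(stream.readline())
            reply = {"word_count": len(request["text"].split())}
            stream.write(json.dumps(reply) + "\n")
            stream.flush()


def count_words_remotely(text):
    """Start the service on a free local port, send it one request over TCP
    and return the decoded JSON reply."""
    server_sock = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as client:
            with client.makefile("rw", encoding="utf-8", newline="\n") as stream:
                stream.write(json.dumps({"text": text}) + "\n")
                stream.flush()
                return json.loads(stream.readline())
    finally:
        worker.join()
        server_sock.close()


if __name__ == "__main__":
    print(count_words_remotely("small pieces loosely joined"))
```

The only contract between the two sides is the socket and the JSON line format, so either side could be replaced by a program in Lisp, Erlang or anything else that can open a socket and parse JSON.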

2. Requirements for languages

When programmers compare languages, they usually approach the task from the wrong end: from features rather than from requirements. Though working in the industry should have taught them the opposite. :) If you look at requirements instead, they fall into several groups: the requirements for languages used to solve unknown (research) problems differ substantially from those for languages used to implement long-established, well-understood tasks. This is the classic R&D dichotomy: research vs development, research vs engineering. Most of the tasks we face are engineering ones, but the most interesting tasks, both professionally and economically, are research ones. I also single out a separate group of requirements for scripting languages.

Requirements for research languages

Interactivity (minimal iteration cycle time)

Malleability and flexibility (solving the problem rather than fighting the system)

Good support for the research domain (if it is mathematics, then at least a numeric tower; if statistics, good matrix support; if trees, tools for working with trees and first-class recursion; and so on)

The ability to solve the problem in the language of the domain (note that specialized research languages are always DSLs)

Requirements for languages for solving standard tasks

Maintainability (the ability to easily hand code over from one developer to another)

A mature ecosystem of tools and good platform support

Stability and predictability (as a rule, few people enjoy bleeding on the bleeding edge, so they choose what is proven and works plainly but without particular problems)

And, sometimes most important of all, the ability to get a fast-running result (at times speed is what separates a solution that goes to production from one that does not)

Requirements for scripting languages

First and foremost, good integration with the host system (the language is optimized for the operations most frequently performed in the host system)

Simplicity and flexibility (I have yet to hear of a single statically typed scripting language :)

Minimal footprint (on the one hand, all the heavy lifting can be done on the host system's side; on the other, resources are usually limited and there is very little sense in spending them on anything unnecessary)

One could talk at great length about which requirements were paramount when particular languages were created, how languages evolve, and so on. I will limit myself to singling out a group of languages that were unambiguously created purely for research work: Matlab, Octave, Mathematica, R, Prolog and various variations on the theme. The weakness of such languages is usually that general engineering mechanisms were not built into them (above all, first-class support for interacting with other systems that live outside their "inner" world).

One of the classic approaches to research problems is the two-stage method: develop a prototype of the system in a research language, then implement the full system in an engineering language, optimizing speed and other characteristics of the solution without changing its essence. Its most significant drawback, in my view, is that it may work reasonably well for a one-off solution, but if the system has to evolve over time, everything gets considerably harder. And besides, why do needless work: it is good when a language can support both the research and the engineering paradigm.

If you lay these requirements out as Cartesian coordinates, place the languages on them and outline the most interesting region, very few languages fall into it, a mere handful. First of all Lisp, also Python, and perhaps two or three more.

3. Choosing a language for the task, not for the platform

Returning to our Unix in the cloud: the individual components of this system can be written in any language (just as on ordinary Unix) and can even run on any platform. This makes it possible to solve specific tasks with the tool that fits best. And across the whole spectrum of tasks, most are ones for which it largely does not matter which language is used. For these tasks the main selection criteria are usually the tool ecosystem and, in half the cases, the speed of the result (at least in the long run).

But not all tasks are like that: there are many areas in which mainstream languages work poorly or effectively do not work at all. A well-known aphorism on this subject is Greenspun's tenth rule. So, if you want to use Lisp, Haskell or Factor, solve with them the tasks where they give an obvious advantage, and make them full citizens of the cloud ecosystem. That way you can prove their usefulness to skeptics in practice rather than in theory, while other programmers' knowledge of them grows along with the tooling of these languages (in which they often lose to their mainstream counterparts). In this way, in the future they will get a chance to be considered as candidates for other tasks too, where their advantages are less obvious (a factor of 2-3 rather than an order of magnitude).

P.S. A few words about Clojure

In the talk I compared Erlang and Clojure a lot. Both languages emphasize the concurrent paradigm of development. But Erlang stands apart from other functional languages in this respect, because it competes with imperative languages not so much on the quality of the language for solving specific problems as in its role as a language-platform that creates a new paradigm of solving problems beyond the bounds of a single process and a single machine. Thus its main value is its role as a systems language for distributed systems.

As for Clojure, I jokingly divided all languages into fundamental and hipster ones. Fundamental languages (such as C, Lisp, Erlang, Haskell, Smalltalk) appear when groups of smart people work on hard problems for a long time and in the process create not just a language but a whole paradigm. Hipster languages, by contrast, are usually the product of one person who wants, here and now, to get the best of several languages at once: a hybrid. The examples I listed were C++ (C + classes + a pile of other things pulled in from all directions), which is perhaps the archetypal hipster language; Ruby (Perl + Smalltalk + Lisp); and JavaScript (C + Scheme), which since the arrival of V8 and node.js has moved from the category of scripting languages into general-purpose systems languages. Incidentally, in most cases languages of the second type are more successful in the short term and conquer the world. More precisely, their world dominance alternates with the dominance of languages that sit somewhere in the middle of this spectrum: when people tire of such hybrids, they are eventually "defeated" by saner alternatives. Java instead of C++, Python instead of Ruby (for the web), and we shall see what comes instead of JS. Unfortunately, Clojure is a vivid example of such a hybrid: it is an attempt to cross Lisp with Haskell, and on a Java foundation at that...