Lisp, the Universe and Everything


Pretty-Printing Trees

  (or The Ugliest Code I've Ever Written)

Over the last couple of days, I was ill and had to stay in bed, so I used the time to tidy up the work that had accumulated over the past year in cl-nlp. That was especially timely, considering the interest in using it expressed by some people I've met at recent Lisp-related events.

I've even assembled a rough checklist of the things that need to be finished to get it to v.1.0 and beyond.

Besides, after finishing the basic cleanup, I returned to one of the programming tasks I have been racking my brain over for a long time: tree pretty-printing. In NLP, we constantly have to deal with various kinds of parse trees, like constituency or dependency ones, but the problem is that they are not easily visualized. And good visualization plays, at least for me, a critical role in effective debugging, ideation, and programming. It's an essential part of the solid interactive experience that is one of the fundamental traits of Lisp development.

For instance, a constituency tree is usually presented as a Lisp list. Here's an infamous example from the Penn Treebank:

( (S 
    (NP-SBJ 
      (NP (NNP Pierre) (NNP Vinken) )
      (, ,) 
      (ADJP 
        (NP (CD 61) (NNS years) )
        (JJ old) )
      (, ,) )
    (VP (MD will) 
      (VP (VB join) 
        (NP (DT the) (NN board) )
        (PP-CLR (IN as) 
          (NP (DT a) (JJ nonexecutive) (NN director) ))
        (NP-TMP (NNP Nov.) (CD 29) )))
    (. .) ))
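Since the bracketed format is just an S-expression, reading it into a nested structure takes only a few lines in any language. Here is a minimal sketch in Python (`parse_brackets` is a hypothetical helper written for this illustration, not a part of cl-nlp, and it assumes well-formed input with balanced parentheses):

```python
def parse_brackets(text):
    """Read one Penn Treebank-style bracketed tree into nested Python lists."""
    # Pad parentheses with spaces so that a plain split() tokenizes the input.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token              # a label or a word
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1                      # skip the closing parenthesis
        return node

    return read()


tree = parse_brackets("(ADJP (NP (CD 61) (NNS years) ) (JJ old) )")
print(tree)
# ['ADJP', ['NP', ['CD', '61'], ['NNS', 'years']], ['JJ', 'old']]
```

The nested list is easy to process, but it is no easier to look at than the original brackets, which is the whole problem.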

A dependency tree has several representations, none of which is really intuitive to grasp. This is the Stanford format:

 amod(ideas-2, Colorless-0)
 amod(ideas-2, green-1)
 nsubj(sleep-3, ideas-2)
 root(sleep-3, sleep-3)
 advmod(sleep-3, furiously-4)
 punct(sleep-3, .-5)
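Each of these lines is a `label(head-index, dependent-index)` triple, so it can be picked apart with a single regular expression. A rough sketch in Python (the `parse_stanford` name and the returned tuple layout are my assumptions for the sake of the example, not an existing API):

```python
import re

# label(head-index, dependent-index), as in the lines shown above
DEP_RE = re.compile(r"([^\s(]+)\((.+)-(\d+), (.+)-(\d+)\)")


def parse_stanford(line):
    """Turn 'amod(ideas-2, green-1)' into ('amod', ('ideas', 2), ('green', 1))."""
    match = DEP_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a dependency line: {line!r}")
    label, head, head_index, dep, dep_index = match.groups()
    return label, (head, int(head_index)), (dep, int(dep_index))


print(parse_stanford(" punct(sleep-3, .-5)"))
# ('punct', ('sleep', 3), ('.', 5))
```

The index is split off at the last hyphen before the comma or the closing parenthesis, so tokens that contain hyphens themselves still come out intact.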

And here's the CoNLL one:

0 Colorless _ _ ADJ 2
1 green _ _ ADJ 2
2 ideas _ _ NOUN 3
3 sleep _ _ NOUN 3
4 furiously _ _ ADV 3
5 . _ _ PUNCT 3
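Reading such rows is even simpler. The sketch below, in Python, follows the six-column layout of the listing above (index, form, two unused columns, tag, head index), where the root token points at itself. This is an assumption tied to this particular listing: real CoNLL files have more columns, number tokens from 1, and use 0 as the head of the root. The `Token`, `parse_rows`, and `to_arcs` names are hypothetical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    index: int
    form: str
    tag: str
    head: int


def parse_rows(text):
    """Parse rows of the form: index form _ _ tag head."""
    tokens = []
    for line in text.strip().splitlines():
        index, form, _, _, tag, head = line.split()
        tokens.append(Token(int(index), form, tag, int(head)))
    return tokens


def to_arcs(tokens):
    """Return (head, dependent) index pairs, skipping the self-pointing root."""
    return [(t.head, t.index) for t in tokens if t.head != t.index]


rows = """
0 Colorless _ _ ADJ 2
1 green _ _ ADJ 2
2 ideas _ _ NOUN 3
3 sleep _ _ NOUN 3
4 furiously _ _ ADV 3
5 . _ _ PUNCT 3
"""
print(to_arcs(parse_rows(rows)))
# [(2, 0), (2, 1), (3, 2), (3, 4), (3, 5)]
```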

Also, Google's Parsey McParseface offers another, presumably more visual, representation (using the asciitree lib). Still, it is not good enough, as it messes with the order of words in the sentence.

Input: Bob brought the pizza to Alice .
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct
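To see where the reordering comes from, here is a small sketch in Python that produces this kind of indented picture. It is my own reimplementation for illustration and not the asciitree API; the `(label, children)` tuple layout and the function names are assumptions. The rendering is a plain depth-first walk that prints every head before its dependents, so "Bob" ends up below "brought" and "the" below "pizza", whatever their positions in the sentence.

```python
def render(node, indent=""):
    """Return the lines for the children of node, where node = (label, children)."""
    _, children = node
    lines = []
    for i, child in enumerate(children):
        is_last = i == len(children) - 1
        lines.append(f"{indent} +-- {child[0]}")
        # Keep a vertical bar going while more siblings follow below.
        lines.extend(render(child, indent + ("    " if is_last else " |  ")))
    return lines


def ascii_tree(root):
    return "\n".join([root[0]] + render(root))


tree = ("brought VBD ROOT",
        [("Bob NNP nsubj", []),
         ("pizza NN dobj", [("the DT det", [])]),
         ("to IN prep", [("Alice NNP pobj", [])]),
         (". . punct", [])])
print(ascii_tree(tree))
```

For this input, the printed lines coincide with the picture above.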

As you can see, dependency trees are not trivial to visualize (or pretty-print) in ASCII. The authors of spaCy approached this problem creatively by using CSS in their displaCy tool.

However, it seems like overkill to bring a browser along for such a small task. And it's also not very scalable.

In fact, I have always been interested in creative ways of text-based visualization. So, I thought of ways to represent parse trees in ASCII.

With constituency ones, it's rather trivial:

> (pprint-tree '(TOP (S (NP (NN <This:0 0..4>))
                        (VP (VBZ <is:1 5..7>)
                            (NP (DT <a:2 8..9>)
                                (JJ <simple:3 10..16>)
                                (NN <test:4 17..21>)))
                        (|.| <.:5 22..23>))))
  :          VP         : 
  :   .---------.       : 
 NP   :        NP       : 
  :   :   .----:-----.  : 
 NN  VBZ DT   JJ    NN  . 
  :   :   :    :     :  : 
This  is  a simple test . 

The dependencies are trickier, but I managed to find a way to show them without compromising the sentence word order:

> (pprint-deps '(<Colorless:0 0..9> <green:1 10..15> <ideas:2 16..21> <sleep:3 22..27> <furiously:4 28..37> <.:5 38..39>)
               '(amod(ideas-2, Colorless-0) amod(ideas-2, green-1) nsubj(sleep-3, ideas-2) root(sleep-3, sleep-3) advmod(sleep-3, furiously-4) punct(sleep-3, .-5)))
Colorless green     ideas      sleep     furiously . 
    ^       ^        .^         .^.          ^     ^
    :       `. amod .´:         :::          :     :
    `..... amod .....´:         :::          :     :
                      `. nsubj .´::          :     :
                                 :`. advmod .´     :
                                 :`.... punct .....´

And it looks pretty neat even for longer sentences:

We        hold these   truths to       be  self -       evident , that all      men are        created     equal , that they are        endowed      by  their    Creator    with certain unalienable Rights , that among     these are      Life         , Liberty   and the    pursuit     of     Happiness . 
 ^         .^.   ^       .^    ^       .^.   ^  ^         .^    ^   ^   ^       .^   ^           .^.         ^   ^   ^    ^   ^           .^.         ^.   ^        .^.        ^.    ^         ^        .^   ^   ^    ^.        ^   .^.        ^.         ^    ^.      ^   ^       .^.        ^.        ^     ^
 `. nsubj .´::   `. det .´:    `. aux .´::   :  `. punct .´:    :   :   `. det .´:   `. auxpass .´::         :   :   :    :   `. auxpass .´::         ::   `. poss .´::        ::    :         `. amod .´:   :   :    :`. pobj .´   :::        :`. punct .´    :`. cc .´   `. det .´::        :`. pobj .´     :
            :`... dobj ...´             ::   `. npadvmod .´:    :   :            :               ::`. advcl .´   :   :    :               :::         ::             ::        ::    `...... amod ......´:   :   :    :             :::        ::              ::                   :`. prep .´               :
            ::                          :`..... acomp .....´    :   :            `.. nsubjpass ..´::             :   :    :               :::         ::             ::        :`......... pobj .........´   :   :    :             :::        ::              :`...... conj .......´                         :
            :`......... advcl ..........´                       :   :                            ::`... punct ...´   :    :               :::         ::             :`. prep .´                             :   :    :             :::        :`.... conj ....´                                              :
            :`..................... punct ......................´   `........... mark ...........´::                 :    :               :::         :`... pobj ....´                                       :   :    :             ::`. attr .´                                                              :
            ::                                                                                    ::                 :    :               ::`. agent .´                                                      :   :    `... prep ....´:                                                                        :
            ::                                                                                    ::                 :    `.. nsubjpass ..´::                                                                :   `...... mark ......´:                                                                        :
            ::                                                                                    ::                 `....... mark .......´::                                                                :                       :                                                                        :
            ::                                                                                    ::                                       :`............................ punct .............................´                       :                                                                        :
            ::                                                                                    ::                                       :`........................................ advcl .........................................´                                                                        :
            ::                                                                                    :`................ advcl ................´                                                                                                                                                                  :
            :`...................................... ccomp .......................................´                                                                                                                                                                                                           :
            :`............................................................................................................................................ punct .............................................................................................................................................´

However, writing the visualization code was one of the most intimidating programming tasks I've ever encountered. One explanation is that trees are most naturally processed top-down in depth-first order, while the visualization requires a bottom-up BFS approach. The other may be that pixel-perfect (or, in this case, character-perfect) display is always tedious. As far as I'm concerned, this is not a sufficient explanation, but I couldn't find any other. The ugliest part of this machinery is the deps->levels function that prints the dependency relations in a layered fashion. The problem is to properly calculate the minimal space necessary to accommodate both tokens and dependency labels, and to account for the different cases when a token has outgoing dependency arcs or doesn't. In theory, it sounds pretty easy, but in practice it turned out to be a nightmare.
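To make the layering idea more concrete, here is a minimal sketch in Python of one way to decide which level (row) each arc goes on. It is not the deps->levels code from cl-nlp and it skips the hard part, the width calculation. It assumes that arcs are `(label, head, dependent)` triples over token indices, that they form a projective tree, and that the root pseudo-arc has been left out. The rule is that an arc goes one level above everything that hangs off its dependent and above the arcs already placed for the same head, shorter arcs first.

```python
def arc_levels(arcs):
    """Map each (label, head, dependent) arc to a level (1 = closest to the words)."""
    by_head = {}
    for arc in arcs:
        by_head.setdefault(arc[1], []).append(arc)
    levels = {}

    def place(head):
        # Returns the highest level used by the arcs below `head` (0 if none).
        top = 0
        # Shorter arcs first; ties go to the leftmost dependent.
        outgoing = sorted(by_head.get(head, []),
                          key=lambda a: (abs(a[1] - a[2]), a[2]))
        for arc in outgoing:
            # An arc goes above everything hanging off its dependent
            # and above the arcs already placed for the same head.
            top = 1 + max(top, place(arc[2]))
            levels[arc] = top
        return top

    dependents = {dep for _, _, dep in arcs}
    for head in sorted(by_head):
        if head not in dependents:      # the root of the tree
            place(head)
    return levels


arcs = [("nsubj", 1, 0), ("dobj", 1, 4), ("det", 4, 2),
        ("amod", 4, 3), ("punct", 1, 5)]
for arc, level in arc_levels(arcs).items():
    print(level, arc)
```

For "This is a simple test ." this puts nsubj and amod on level 1, det on level 2, dobj on level 3, and punct on level 4. Since all arcs of one head get different levels, the sketch also sidesteps the case of two arcs leaving a token in opposite directions on the same level.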

And all of this assumes projective trees (non-intersecting arcs), and the code doesn't know how to show, on one level, two arcs going from one token in opposite directions. Finally, I still couldn't align the two trees (constituency and dependency) above and below the sentence. Here's the target:

  :               VP              : 
  :         .---------.           : 
 NP         :        NP           : 
  :         :   .----:---------.  : 
 NN        VBZ DT   JJ        NN  . 
This        is  a simple     test . 
  ^         .^. ^    ^        .^  ^
  `. nsubj .´:: :    `. amod .´:  :
             :: `.... det ....´:  :
             :`..... dobj .....´  :
             :`...... punct ......´

and this is how it prints for now (one more challenge was to transfer additional offsets from dependencies into the constituency tree):

  :          VP         : 
  :   .---------.       : 
 NP   :        NP       : 
  :   :   .----:-----.  : 
 NN  VBZ DT   JJ    NN  . 
This  is  a simple test . 
  ^         .^. ^    ^        .^  ^
  `. nsubj .´:: :    `. amod .´:  :
             :: `.... det ....´:  :
             :`..... dobj .....´  :
             :`...... punct ......´
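The projectivity assumption mentioned above (non-intersecting arcs) is at least easy to check before printing. A minimal sketch in Python, assuming the arcs are given as `(label, head, dependent)` triples over token indices (the `is_projective` name is hypothetical, and the check is quadratic, which is fine for sentence-sized inputs):

```python
def is_projective(arcs):
    """True if no two arcs cross when drawn on one side of the sentence."""
    spans = sorted((min(head, dep), max(head, dep)) for _, head, dep in arcs)
    for i, (lo1, hi1) in enumerate(spans):
        for lo2, hi2 in spans[i + 1:]:
            # The second arc starts strictly inside the first one
            # and ends strictly outside of it: the two cross.
            if lo1 < lo2 < hi1 < hi2:
                return False
    return True


simple_test = [("nsubj", 1, 0), ("dobj", 1, 4), ("det", 4, 2),
               ("amod", 4, 3), ("punct", 1, 5)]
print(is_projective(simple_test))                    # True
print(is_projective([("a", 0, 2), ("b", 1, 3)]))     # False
```

Arcs that merely share a token or nest inside one another are fine; only a partial overlap counts as a crossing.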

Well, the good news is that it is usable, but it still needs more work to be feature-complete. I wonder what I was doing wrong. Maybe someone can come up with a clean and simple implementation of this functionality (in any language)? I consider it a great coding challenge, although it may require a week of your free time and a bunch of dead neurons to accomplish. But if you're willing to take it, I'd be glad to see the results... :D


(m8n)ware Open for Business

Today, I want to announce (m8n)ware (the name is an i18n-style abbreviation of "meditationware" with a mix of Lisp parens). This is a thing I always wanted to build. After parting ways with Grammarly almost a year ago, I had some time to rest and think about my next move. And I couldn't let go of this thought, so I figured: you can always go work somewhere, but you don't get a lot of stabs at realizing your own ideas. Maybe two or three in a lifetime. I had tried this once already with fin-ack almost 8 years ago, and the concept behind it was basically the same; only the implementation differed.

In theory

What is (m8n)ware? It is a company aimed at solving problems in the area of cognition-related computing, which will be built as a distributed network of mostly Lisp research-oriented engineers. This sounds rather complex, so let me try to explain a couple of points:

  • Cognition-related computing is the best term I came up with after thinking long about the area of CS that includes various tasks related to cognition, intelligence, knowledge, and associated logic. The common marketing buzzword is Artificial Intelligence, but it has a negative history and is quite misleading. All computer programs implement some form of "artificial" intelligent behavior. The defining feature of cognition-related computing is that it requires some transformation of raw data into structured computer-processable information and back, which is similar to human cognitive functions that arguably do the same transformation for our own internal processing.
  • The essence of the distributed network notion is that the primary focus of (m8n)ware is building not a localized corporate-like structure, in which people are bound primarily by legal contracts and payment obligations, but a loosely coupled group of like-minded people who share the same values, interests, and approaches to technology. This organization will be seeking a perfect middle ground between a corporation and an open-source community.
  • Research-oriented engineers is another "middle-ground" term that describes the main multidisciplinary role needed in this organization. We're not a scientific lab that is focused on fundamental research, nor are we an outsourcing shop that faithfully implements existing results according to a given spec. We're engineers in the sense that we deliver production-ready technology that may be useful to the end users in a straightforward manner. And, at the same time, we're researchers because we wield the methodology and are ready to experiment in new areas that don't have satisfactory state-of-the-art solutions.

I don't believe in the VC mantra of "build a startup, get rich, change the world." First of all, I don't believe in changing the material world (which implies a conviction that you know better). I believe in changing yourself. Also, getting rich and doing something good (for the world) are not goals that are always aligned. Moreover, they are usually in conflict. I'm not a businessman in the sense that money is not my ultimate goal. But I like to see things grow and develop, things that are bigger than myself. Thus, I'm interested not in market share but in mind share.

Considering all of the above, (m8n)ware is not going to be a product company in a traditional sense. It will be a technology company that will create and disseminate knowledge-based services and products. Also, it will not aim at rapid growth, but rather at sustainable development.

In the previous post, I listed my motivations for moving in this particular direction and explained my values. I can't say that it got overwhelmingly positive feedback, but, in general, the results were better than I expected :) Now, I have several clues on how to cross the most challenging chasm: scaling the operation from a single-person endeavor to a productive and sustainable group. Meanwhile, I was testing whether this approach may work in practice and doing market research. Now, I'm ready to go all in and devote at least the next year of my professional life to building this thing. (m8n)ware is officially live. If you're interested in cooperation as a client, partner, or co-worker, please let me know...

In practice

There are several aspects that I'm betting on in (m8n)ware that are non-mainstream and may seem somewhat counter-intuitive. In this part, I want to provide more details about why and how, I think, this will work.

The first aspect is radical transparency. From this and the previous post, it should be clear that (m8n)ware originated and plans to continue functioning fully exposed to the outside world, not relying on any secret know-how or clever tricks. I don't plan to conceal what we're going to do and why, nor to "fake it till you make it." Why will this work? First of all, my experience shows that in the current age of information overload, we're fighting primarily not for the purse but for the thoughts of our "customers" (in many possible markets: not only where you sell your products and services, but also in the labor market and in the ecosystem of potential competitors, partners, and vendors). And this requires information sharing, not safekeeping and concealment. Secondly, in general, I'm not interested in competition; I'd rather find a unique niche in the market that the company will serve in the best possible manner and that will be big enough to sustain it. The good news is that the AI market is currently growing very fast, and this trend will last for at least a couple more years. So demand is greater than supply, which means a not very harsh competitive environment. Another thing is Lisp: no one in their right mind will bet a company on Lisp, so I'm not really worried about the competition in the labor market. :) The final point about openness is that I personally endorse it, and since this company aims to be as close to my ideal as possible, it should endorse it, too.

Although it's not a classic product company, it's not going to be a typical outsourcing one either. Yes, initially, it will provide primarily consulting services, but the idea is that, with time, the share of these services will decrease in favor of supporting more general-purpose tools and technology developed in-house. And to ensure the constant priority of this goal, we'll be doing such work from day one. Currently, I see it in the following manner: the time of all engineers will be split in some proportion between for-pay consulting and developing open-source/research projects for free, and with time as some of these projects become important to the company, it will start paying the people who develop them for this work as well. This is a frugal approach, but I advocate it based on personal perspective: working at my previous gigs, I'd be eager to forfeit, say, 20% of my salary to be able to spend 20% of my time on open-source projects that matter to me personally. Actually, the percentage may be much bigger. Currently, I spend 50% of my time working on such projects and am quite happy with this. I deeply believe that such balance is more appealing to many programmers (especially, the kind of people I'd be willing to cooperate with) than a conventional approach.

Lisp again. From my experience working in the domain of cognitive problems, I can definitely say that it's not about coding, for several reasons. The obvious one is that 80% of the resources are spent on other things: thinking/learning, working with data, experiments, documentation. (The remaining 20% is still critical, especially since most of the solutions are resource-demanding and the code is algorithm-heavy.) Then, there's the current technology situation: the days of backend-only solutions are, unfortunately, gone. A lot of problems require a heavy mobile or in-browser presence. And on the backend, thanks to microservices and the like, no one is developing in a single language or even on a single platform anymore. Finally, there's knowledge transfer. Programs may not be a bad medium for expressing concepts, but they are not the optimal one either: between scientific papers, blog posts, markdown documents, experiment notebooks, and production-optimized programs, there is no one-size-fits-all solution. All this creates conditions in which the choice of a programming language becomes much less of a constraint than it was just a few years ago. On the other hand, from the point of view of "internal" productivity (not concerned with integration into the bigger picture), Lisp has proven to be a great and rewarding environment, very well suited for research work. Plus, it's a great way to differentiate in the labor market... :)

Our value proposition

So, if you need to solve some cognitive computing problems, here's what (m8n)ware may offer.

  1. Provide small-scale consulting services: talk to your people and help them with their challenges, perform an audit, study feasibility of some serious project, help with gathering relevant data sets, etc.
  2. Develop a prototype solution for a particular problem, and deliver it as a set of data, documentation, and a working web-service to allow integration in your prototypes, testing in your environment, with your clients and data.
  3. Develop a turnkey solution and integrate it into your environment. This is rather tricky as we'll prefer to work in Lisp, and not every environment will be ready for this. We're also willing to compromise and develop some non-critical integration parts in other languages when necessary, provided that the core remains Lisp-based.

Why should you come to us instead of solving the problem on your own? The current situation in cognition-related computing is that such projects have high business value but are not easy to complete: they require not just engineering but a substantial, even prevailing, research component. Productive work in this area assumes a skill set of developers and managers that is different from conventional software development. Obviously, you want to develop this expertise in-house, but growing it is currently a slow and daunting process. You should definitely still do that for the long-term benefits, but this doesn't mean that you can say something along the lines of: in the next half a year I need to solve this complex AI problem, so I'll just hire a person or a team in a couple of months and let them do it. It's a risky approach even in conventional software development, and in this field, it just doesn't work. The competition for AI researchers is insane and, moreover, if you're a regular company and not Google/Facebook or the latest-hottest startup, your chances of hiring and retaining top talent are basically nil. (Why will we be able to have the talented people while you won't? Because our particular focus, Cognitive+Tech+Distributed+Lisp, will allow us to appeal to a portion of the talent pool that is not happy in mainstream environments.)

Cognitive computing projects are risky and hard to predict. That's why for any serious long-term (longer than a couple of months) partnership we'll be dividing the work into reasonable chunks that will allow you to get at least part of the value at each checkpoint, see and assess progress, and pivot if your plans or conditions change.

We're open for business: write to us if you're interested.


Lessons from the "Algorithmics" Course

On December 28, the second cohort of attendees of the algorithms course at Projector graduated. For now, they are just "attendees" rather than full-fledged graduates, because the course is still very far from 100% of what it can and should become. But reaching that 100% requires, among other things, drawing some interim conclusions. The first attempt, held in the spring and early summer, was officially positioned as a beta version, but the second one, in fact, also turned out to be a trial, as the course was split into 2 difficulty levels and substantially reworked.

What can be said for sure is that the course is needed. This knowledge and these skills allow a programmer to move from the category of coder to that of a full-fledged developer capable of solving problems of any complexity. Such people are exactly who the Googles and Facebooks need (which is why they pay so much attention to algorithms in interviews), as do small ambitious product teams. It also matters for the programmer personally: this is one of the things that lets a programmer "spread their wings" and come to love their work by making it varied and far more rewarding. And the number of people who are actively interested in this shows that the market understands it too. Although this knowledge is basic for a university Computer Science program, far from everyone had a chance to acquire it: some entered the profession by a non-classical path, some were unlucky with their university and teachers, and some simply skipped classes and now, having grown wiser, would like to catch up.

Another matter is that different categories within this group of people are suited to different learning formats. For some, it's online learning; for others, a classical university; and only a few are, objectively, ready for the format of intensive three-month courses. Practice shows that about a third of the people who sign up for such courses and start attending quickly realize this and drop out. As a result, slightly fewer than half of the participants made it to the end of my first two courses, and far from all of them successfully got through all the material. That is, besides those who simply came to the wrong place and left right away, there is roughly the same number of people who could have gotten a good result but didn't. Why is that? First of all, of course, the reason lies in shortcomings on my side.

Finally, while I have no doubts about the demand for algorithms as a field in general, the need for an advanced course specifically has not been proven yet. Yes, the first enrollment was filled, but we made a key mistake: we didn't filter it properly, since we hoped that those who weren't sure of themselves would go to the basic course. This worked only partially. The second point: people expect something more from advanced algorithms than just algorithms. Specifically, they want machine learning. And that's understandable: everyone wants machine learning :) It was the final part of the program in the last iteration of the course (and, in my opinion, it went great), but all the rest of the material, which objectively didn't fit into the format, suffered because of it. So there will be no more machine learning within Algorithmics Pro. However, I have no doubt that it will appear very soon as a separate course, since the demand for this area is currently off the charts.

So, in the end, what is Algorithmics Pro, and are there enough people who need it? It's a difficult course and a difficult question. Its main idea is to immerse practicing programmers in solving real algorithmic problems: working with data at least gigabytes in size or with real graphs of millions of nodes; figuring out how modern databases, editors, and version control systems function; studying various optimization algorithms and applying them to problems from the world around us; getting under the hood of a server and reducing the waiting time of requests in its queue. Unfortunately, so far I'm only feeling my way in this direction. On the one hand, not all the practical problems within reach were lying on the surface (although now, after two courses, this topic has become fairly clear to me). On the other hand, there wasn't enough readiness, involvement, and commitment from the participants. Are there many programmers here who want to level up in algorithmic development, are at the proper level, and are ready to set aside a big chunk of time over three months? That is the main question, and we don't know the answer yet. Such people were definitely present in my first two courses. But, unfortunately, they could be counted on the fingers of one hand, when it should have taken at least four...

The main challenges of the course for me

I have, in fact, described the first one above. From the very first day of discussions, we said that this course should be as practical as possible. And I honestly tried to achieve that. But practicality has different aspects. One aspect is practical work, i.e., in this context, plain programming. Unfortunately, I didn't manage to get every participant involved in it, although by my estimates and past experience it seemed that this would happen naturally. The second aspect of practicality is the discussion (and implementation) of real-world examples where all of this is used. I also tried to give this due attention: in presenting cases in lectures, in giving assignments of this kind (although there were fewer of them than there could have been), and in the course project. Unfortunately, this part, as I see it, builds on the first one, i.e., active programming, and without it, it also stalled badly. This is problem number one, which I intend to actively tackle in the next course.

It also hinges on each participant having a convenient environment for such work, and on that environment being the same for all participants, so that progress can be nudged forward where needed. I started the first course by demonstrating code examples in Lisp. Not everyone liked this, although everyone had been honestly warned about it. As a result, the second version was more abstract: algorithms were described on the whiteboard without being tied to any particular language. This A/B test showed that it doesn't work that way: you need to have code at hand that you can touch and tinker with, and that can be handed to a person who is stuck so that they can move on. Given my own attachment to Lisp, and the fact that the language is well suited for implementing algorithms, I plan to keep insisting on its use. Why not Python or something else? First, many languages are not very suitable for studying algorithms at all. A vivid example is JavaScript, which is too loose and lacks full-fledged support for arithmetic and the necessary data structures. The other extreme is static languages, especially low-level ones, which, on the one hand, offer many opportunities for optimization but, on the other, bring too many restrictions (in particular, a more complicated development process) and excessive complexity. As for Python, it is more or less suitable, but I simply don't like it, and besides, there are plenty of algorithms courses in Python already. As for Lisp specifically and its peculiarities: I believe it's a good filter, which we lacked when enrolling for the previous courses. In fact, getting the hang of Lisp at the basic level needed for this course is not hard. And if a person lacks the motivation and trust to do it, that says a lot about their future motivation to overcome difficulties during the course itself. And, as practice has shown, quoting one of the course's students, "The mere fact that a lecture costs 550 UAH does not yet have a sufficiently stimulating effect" to get the homework done :(

And why pay at all?

An obvious and reasonable question that everyone who wants to take this course must answer for themselves is whether it's worth it and why pay at all. After all, there's the internet, Wikipedia, and wonderful online courses where you can learn the same things. And that is indeed so! Unlike online courses, offline courses can't be free, since they have to cover the rent of the venue and other expenses, decent pay for the teacher and staff, and yield some profit for the organizers. And the freemium model used by Coursera and others is not applicable to them. And, in general, you have to pay for everything in life.

But if you look at it from the practical side, the ROI of any education is the ratio of the result obtained to the money and time spent on achieving it. In theory, offline courses can win thanks to a higher average result and lower time costs. What can go into this better result?

  • first, however trite it may sound, a "magic kick", i.e., external motivation to run this race from start to finish. The money invested is also part of this motivation, although, as practice shows, not a sufficient one. So far, these courses have lacked the competitive element inherent in classical education, and this is another area that needs a bit of work (I have some ideas)
  • second, the opportunity for personal communication with the teacher and other students. For me, this is actually one of the main motivations for doing this course: the opportunity to interact with programmers who are searching and want to grow in the profession. The paradox is that even though I get decent money for this course, I still earn more at my main job. That is, the smaller earnings have to be compensated by something else. For me, this something else is the opportunity for co-creation with the course participants. And this means that we should be on the same wavelength and, first of all, want to come to class and make the most of it. Ideally, the connections formed during the course should be one of the main long-term assets after it ends
  • a comfortable environment for learning and communication, and a sense of belonging to a community. Projector is doing a very important thing by building, on the basis of its venue, communities of professionals in design, product development, and programming (and, in the future, I think, in other areas as well)

Who should and shouldn't take the algorithms courses at Projector

For me, this is, actually, the key question of this whole topic. Neither I nor Projector aim for mass appeal or windfall profits. First, that's not sustainable and will end in a fizzle; second, you won't get any inner satisfaction from such work. Between a group of 8-10 motivated people who know where they've come and why, and 20 casually interested ones, I choose the first option, although the second is, actually, easier. The first two iterations of the course were a search: a search for the right format and the audience that fits it.

My conclusion is the following: these courses suit those who

  • already have some programming experience (ideally, at least a couple of years)
  • have realized the value of algorithms for themselves and won't be tormented by the eternal question of the Ukrainian student: "where is all this applied in real life?" The answer, from my point of view: those who seek will find. There is demand for algorithmic programmers, and although it's rather niche, niches always offer both more interest and higher income
  • are ready (both psychologically and organizationally) to steadily devote at least 10 hours a week to these classes for 3 months and, more importantly, to give them the main resource of their brain. In practice, this means that during this time you won't be able to work at full intensity. As these 2 courses have shown, the best time to take part in this adventure is either a break between jobs or the final years of university. Those who try to work intensively in development and study at the same time either give up on the course, or complain that their work starts to suffer, or take a vacation to catch up. It may also work out for those whose current job doesn't involve actively writing code. But if you have just changed jobs, are about to finish an important project, are expecting a baby (yes, such cases have already happened :), or are going to leave for a vacation or a business trip in the middle of the course, then this format is definitely not for you


5 Steps to Grasping Modern ML

Recently, I've been teaching an advanced Algorithms course, which concluded with a short introduction to Machine Learning. Obviously, ML is its own track in the Computer Science curriculum, but, nevertheless, there's a substantial overlap between these 2 disciplines: algorithms and ML. However, ML adds another dimension that is not usually considered in the world of algorithmic thinking.

Anyhow, this experience helped me formulate the minimal selection of concepts that need to be grasped in order to start practical ML work. An ML crash course, so to say.

As I've never seen such a compilation, I'd like to share it in this post. Here are the 5 essential steps to understanding modern ML.

Step 1. Understanding the ML problem formulation. kNN algorithm

The first thing one needs to realize is the difference between an ML problem and the common programming problems. Here training/test data and an objective function should be explained alongside the 3 common "learning" approaches: supervised, unsupervised, and reinforcement. A widely used and good initial example is the Iris data set with the kNN algorithm.
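To make the problem formulation concrete, here is a minimal kNN sketch in plain Common Lisp. The function names and the tiny training set are my own illustration: the numbers are made up in the spirit of Iris (petal length and width), not taken from the real data set.

```lisp
;; Toy training data: (petal-length petal-width) . species.
;; The values are invented for illustration, not real Iris measurements.
(defparameter *training-set*
  '(((1.4 0.2) . :setosa)
    ((1.3 0.2) . :setosa)
    ((1.5 0.3) . :setosa)
    ((4.5 1.5) . :versicolor)
    ((4.1 1.3) . :versicolor)
    ((4.7 1.4) . :versicolor)))

(defun squared-distance (a b)
  "Squared Euclidean distance between two feature lists of equal length."
  (reduce #'+ (mapcar (lambda (x y) (expt (- x y) 2)) a b)))

(defun knn-classify (k query training-set)
  "Return the majority label among the K training examples nearest to QUERY."
  (let* ((sorted (sort (copy-list training-set) #'<
                       :key (lambda (example)
                              (squared-distance query (car example)))))
         (neighbors (subseq sorted 0 (min k (length sorted))))
         (votes (make-hash-table)))
    ;; count the labels of the nearest neighbors
    (dolist (neighbor neighbors)
      (incf (gethash (cdr neighbor) votes 0)))
    ;; pick the label with the most votes
    (let ((best-label nil)
          (best-count 0))
      (maphash (lambda (label count)
                 (when (> count best-count)
                   (setf best-label label
                         best-count count)))
               votes)
      best-label)))

(knn-classify 3 '(1.6 0.25) *training-set*)  ; => :SETOSA
(knn-classify 3 '(4.4 1.4) *training-set*)   ; => :VERSICOLOR
```

Note that there is no training phase at all: the "model" is the stored data plus a distance function, which is what makes kNN a convenient first example.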

Step 2. Adding features and iterative training into the picture. Perceptron algorithm

The second step is the introduction of the concept of feature extraction that allows approaching a problem from different angles. The Iris data set already has features, but initially they may be perceived as given. Iterative training is another common ML approach (although some popular algorithms like kNN or Decision Trees don't rely upon it). Perceptron is the simplest algorithm to explain (which still remains in practical use) and leads nicely to the next step.

A good example task and data set for this part is the Brown Corpus and the problem of POS tagging. And there's a great post outlining its solution by Matthew Honnibal.
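The iterative training idea fits in a few lines. The following is a sketch of the classic perceptron update rule on a toy problem (learning logical AND), which I've picked for brevity; it is not the POS tagger from the post mentioned above. All names are illustrative, and the first feature of every example is a constant 1 that plays the role of a bias.

```lisp
;; Each example is (features . label) with label 1 or -1.
(defparameter *and-examples*
  '(((1 0 0) . -1)
    ((1 0 1) . -1)
    ((1 1 0) . -1)
    ((1 1 1) .  1)))

(defun dot (weights features)
  "Dot product of two lists of numbers."
  (reduce #'+ (mapcar #'* weights features)))

(defun predict (weights features)
  "Discrete decision: 1 if the weighted sum is positive, -1 otherwise."
  (if (> (dot weights features) 0) 1 -1))

(defun train-perceptron (examples n-features &key (epochs 10))
  "Pass over EXAMPLES for EPOCHS rounds, adjusting the weights only on mistakes."
  (let ((weights (make-list n-features :initial-element 0)))
    (dotimes (epoch epochs weights)
      (declare (ignorable epoch))
      (dolist (example examples)
        (destructuring-bind (features . label) example
          (unless (= label (predict weights features))
            ;; mistake-driven update: move the weights towards the right answer
            (setf weights (mapcar (lambda (w x) (+ w (* label x)))
                                  weights features))))))))

(let ((weights (train-perceptron *and-examples* 3)))
  (mapcar (lambda (example) (predict weights (car example)))
          *and-examples*))
;; => (-1 -1 -1 1)
```

The weights change only when a prediction is wrong, which is the "discrete" learning that the next step replaces with a continuous one.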

Step 3. Continuous vs discrete learning, gradient descent. Softmax algorithm

The obvious next step is transitioning from discrete perceptron learning to the continuous gradient descent used in Logistic regression. Andrew Ng provides a lucid connection in Part II & III of his tutorial on Linear Models. It also helps that Logistic regression and Softmax are the basic building blocks of Neural Networks that are to be discussed next. The example task for this problem may remain the same POS tagging, although others, like the ones used by Andrew, may also be utilized.
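For comparison with the perceptron, here is a sketch of binary logistic regression trained with stochastic gradient descent, again on the toy AND problem (my choice, not an example from the tutorial). The names, the learning rate and the number of epochs are assumptions picked for illustration.

```lisp
;; Each example is (features . label) with label 0 or 1;
;; the first feature is a constant 1 acting as a bias.
(defparameter *and-examples*
  '(((1 0 0) . 0)
    ((1 0 1) . 0)
    ((1 1 0) . 0)
    ((1 1 1) . 1)))

(defun sigmoid (z)
  "Squash a real number into the (0, 1) interval."
  (/ 1d0 (+ 1d0 (exp (- z)))))

(defun dot (weights features)
  "Dot product of two lists of numbers."
  (reduce #'+ (mapcar #'* weights features)))

(defun probability (weights features)
  "Predicted probability that the label is 1."
  (sigmoid (dot weights features)))

(defun train-logistic (examples n-features &key (epochs 1000) (rate 0.5d0))
  "Stochastic gradient descent on the log-loss."
  (let ((weights (make-list n-features :initial-element 0d0)))
    (dotimes (epoch epochs weights)
      (declare (ignorable epoch))
      (dolist (example examples)
        (destructuring-bind (features . label) example
          ;; the update is proportional to the continuous error,
          ;; so every example nudges the weights, not only the mistakes
          (let ((err (- label (probability weights features))))
            (setf weights (mapcar (lambda (w x) (+ w (* rate err x)))
                                  weights features))))))))

(let ((weights (train-logistic *and-examples* 3)))
  (mapcar (lambda (example) (probability weights (car example)))
          *and-examples*))
```

The code differs from the perceptron in a single place: the update is scaled by how far the predicted probability is from the label, instead of firing only on a wrong discrete prediction.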

Step 4. Learning graphs (aka neural nets), backprop. Feed-forward Neural Network algorithm

As soon as we understand gradient descent and logistic regression, it's rather easy to make the next step to forming layers of such blocks to allow the combined model to "learn" higher-level feature representations. This is where the Backprop algorithm for efficient training comes into play (that is, by the way, another example of a dynamic programming algorithm). Also in this part, it's possible to talk about vector representations of words and other highly contextualized objects (landmark position in an image, etc.). A great explanation of Backprop is presented in this post by Christopher Olah. Also, a good example data set here is MNIST.
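The dynamic programming flavor of Backprop shows even in the smallest possible network. Below is a sketch for a hypothetical 1-1-1 network (one input, one hidden sigmoid unit, one output sigmoid unit, squared-error loss): the error term of the output layer is computed once and reused for the hidden layer. The names are mine, and the numeric check is there only to validate the analytic gradients.

```lisp
(defun sigmoid (z)
  (/ 1d0 (+ 1d0 (exp (- z)))))

(defun forward (w1 w2 x)
  "Return the hidden activation H and the output Y of the 1-1-1 network."
  (let* ((h (sigmoid (* w1 x)))
         (y (sigmoid (* w2 h))))
    (values h y)))

(defun loss (w1 w2 x target)
  "Squared error of the network output, halved."
  (multiple-value-bind (h y) (forward w1 w2 x)
    (declare (ignore h))
    (* 0.5d0 (expt (- y target) 2))))

(defun backprop (w1 w2 x target)
  "Return dLoss/dW1 and dLoss/dW2 as two values."
  (multiple-value-bind (h y) (forward w1 w2 x)
    (let* ((delta-out (* (- y target) y (- 1 y)))    ; error at the output unit
           (delta-hid (* delta-out w2 h (- 1 h))))   ; reuses DELTA-OUT
      (values (* delta-hid x)
              (* delta-out h)))))

(defun numeric-gradient (fn w &optional (eps 1d-6))
  "Central-difference estimate of the derivative of FN at W."
  (/ (- (funcall fn (+ w eps)) (funcall fn (- w eps)))
     (* 2 eps)))

;; Compare analytic and numeric gradients at an arbitrary point.
(let ((w1 0.3d0) (w2 -0.7d0) (x 1.5d0) (target 1d0))
  (multiple-value-bind (grad-w1 grad-w2) (backprop w1 w2 x target)
    (list (- grad-w1 (numeric-gradient (lambda (w) (loss w w2 x target)) w1))
          (- grad-w2 (numeric-gradient (lambda (w) (loss w1 w x target)) w2)))))
;; both differences should be very close to zero
```

In a real network the same reuse happens layer by layer with matrices instead of scalars, which is what makes training efficient.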

Step 5. Bias-variance tradeoff, regularization & ensembles. Random Forest algorithm

Finally, we should return to the beginning and revisit the learning problem, but with some practical experience already under our belt. This is where the essential bias-variance tradeoff and common ways to tackle it should be discussed: regularization and ensembles. It's also a good place to introduce the Decision Tree algorithms and the ensemble methods based upon them (Random Forest and, maybe, others) as one of the most widely-used current practical approaches.

"Winning the Bias-Variance Tradeoff" by Julia Evans may be a good introductory text on this.

Overall, due to the highly condensed nature of such a presentation, a lot of important things will be barely covered. For example, unsupervised learning, CV with its convolutions, sequence models. However, I believe that with the obtained knowledge and conceptual understanding of the mentioned basis those parts may be grasped quite easily.

If this plan turns out helpful to you or some essential improvements are necessary, please, leave your thoughts and comments...


The Technology Company Case

What's a Technology Company?

I'm a programmer. Obviously, this means that I have to earn money and realize my talents working for some company that employs programmers (or on my own). It's worth noting that there are several kinds of such companies.

One is traditional enterprises, like banks or government agencies, that need programmers to automate their processes and improve output. Every company needs an accountant, and, likewise, nowadays every company needs a programmer.

There are also companies that provide software development and related services - the so-called consulting or outsourcing firms. They employ programmers to automate the work and improve the output of, mainly, the first breed of companies.

Then, there are also technology product companies, like Instagram or Apple, that employ engineers to build their products, services or media, which are then consumed by ordinary people.

Finally, there are truly technology companies that produce new technology that is used by all the previous three groups, as well as by the technology companies themselves. From the business standpoint, this technology may be supplied either in the form of on-the-spot consulting work, licensing or even separate products.

Every group has some percentage of technology work in its operation. This work, often called R&D, comprises implementation of existing technology (the D in R&D) and creation of new technology (the R). The share of the two differs substantially between the groups. The companies from the first group may be 1 to 10% dependent on R&D work and have almost 0% of R in it; the second group is 90% R&D work, still with just a few percent of R in it; the third group is just 30-50% R&D, and the share of R in it may rise to 10-20% but rarely more; and the last group should have 90% R&D with >50% R in it.

A technology company should be a thought leader in its sphere. This means not chasing fashions in our pop-culture-like industry but setting an example justified by technological excellence instead of marketing. This means building something that will last and have an impact for a substantially longer period of time than the ever-accelerating hype cycle. This means having an ultimate goal of solving hard technical problems and not chasing profits or market share. While product companies try to change the world by producing their innovative products that merely use technology, a technology company does that by producing technology that enables more innovative products. A closed vs an open approach.

10x Programmers

There's this popular meme of 10x programmers that constantly spurs discussion and flamewars among our peers. Is it just a fad, who are those 10xers, do they really exist?

Let's first consider this question from the perspective of other crafts and professions. Are there 10x painters? Well, if we compare painter productivity by the number of pieces drawn it would be hard to tell. But if you think about price, clearly, there are even 1000x ones: an ordinary painter's work may cost $1000, and a famous masterpiece will be in the millions. If we consider the number of people reached the same rule applies: maybe, thousands will see quality works of a common professional painter, and millions or even billions - the works of a master. But you may say that painting, unlike programming, is an art. What about carpentry? Well, I'd compare with professions that require mostly intellectual work. Are there 100x doctors? Surely, there are those who saved 100x more people by inventing a new operation method or treatment. Lawyers? A person who writes a law impacts orders of magnitude more than an ordinary counselor at some random firm. This list may be continued on and on.

I've compiled a book called "Interviews with 100x programmers". To some extent, the name was an exaggeration. But, as they say, every joke has some truth in it. In fact, I fully subscribe to the 10x programmer concept. Moreover, I consider that there are not only 10x ones but also 100x, 1000x... Definitely, there are hardly any 10x coders, i.e. people who produce 10x the amount of code a good professional programmer will create in the same timeframe. But there's much more to programming than merely writing program code.

To be an order of magnitude more productive means to solve problems an order of magnitude more complex than the ones considered accessible at a given point in time. Obviously, such problems exist, and there will, probably, always be an unlimited supply of them. Also, it should be clear from the short history of computing that there are some people capable of bringing a new perspective, coming up with approaches that allow solving such problems either in a much better way or just solve them at all. As Alan Kay, who's for sure one of such 100x programmers, has famously said: "A change in perspective is worth 80 IQ points."

Still, there's more to it than just solving harder problems. Another popular explanation given to the 10x thing is that such a programmer is the one who makes 10 other programmers 2x more productive. This, from my point of view, implies the one who is showing a better approach, in other words, a thought leader, and the one who implements this vision in some technology that other programmers use. In fact, we're productive in our work at our current level mostly thanks to such prolific programmers: every day I use Unix, Emacs, Lisp, git and other tools that were initially conceived and built by a handful of the 10x programmers. Their vision and impulse made thousands and even millions more productive.

Those 10x programmers are the ones I'd like to be around at work. And so, my ideal company is the one that attracts such persons. And although a significant percent of such people are loners, most of them are also highly motivated by the presence of similar colleagues.

So which one of the 4 company types mentioned above will such people choose?

The first one is mostly out of consideration because in it the programmers are not the primary value creators - on the contrary, often they are considered a cost center. I.e. they are just another service function similar to an accountant or a janitor. Surely, there are exceptions to this rule when the company leaders realize the potential that technology change bears to their company, which, basically, means that the firm is transitioning to type 3. Even in such case, it's still a much less productive environment than a type 3 firm built with the right principles in mind from the start.

What about outsourcing companies? Their advantage is that programmers are their primary asset, which means that the company will be built around them, have a substantial number of them and will do a lot to attract and hold prominent people. The nature of work, unfortunately, is usually a severely limiting factor here. First of all, in most of the cases, the customer doesn't really care about the technological excellence or innovative nature of the result. The projects are in most of the cases counter-innovative, i.e. the more mundane, reproducible, and ordinary the technological solution that achieves the desired result is, the better. And it's quite reasonable from the business standpoint: innovation is risky. This means that, ultimately, such companies reward uniformity and interchangeability of their staff and their output, especially since it's much easier to manage and scale. Have I mentioned that managing programmers is very hard (the common metaphor used is "herding cats")?

Now, let's look at product companies. Are they a haven for 10x programmers? Well, a lot of such people flock there. One reason is that such companies understand the need for talented programmers because, unlike the previous 2 types, they may and should face unique technological challenges, and, moreover, their leadership is able to recognize that (type 1 companies also face those challenges, but usually they just don't view them from the technology standpoint). Yet, a product company is only X% new technology and another (100-X)% other things. What is the value of X? Maybe, it's 20-30% at Google or Facebook, and even less at smaller companies with fewer resources. Why? Because, as we discussed above, the ultimate goal of most of such companies is making money by serving masses of customers. This requires huge marketing, sales, operations, and support "vehicles" that employ professionals to operate and programmers to build, maintain and develop, but offer quite few interesting technical challenges. Once again, this is the right thing from the business standpoint, especially if you have to earn more and more money each year and grow your market share. But focus on earnings and market share means that technological excellence becomes secondary. Surely, the best of the leaders and managers realize its importance, but they have to make many trade-offs all the time.

That's why I have singled out "pure" technology companies. Such organizations are naturally inclined to make tech excellence their focus. There are, surely, counterexamples that are infected with the Silicon Valley "growth virus" and try to win the market as fast as possible with marketing, but it doesn't mean that it always has to work that way. In my opinion, purely technological companies are the best place for 10x programmers because they will not merely utilize their work to some other end goal but have a vested interest in amplifying its influence. They are not inclined to conceal the know-how and innovations as trade secrets, but will benefit from sharing and promoting them. They may also provide maximum freedom of choice: of approaches, tools, supporting technologies, because their primary concern is not effective scaling of the same ultimately repetitive work to many similar programmers but creating breakthroughs. Their dependence on such ultra-productive programmers is existential.

I don't consider myself to be a 10x programmer, but, surely, I'd like to reach such a level someday and I also aspire to work alongside them.

A Company I'd Build

All in all, being part of a technology company seems like the best choice for me both in terms of potential impact and possibilities to have 10x programmer colleagues. Eventually, either you have to join one or create one yourself. For the last 5 years, I've been working in the so-called AI, and my experience both from the product company side and from individual consulting work shows that demand for research-related technology expertise here is growing much faster than the supply. I see it as a chance for new technology companies to emerge and gather those few capable people in this field to amplify their impact. So I'm seriously considering starting a technology company, and I'm looking for like-minded people who share my values and vision to join forces.

If I were to start such a company, I'd build its foundation on a few things that really matter to me personally. Some principles or, as they used to call them, values. Unfortunately, the notion of "values" has somewhat lost its original meaning in the corporate world. When you see such qualities as effectiveness or adaptability cast as values, that's a sign of such a misconception. Values are something that you don't compromise upon at all. Surely, it's pointless to compromise any parts of your professionalism (such as effectiveness), so professionalism is a default value not even worth discussing. Real "values", however, are those aspects of your work culture that run a real risk of conflicting with the things that are considered universally important. In business, those are profits, market share, favorable competitive position. So, being true to your values means not forfeiting them even if you're going to lose in those basic areas.

Here is a list of the values that I subscribe to:

  • Technological excellence should be a basic trait of any technology company. For me, an example of applying such value would be using Lisp as a starting point for most of the solutions despite the fact that the language is quite unpopular and underappreciated - my personal experience shows that it works very well, especially in the fields that are heavily knowledge-based. Another example is that in a technology company literally everyone should be technology-savvy: even the office manager should be programming at times.
  • Personalism is the main quality that a company has to support in its dealings with all the people it's interacting with: employees, customers, contractors and providers. This means, for example, striving to provide flexible and productive working conditions to each employee instead of trying to fit everyone in the same conditions (because management is hard). Overall, lack of management competency should never become a limiting factor. One manifestation of this is that a modern technology company should be built as a distributed organization from day 1.
  • Ahimsa is an ancient word meaning not harming anyone. It is a little bit more than our modern-day ethics, but it's worth it. Why create something if you know that it will cause misery and suffering to others? In effect, this means, for example, refusal to provide services to companies that are clearly unethical.
  • Radical openness. As they say, "information wants to be free." :) Maximal sharing and minimal secrecy makes so many things much simpler. And in our lowest-common-denominator technology world, ultimately, the risk of competitors copying and abusing your work is much less than that of brilliant people not joining your cause because they just haven't heard of it.

So... If you're interested in solving complex AI challenges out of whatever part of the world you're living in, working with 10x programmers, using Lisp and other advanced technologies in the process - drop me a line, I'd be glad to chat.


Improving Lisp UX One Form at a Time

At the recent ELS, I presented a lightning talk about RUTILS and how I see it as a way of "modernizing" CL, i.e. updating the basic language elements to be simpler, clearer and more generic, thus improving the everyday user experience and answering the complaints of outsiders about "historical cruft" in the Lisp standard. Indeed, Lisp has a lot of unrecognizable names (like mapcar and svref) or just unnecessarily long ones (multiple-value-bind or defparameter), and out-of-the-box it lacks a lot of things that many current programmers are used to: unified generic accessors, generators, literal syntax for defining hash-tables or dynamic vectors etc. This may not be a problem for the people working with the language on a regular basis (or if it is they probably have a personal solution for that already), but it impedes communication with the outside world. I've paid extra attention to that recently as I was preparing code examples for the experimental course on algorithms, which I teach now using Lisp instead of pseudocode (actually, modulo the naming/generics issue, Lisp is a great fit for that).

Unfortunately, the lightning talk format is too short for a good presentation of this topic, so here's a more elaborate post, in which I want to show a few examples from the RUTILS library of using Lisp's built-in capabilities to introduce clear, uniform, and generic syntactic abstractions that may be used alongside the standard Lisp operators, as well as replace them in the cases when we want to get more concise and understandable code.

What's cool about this problem is that, in Lisp, besides a common way to extend the language with functions and methods (and even macros/templates, which find their way into more and more languages), there are several other approaches that allow tackling issues that can't be covered by functions and even macros. Those include, for instance, reader macros and aliasing. Aliasing is, actually, a rather simple idea (and can be, probably, implemented in other dynamic languages): duplicating functionality of existing functions or macros with a new name. The idea for such an operator came from Paul Graham's "On Lisp" and it may be implemented in the following way (see a full implementation here):

(defmacro abbr (short long &optional lambda-list)
  `(progn
     (cond
      ((macro-function ',long)
       (setf (macro-function ',short) (macro-function ',long)))
      ((fboundp ',long)
       (setf (fdefinition ',short) (fdefinition ',long))
       ;; when a lambda-list is given, also copy the setf-expander
       ,(when lambda-list
          `(define-setf-expander ,short ,lambda-list
             (values ,@(multiple-value-bind
                           (dummies vals store store-form access-form)
                           (get-setf-expansion
                            (cons long (remove-if (lambda (sym)
                                                    (member sym '(&optional &key)))
                                                  lambda-list)))
                         (let ((expansion-vals (mapcar (lambda (x) `(quote ,x))
                                                       (list dummies
                                                             vals
                                                             store
                                                             store-form
                                                             access-form))))
                           (setf (second expansion-vals)
                                 (cons 'list vals))
                           expansion-vals))))))
      (t (error "Can't abbreviate ~a" ',long)))
     (setf (documentation ',short 'function) (documentation ',long 'function))
     ',short))

As you may have noticed, it is also capable of duplicating a setf-expander for a given function if the lambda-list is provided. Using abbr we can define a lot of shorthands or alternative names, and it is heavily used in RUTILS to provide more than 50 alternative names; we'll see some of them in this post. What this example shows is the malleability of Lisp, which allows approaching its own improvement from different angles depending on the problem at hand and the tradeoffs you're willing to make.
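To show the aliasing idea in isolation, here is a stripped-down sketch that runs without RUTILS. The name simple-abbr is hypothetical, and this version deliberately leaves out the setf-expander and documentation handling of the full abbr; the two aliases defined with it are picked just for illustration.

```lisp
(defmacro simple-abbr (short long)
  "Make SHORT another name for the macro or function LONG (illustrative sketch)."
  ;; EVAL-WHEN makes the alias available to code compiled later in the same file
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (cond ((macro-function ',long)
            (setf (macro-function ',short) (macro-function ',long)))
           ((fboundp ',long)
            (setf (fdefinition ',short) (fdefinition ',long)))
           (t (error "Can't abbreviate ~a" ',long)))
     ',short))

(simple-abbr mv-bind multiple-value-bind)  ; aliasing a macro
(simple-abbr call funcall)                 ; aliasing a function

(mv-bind (quotient remainder) (floor 7 2)
  (list quotient remainder))
;; => (3 1)

(call #'+ 1 2 3)
;; => 6
```

Both the original names and the aliases stay usable, so code written against the standard continues to work unchanged.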

Introducing generic element access

One of the examples of historic baggage in CL is a substantial variety of different methods to access elements of collections, hash-tables, structures, and objects with no generic function unifying them. Not to say that other languages have a totally uniform accessor mechanism. Usually, there will be two or three general-purpose ways to organize it: dot notation for object field access, something square-bracketish for array and other collections access, and some generic operator like get for all the other cases. And occasionally (e.g. in Python or C++) there are hooks to plug into the built-in operators. Still, it's a much smaller number than in Lisp, and what's more important, it's sufficiently distinct and non-surprising.

In Lisp, actually, nothing prevents us from doing even better — both better than the current state and than other languages — i.e. from having a fully uniform and extensible solution. At first approximation, it's just a matter of defining a generic function that will work on different container types and utilize all the existing optimized accessor functions in its methods. This interface will be extensible for any container object. In RUTILSX (a part of RUTILS where any experiments are allowed) this function is called generic-elt:

(defgeneric generic-elt (obj key &rest keys)
  (:method :around (obj key &rest keys)
    (reduce #'generic-elt keys :initial-value (call-next-method obj key))))

One important aspect you can see in this definition is the presence of an :around method that allows chaining multiple accesses in one call and dispatching each one to an appropriate basic method via call-next-method. Thus, we may write something like (generic-elt obj 'children 0 :key) to access, for instance, an element indexed by :key in a hash-table that is the first element of a sequence that is the contents of the slot children of some object obj.
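Here is a self-contained sketch of how the chaining plays out. The :around method is the one shown above, while the four primary methods and the node class are my assumptions for the sake of the example; the actual RUTILSX methods may differ.

```lisp
(defgeneric generic-elt (obj key &rest keys)
  (:method :around (obj key &rest keys)
    ;; access the first KEY, then apply the remaining KEYS one by one
    (reduce #'generic-elt keys :initial-value (call-next-method obj key)))
  ;; illustrative primary methods delegating to the standard accessors
  (:method ((obj list) key &rest keys)
    (declare (ignore keys))
    (nth key obj))
  (:method ((obj vector) key &rest keys)
    (declare (ignore keys))
    (aref obj key))
  (:method ((obj hash-table) key &rest keys)
    (declare (ignore keys))
    (gethash key obj))
  (:method ((obj standard-object) key &rest keys)
    (declare (ignore keys))
    (slot-value obj key)))

;; A hypothetical class used only for this example.
(defclass node ()
  ((children :initarg :children)))

(defparameter *obj*
  (let ((table (make-hash-table)))
    (setf (gethash :key table) 42)
    (make-instance 'node :children (list table))))

(generic-elt *obj* 'children 0 :key)
;; => 42
```

Supporting a new container type means adding one more primary method; the chaining logic in the :around method stays untouched.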

The only problem with this function is its long name. Unfortunately, most of the good short element access names, like elt and nth, are already taken in the Common Lisp standard, while for RUTILS I've adopted a religious principle to retain full backward compatibility and not alter anything from the standard. This is a critical point: not redefining CL, but building on top of it and extending it!

Moreover, element access has two features: it's a very common operation and it's also not a usual function that does some computation, so ideally it should have a short but prominent look in the code. The perfect solution occurred to me at one point: introduce an alias ? for it. Lisp allows naming operations with any characters, and a question mark, in my opinion, matches very well the inner intent of this operation: query a container-like object using a certain key. With it, our previous example becomes very succinct and cool: (? obj 'children 0 :key).

In addition to element reading, there's also element write access. This operation in Lisp, like in most other languages, has a unified entry point called setf. There's a special interface to provide specific "methods" for it based on the accessor function. Yet, what to do when an access function is polymorphic? Well, provide a polymorphic setter companion: (defsetf generic-elt generic-setf). Like generic-elt, generic-setf defers work to already defined specific setters:

(defmethod generic-setf ((obj list) key &rest keys-and-val)
  (setf (nth key obj) (atomize keys-and-val)))

And it also supports key chaining, so you can write: (setf (? obj 'children 0 :key) new-value).

Having this unified access functionality is nice and cool, but some people may still long for the familiar dot object slot access syntax. We can't blame them: habits are a basis of good UX. Unfortunately, this is contrary to the Lisp way... But Lisp is a pro-choice and future-proof language: if you want something badly, even something not in the usual ways, almost always you can, actually, find a clean and supported means of implementing it. And this case is not an exception. If you can tolerate a small addition, a @-prefix to the object reference (that's also an extra prominent indicator of something unusual going on), when accessing its slots, you can define a reader macro that will expand forms @obj.slot into our (? obj 'slot) or a standard (slot-value obj 'slot). With it, we can write something like (? tokens @dep.govr.id), which is much more succinct and, arguably, more readable than (elt tokens (slot-value (slot-value dep 'govr) 'id)).
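A reader macro of this kind may be sketched as follows. This is an illustration under my own assumptions, not the RUTILS implementation: the function names are hypothetical, the expansion targets slot-value, and the macro character is installed in a private copy of the readtable so that the global syntax is not affected.

```lisp
(defun split-on-dots (string)
  "Split STRING at every dot and return the list of pieces."
  (loop with start = 0
        for pos = (position #\. string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun read-slot-chain (stream char)
  "Read obj.slot1.slot2 following CHAR and build nested SLOT-VALUE forms."
  (declare (ignore char))
  (destructuring-bind (object &rest slots)
      (mapcar #'intern
              (split-on-dots (symbol-name (read stream t nil t))))
    (reduce (lambda (form slot) `(slot-value ,form ',slot))
            slots
            :initial-value object)))

(defun read-with-slot-syntax (string)
  "Read one form from STRING with @ bound to READ-SLOT-CHAIN."
  (let ((*readtable* (copy-readtable)))
    (set-macro-character #\@ #'read-slot-chain)
    (read-from-string string)))

(read-with-slot-syntax "(elt tokens @dep.govr.id)")
;; => (ELT TOKENS (SLOT-VALUE (SLOT-VALUE DEP 'GOVR) 'ID))
```

To use the syntax in source files, the same set-macro-character call would be applied to the readtable that is active when those files are read.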

Still, one issue remains unsolved in this approach: the preferred Lisp slot-access method is not via slot-value, but with an accessor method that is exported. And one of the reasons for it is that slot-names, which are usually short and can clash, are kept private to the package where they are defined. It means that in most cases @obj.slot will not work across packages. (Unlike the OO-languages in which every class is its own namespace, in Lisp, this function is not "complected" within the OO-system, and packages are a namespacing method, while objects serve for encapsulation and inheritance.)

There are two ways to tackle this problem. As I said, Lisp is future-proof: being thoroughly dynamic and extensible, CLOS defines a method that is called when there's a problem accessing an object's slot — slot-missing. Once again, we can define an :around method that will be a little smarter (?) and try to look up slot-name not only in the current package, but also in the class' original package.

(defmethod slot-missing :around
    (class instance slot-name (operation (eql 'slot-value)) &optional new-value)
  (declare (ignore new-value))
  (let ((class-package (symbol-package (class-name (class-of instance)))))
    (if (eql class-package (symbol-package slot-name))  ;; to avoid infinite looping
        (call-next-method)
        ;; retry with the same-named symbol from the class's package, if any
        (if-it (find-symbol (string-upcase slot-name) class-package)
               (slot-value instance it)
               (call-next-method)))))

This is a rather radical way and comes at a cost: two additional virtual function calls (of the slot-missing method itself and an additional slot-value one). But in most of the cases it may be worth paying it for convenience's sake, especially since you can always optimize a particular call-site by changing the code to the most direct (slot-value obj 'package::slot) variant. By the way, using a slot accessor method is also costlier than just slot-value, so we are compensating here somewhat. Anyway, it's cool to have all the options on the table: a beautiful slow method and an ugly fast one that our backward-compatibility approach allows us. As usual, you can't have a cake and eat it too...

Though, sometimes, you can. :) If you think more about this, it becomes apparent that slot-value could be implemented this way from the start: look up the slot name in the class's original package. As classes or structs are defined together with their slots, it is very rare if not almost impossible to see slot-names not available in the package where their class is defined (you have to explicitly use a private name from another package when defining a class to do such a trick). So, slot-value should always look for slot names in the class's package first. We can define a "smart" slot-value variant that will do just that, and with our nice generic-elt frontend it can be easily integrated without breaking backward-compatibility.

(defun smart-slot-value (object slot-name)
  (slot-value object
              ;; prefer the symbol from the class's own package
              (or (find-symbol (string-upcase slot-name)
                               (symbol-package (class-name (class-of object))))
                  slot-name)))

Unifying variable binding with with

Almost everything in functional variable definition and binding was pioneered by Lisp at some point, including the concept of destructuring. Yet, the CL standard, once again, lacks unification in this area. There are at least 4 major constructs: let and let*, destructuring-bind and multiple-value-bind, and also a few specialized ones like with-slots or ppcre:register-groups-bind. One more thing to mention is that parallel assignment behavior of plain let can be implemented with destructuring-bind and multiple-value-bind. Overall, it just screams for uniting in a single construct, and already there have been a few attempts to do that (like metabang-bind). In RUTILS, I present a novel implementation of generic bind that has two distinct features: a more plausible name — with — and a simple method-based extension mechanism. The implementation is very simple: the binding construct selection is performed at compile-time based on the structure of the clause and, optionally, presence of special symbols in it:

(defmacro with ((&rest bindings) &body body)
  (let ((rez body))
    (dolist (binding (reverse bindings))
      (:= rez `((,@(call #'expand-binding binding rez)))))
    (first rez)))

A small number of methods covering the basic cases are defined:

  • the first one expands to let or multiple-value-bind depending on the number of symbols in the clause (i.e. for multiple values the clause should have more than 2 elements)
  • the second group triggers when the first element of the clause is a list and defaults to destructuring-bind, but has special behaviors for two symbols, ? and @, which generate clauses for our generic element access and smart slot access discussed in the previous section

(defun expand-binding (binding form)
  ;; Splice the already expanded inner forms into the binding form.
  (append (apply #'bind-dispatch binding)
          form))

(defgeneric bind-dispatch (arg1 arg2 &rest args)
  ;; (var val) or (var1 var2 ... values-form)
  (:method ((arg1 symbol) arg2 &rest args)
    (if args
        `(multiple-value-bind (,arg1 ,arg2 ,@(butlast args)) ,(last1 args))
        `(let ((,arg1 ,arg2)))))
  ;; ((var key) ...) ? obj : generic element access
  (:method ((arg1 list) (arg2 (eql '?)) &rest args)
    `(let (,@(mapcar (lambda (var-key)
                       `(,(first (mklist var-key))
                         (? ,(first args) ,(last1 (mklist var-key)))))
                     arg1))))
  ;; ((var slot) ...) @ obj : smart slot access
  (:method ((arg1 list) (arg2 (eql '@)) &rest args)
    (with-gensyms (obj)
      `(let* ((,obj ,(first args))
              ,@(mapcar (lambda (var-slot)
                          `(,(first (mklist var-slot))
                            (smart-slot-value ,obj ',(last1 (mklist var-slot)))))
                        arg1)))))
  ;; (lambda-list form) : plain destructuring
  (:method ((arg1 list) arg2 &rest args)
    (declare (ignore args))
    `(destructuring-bind ,arg1 ,arg2)))

In a sense, it's a classic example of combining generic functions and macros to create a clean and extensible UI. Another great benefit of using with is reduced code nesting, which can become quite deep with the standard operators. Here's one of the examples from my codebase:

(with (((stack buffer ctx) @ parser)
       (fs (extract-fs parser interm))
       (((toks :tokens) (cache :cache)) ? ctx))
  ...)

And here's how it would have looked in plain CL:

(with-slots (stack buffer ctx) parser
  (let ((fs (extract-fs parser interm))
        (toks (gethash :tokens ctx))
        (cache (gethash :cache ctx)))
    ...))
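
For readers who want to try the idea without loading RUTILS, here is a reduced sketch that relies only on standard CL. The names with* and bind-dispatch* are hypothetical and chosen to avoid clashing with the real operators; the ? and @ clauses are left out, so only the let, multiple-value-bind and destructuring-bind cases are covered.

```lisp
;; The generic function must exist at macroexpansion time,
;; hence the EVAL-WHEN wrapper.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defgeneric bind-dispatch* (arg1 arg2 &rest args)
    ;; (var val) => LET; (var1 var2 ... form) => MULTIPLE-VALUE-BIND
    (:method ((arg1 symbol) arg2 &rest args)
      (if args
          `(multiple-value-bind (,arg1 ,arg2 ,@(butlast args))
               ,(car (last args)))
          `(let ((,arg1 ,arg2)))))
    ;; (lambda-list form) => DESTRUCTURING-BIND
    (:method ((arg1 list) arg2 &rest args)
      (declare (ignore args))
      `(destructuring-bind ,arg1 ,arg2))))

(defmacro with* ((&rest bindings) &body body)
  ;; Wrap BODY in one binding form per clause, innermost clause first.
  (let ((rez body))
    (dolist (binding (reverse bindings))
      (setf rez (list (append (apply #'bind-dispatch* binding) rez))))
    (first rez)))

(with* ((a 1)
        (q r (floor 7 2))
        ((x y) (list 10 20)))
  (list a q r x y))
;; => (1 3 1 10 20)
```

Each clause becomes its own nested form, so later clauses can refer to variables bound by earlier ones, as with let*.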

Implementing simple generators on top of signals

One of my friends and a Lisp enthusiast, Valery Zamarayev, who's also a long-time Python user, once complained that the only thing he misses in CL from Python is generators. This feature is popular in many dynamic languages, such as Ruby or Perl, and even Java 8 has introduced something similar. Sure, there are multiple ways to implement lazy evaluation in Lisp, with many libraries for that, like SERIES, pygen or CLAZY. We don't have to wait for another version of the spec (especially since it's not coming 8-).

In RUTILS I have discovered, I believe, a novel and very clean way to implement generators: on top of the signal system. The signal or condition facility is, by the way, one of the most underappreciated assets of Common Lisp, and it often comes to the rescue in seemingly dead ends of control flow implementation. And Kent Pitman's description of it is one of my favorite reads in Computer Science. Anyway, here's all you need to implement Python-style generators in Lisp:

(define-condition generated ()
  ((item :initarg :item :reader generated-item)))

(defun yield (item)
  ;; Signal the item and offer a restart to continue from this point.
  (restart-case (signal 'generated :item item)
    (resume () item)))

(defmacro doing ((item generator-form &optional result) &body body)
  (with-gensyms (e)
    `(block nil
       (handler-bind ((generated (lambda (,e)
                                   (let ((,item (generated-item ,e)))
                                     ,@body
                                     (invoke-restart (find-restart 'resume))))))
         ,generator-form)
       ,result)))

The doing macro works just like dolist, but iterates over the generator form instead of an existing sequence. As you can see from this example, restarts are like generators in disguise. Or, to be more correct, they are a more general way to handle such functionality, and it takes just a thin layer of syntactic sugar to adapt them to a particular usage style.
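
Here is a small, self-contained usage sketch. It repeats the definitions with a plain gensym in place of with-gensyms so that it runs without RUTILS; count-up, naturals, collect-squares and first-n are hypothetical helpers invented for the example.

```lisp
(define-condition generated ()
  ((item :initarg :item :reader generated-item)))

(defun yield (item)
  (restart-case (signal 'generated :item item)
    (resume () item)))

(defmacro doing ((item generator-form &optional result) &body body)
  (let ((e (gensym "E")))
    `(block nil
       (handler-bind ((generated (lambda (,e)
                                   (let ((,item (generated-item ,e)))
                                     ,@body
                                     (invoke-restart (find-restart 'resume))))))
         ,generator-form)
       ,result)))

;; A finite generator: yields 0, 1, ..., n-1.
(defun count-up (n)
  (dotimes (i n) (yield i)))

;; An infinite generator: the consumer decides when to stop.
(defun naturals ()
  (loop for i from 0 do (yield i)))

(defun collect-squares (n)
  (let ((acc '()))
    (doing (x (count-up n) (nreverse acc))
      (push (* x x) acc))))

(defun first-n (n)
  (let ((acc '()))
    (doing (x (naturals))
      (push x acc)
      ;; RETURN leaves the BLOCK NIL set up by DOING and unwinds NATURALS.
      (when (= (length acc) n)
        (return (nreverse acc))))))

(collect-squares 4) ; => (0 1 4 9)
(first-n 3)         ; => (0 1 2)
```

Note that the generator runs on the same stack as its consumer: each yield calls the handler, and the resume restart hands control back to the generator.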

And a few mischiefs

We have seen three different approaches to extending CL in order to accommodate new popular syntactic constructs and approaches. Lastly, I wanted to tread a little in the "danger zone" that may be considered unconventional or plain bad style by many lispers: modifying syntax at the reader level. One thing that Clojure (following other dynamic languages before it) has, I believe, proven is the importance of shorthand literal notation for popular operations. The CL standard predates this understanding, although it has specific print representations for various important objects, and even a special syntax for static arrays. Yet, the language is really future-proof in this respect, because it provides a way to hook into the reader mechanism by modifying the readtables. This was further smoothed and packaged by the popular NAMED-READTABLES library, which allows treating readtables similarly to packages. In RUTILS I have defined several extended readtables that implement a few shortcuts that are used in literally every second function or macro I define in my code. These include:

  • a shorthand notation for zero-, one- or two-argument lambda functions: ^(+ % %%) expands into (lambda (% %%) (+ % %%))
  • a literal syntax for hash-tables: #h(equal "key" "val") will create an EQUAL hash-table with one key-value pair
  • a syntax for heredoc-strings: #/this quote (") shouldn't be escaped/# (which, unfortunately, doesn't always work smoothly in the repl)
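
To give a feel for what hooking into the reader involves, here is a minimal sketch of a #h-style literal built with standard CL only. It is not the RUTILS implementation: make-table, read-hash-literal and *sketch-readtable* are hypothetical names, and the syntax is installed in a private copy of the standard readtable rather than through NAMED-READTABLES.

```lisp
(defun make-table (test &rest kvs)
  ;; Build a hash-table from a flat list of alternating keys and values.
  (let ((table (make-hash-table :test test)))
    (loop for (k v) on kvs by #'cddr
          do (setf (gethash k table) v))
    table))

(defun read-hash-literal (stream subchar arg)
  ;; Reads the list that follows #h and turns (test k1 v1 ...)
  ;; into a MAKE-TABLE call.
  (declare (ignore subchar arg))
  (let ((spec (read stream t nil t)))
    `(make-table ',(first spec) ,@(rest spec))))

(defparameter *sketch-readtable*
  ;; A copy of the standard readtable, so the global one stays untouched.
  (let ((rt (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\h #'read-hash-literal rt)
    rt))

(defun read-with-sketch-syntax (string)
  (let ((*readtable* *sketch-readtable*))
    (read-from-string string)))

(gethash "key"
         (eval (read-with-sketch-syntax "#h(equal \"key\" \"val\")")))
;; => "val"
```

The reader function only returns a form; the table itself is created when that form is evaluated, so key and value expressions are evaluated as usual.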

Overall, I have experimented a lot with naming — it was sort of my obsession in this work to find short and obvious names for new things, many of which substitute the existing functionality, under the constraints of not altering what's already in the standard. For this sake, I've ventured into non-character symbols and even the keyword package — a major offence, I reckon... And here are a few of the findings I wanted to share (besides ? and with mentioned previously):

  • call is a new alias for funcall: I suppose, in the 70's it was a really fun experience to call a function, hence the name, but now it's too clumsy
  • get#, set#, and getset# are aliases and new operations for #-tables (when you can't or won't use ? for that)
  • finally, the grandest mischief is := (alongside :+, :-, :*, :/), which is an alias for setf (and, you've guessed it, incf etc). The justification for this is that everyone is confused about the -f, that setting a variable is a very important operation that we should immediately notice in our clean and functional code ;), and that := is a very familiar syntax for it even used by some languages, such as Pascal or golang. It may be controversial, but it's super-convenient.

The only thing I have failed to find a proper renaming for so far is mapcar. It is another one of those emblematic operations that should be familiar to everyone, yet -car creates confusion. For now, I resist the temptation to rename map into map-into and make map smarter by using the first sequence's type for the result expression. However, there's no plausible alternative variant I was able to find even among the zoo of other languages' names for this concept. Any thoughts?

PS. Those were a few prominent examples, but RUTILS, in fact, has much more to offer. A lot of stuff was borrowed from other utility projects, as well as implemented from scratch: anaphoric operators, the famous iter — a replacement for loop, Clojure-style threading macros, a new semantic pair data type to replace cons-cells, lots of utilities to work with the standard data structures (sequences, vectors, hash-tables, strings) making them truly first-class, iteration with explicit indices etc etc. With all that in the toolbox, there's now no ground to claim that Lisp is in any aspect inferior in terms of day-to-day UX compared to some other language, be it Haskell, Ruby or Clojure. Surely, I'm not talking about the semantic differences here.


European Lisp Symposium 2016

For the last two days, I've been at ELS2016. So far, it's been a great experience - I had actually forgotten the joy of being in one room with several dozen Lisp enthusiasts. The peculiarity of this particular event is that it's somewhere in the middle between a scientific conference, like ACL, which I've had a chance to attend in recent years thanks to my work at Grammarly, and a tech gathering: it employs the same peer-reviewed approach and scientific presentation style you will find at research conferences, but most of the topics are very applied and engineering-related.

Anyway, the program was really entertaining with several deep and insightful presentations (here are the proceedings). The highlights for me were the talk by Jim Newton on a type-checker for heterogeneous sequences based on the Lisp declare facility (that I'm growing more and more fond of) and a presentation by Kai Selgrad of an image-processing DSL that's an excellent example of the Lisp state-of-the-art approach in DSL design. Other things, like a description of the editor buffers protocol and a local variables preservation technique, were also quite insightful. And other good stuff is coming...

It's also great to hear new people bringing fresh ideas alongside old-timers sharing their wisdom and perspective - one of the things I appreciate in the Common Lisp community.

Near the end, I'm going to present a lightning talk about RUTILS and how I view it as a vehicle for evolving the Common Lisp user experience.