Programming Algorithms: Linked Lists

Linked data structures are in many ways the opposite of the contiguous ones that we have explored to some extent in the previous chapter using the example of arrays. In terms of complexity, they fail where those ones shine (first of all, at random access) — but prevail at scenarios when a repeated modification is necessary. In general, they are much more flexible and so allow the programmer to represent almost any kind of a data structure, although the ones that require such level of flexibility may not be too frequent. Usually, they are specialized trees or graphs.

The basic linked data structure is a singly-linked list.

Just like arrays, lists in Lisp may be created both with a literal syntax for constants and by calling a function — make-list — that creates a list of a certain size filled with nil elements. Besides, there's a handy list utility that is used to create lists with the specified content (the analog of vec).

CL-USER> '("hello" world 111)
("hello" WORLD 111)
CL-USER> (make-list 3)
CL-USER> (list "hello" 'world 111)
("hello" WORLD 111)

An empty list is represented as () and, interestingly, in Lisp, it is also a synonym of logical falsehood (nil). This property is used very often, and we'll have a chance to see that.

If we were to introduce our own lists, which may be quite a common scenario in case the built-in ones' capabilities do not suit us, we'd need to define the structure "node", and our list would be built as a chain of such nodes. We might have wanted to store the list head and, possibly, tail, as well as other properties like size. All in all, it would look like the following:

(defstruct list-cell

(defstruct our-own-list
  (head nil :type (or list-cell null))
  (tail nil :type (or list-cell null)))

CL-USER> (let ((tail (make-list-cell :data "world")))
            :head (make-list-cell
                   :data "hello"
                   :next tail)
            :tail tail))
            :DATA "hello"
            :NEXT #S(LIST-CELL :DATA "world" :NEXT NIL))

Lists as Sequences

Alongside arrays, list is the other basic data structure that implements the sequence abstract data type. Let's consider the complexity of basic sequence operations for linked lists:

  • so-called random access, i.e. access by index of a random element, requires O(n) time as we have to traverse all the preceding elements before we can reach the desired one (n/2 operations on average)
  • yet, once we have reached some element, removing it or inserting something after it takes O(1)
  • subsequencing is also O(n)

Getting the list length, in the basic case, is also O(n) i.e. it requires full list traversal. It is possible, though, to store list length as a separate slot, tracking each change on the fly, which means O(1) complexity. Lisp, however, implements the simplest variant of lists without size tracking. This is an example of a small but important decision that real-world programming is full of. Why is such a solution the right thing™, in this case? Adding the size counter to each list would have certainly made this common length operation more effective, but the cost of doing that would've included: increase in occupied storage space for all lists, a need to update size in all list modification operations, and, possibly, a need for a more complex cons cell implementation[1]. These considerations make the situation with lists almost opposite to arrays, for which size tracking is quite reasonable because they change much less often and not tracking the length historically proved to be a terrible security decision. So, what side to choose? A default approach is to prefer the solution which doesn't completely rule out the alternative strategy. If we were to choose a simple cons-cell sans size (what the authors of Lisp did) we'll always be able to add the "smart" list data structure with the size field, on top of it. Yet, stripping the size field from built-in lists won't be possible. Similar reasoning is also applicable to other questions, such as: why aren't lists, in Lisp, doubly-linked. Also, it helps that there's no security implication as lists aren't used as data exchange buffers, for which the problem manifests itself.

For demonstration, let's add the size field to our-own-list (and, meanwhile, consider all the functions that will need to update it...):

(defstruct our-own-list
  (head nil :type (or list-cell nil))
  (tail nil :type (or list-cell nil))
  (size 0 :type (integer 0)))

Given that obtaining the length of a list, in Lisp, is an expensive operation, a common pattern in programs that require multiple requests of the length field is to store its value in some variable at the beginning of the algorithm and then use this cached value, updating it if necessary.

As we see, lists are quite inefficient in random access scenarios. However, many sequences don't require random access and can satisfy all the requirements of a particular use case using just the sequential one. That's one of the reasons why they are called sequences, after all. And if we consider the special case of list operations at index 0 they are, obviously, efficient: both access and addition/removal is O(1). Also, if the algorithm requires a sequential scan, list traversal is rather efficient too, although not as good as array traversal for it still requires jumping over the memory pointers. There are numerous sequence operations that are based on sequential scans. The most common is map, which we analyzed in the previous chapter. It is the functional programming alternative to looping, a more high-level operation, and thus simpler to understand for the common cases, although less versatile.

map is a function that works with different types of built-in sequences. It takes as the first argument the target sequence type (if nil is supplied it won't create the resulting sequence and so will be used just for side-effects). Here is a polymorphic example involving lists and vectors:

CL-USER> (map 'vector '+
              '(1 2 3 4 5)
              #(1 2 3))
#(2 4 6)

map applies the function provided as its second argument (here, addition) sequentially to every element of the sequences that are supplied as other arguments, until one of them ends, and records the result in the output sequence. map would have been even more intuitive, if it just had used the type of the first argument for the result sequence, i.e. be a "do what I mean" dwim-map, while a separate advanced variant with result-type selection might have been used in the background. Unfortunately, the current standard scheme is not for change, but we can define our own wrapper function:

(defun dwim-map (fn seq &rest seqs)
  "A thin wrapper over MAP that uses the first SEQ's type for the result."
  (apply 'map (type-of seq) fn seqs))

map in Lisp is, historically, used for lists. So there's also a number of list-specific map variants that predated the generic map, in the earlier versions of the language, and are still in wide use today. These include mapcar, mapc, and mapcan (replaced in RUTILS by a safer flat-map). Now, let's see a couple of examples of using mapping. Suppose that we'd like to extract odd numbers from a list of numbers. Using mapcar as a list-specific map we might try to call it with an anonymous function that tests its argument for oddity and keeps them in such case:

CL-USER> (mapcar (lambda (x) (when (oddp x) x))
                 (range 1 10))
(1 NIL 3 NIL 5 NIL 7 NIL 9)

However, the problem is that non-odd numbers still have their place reserved in the result list, although it is not filled by them. Keeping only the results that satisfy (or don't) certain criteria and discarding the others is a very common pattern that is known as "filtering". There's a set of Lisp functions for such scenarios: remove, remove-if, and remove-if-not, as well as RUTILS' complements to them keep-if and keep-if-not. We can achieve the desired result adding remove to the picture:

CL-USER> (remove nil (mapcar (lambda (x) (when (oddp x) x))
                             (range 1 10)))
(1 3 5 7 9)

A more elegant solution will use the remove-if(-not) or keep-if(-not) variants. remove-if-not is the most popular among these functions. It takes a predicate and a sequence and returns the sequence of the same type holding only the elements that satisfy the predicate:

CL-USER> (remove-if-not 'oddp (range 1 10))
(1 3 5 7 9)

Using such high-level mapping functions is very convenient, which is why there's a number of other -if(-not) operations, like find(-if(-not)), member(-if(-not)), position(-if(-not)), etc.

The implementation of mapcar or any other list mapping function, including your own task-specific variants, follows the same pattern of traversing the list accumulating the result into another list and reversing it, in the end:

(defun simple-mapcar (fn list)
  (let ((rez ()))
    (dolist (item list)
      (:= rez (cons (call fn item) rez)))
    (reverse rez)))

The function cons is used to add an item to the beginning of the list. It creates a new list head that points to the previous list as its tail.

From the complexity point of view, if we compare such iteration with looping over an array we'll see that it is also a linear traversal that requires twice as many operations as with arrays because we need to traverse the result fully once again, in the end, to reverse it. Its advantage, though, is higher versatility: if we don't know the size of the resulting sequence (for example, in the case of remove-if-not) we don't have to change anything in this scheme and just add a filter line ((when (oddp item) ...), while for arrays we'd either need to use a dynamic array (that will need constant resizing and so have at least the same double number of operations) or pre-allocate the full-sized result sequence and then downsize it to fit the actual accumulated number of elements, which may be problematic when we deal with large arrays.

Lists as Functional Data Structures

The distinction between arrays and linked lists in many ways reflects the distinction between the imperative and functional programming paradigms. Within the imperative or, in this context, procedural approach, the program is built out of low-level blocks (conditionals, loops, and sequentials) that allow for the most fine-tuned and efficient implementation, at the expense of abstraction level and modularization capabilities. It also heavily utilizes in-place modification and manual resource management to keep overhead at a minimum. An array is the most suitable data-structure for such a way of programming. Functional programming, on the contrary, strives to bring the abstraction level higher, which may come at a cost of sacrificing efficiency (only when necessary, and, ideally, only for non-critical parts). Functional programs are built by combining referentially transparent computational procedures (aka "pure functions") that operate on more advanced data structures (either persistent ones or having special access semantics, e.g. transactional) that are also more expensive to manage but provide additional benefits.

Singly-linked lists are a simple example of functional data structures. A functional or persistent data structure is the one that doesn't allow in-place modification. In other words, to alter the contents of the structure a fresh copy with the desired changes should be created. The flexibility of linked data structures makes them suitable for serving as functional ones. We have seen the cons operation that is one of the earliest examples of non-destructive, i.e. functional, modification. This action prepends an element to the head of a list, and as we're dealing with the singly-linked list the original doesn't have to be updated: a new cons cell is added in front of it with its next pointer referencing the original list that becomes the new tail. This way, we can preserve both the pointer to the original head and add a new head. Such an approach is the basis for most of the functional data structures: the functional trees, for example, add a new head and a new route from the head to the newly added element, adding new nodes along the way — according to the same principle.

It is interesting, though, that lists can be used in destructive and non-destructive fashion likewise. There are both low- and high-level functions in Lisp that perform list modification, and their existence is justified by the use cases in many algorithms. Purely functional lists render many of the efficient list algorithms useless. One of the high-level list modification function is nconc. It concatenates two lists together updating in the process the next pointer of the last cons cell of the first list:

CL-USER> (let ((l1 (list 1 2 3))
               (l2 (list 4 5 6)))
           (nconc l1 l2)  ; note no assignment to l1
           l1)            ; but it is still changed
(1 2 3 4 5 6)

There's a functional variant of this operation, append, and, in general, it is considered distasteful to use nconc for two reasons:

  • the risk of unwarranted modification
  • funny enough, the implementation of nconc, actually, isn't mandated to be more efficient than that of append

So, forget nconc, append all the lists!

Using append we'll need to modify the previous piece of code because otherwise the newly created list will be garbage-collected immediately:

CL-USER> (let ((l1 (list 1 2 3))
               (l2 (list 4 5 6)))
           (:= l1 (append l1 l2))
(1 2 3 4 5 6)

The low-level list modification operations are rplaca and rplacd. They can be combined with list-specific accessors nth and nthcdr that provide indexed access to list elements and tails respectively. Here's, for example, how to add an element in the middle of a list:

CL-USER> (let ((l1 (list 1 2 3)))
           (rplacd (nthcdr 0 l1)
                   (cons 4 (nthcdr 1 l1)))
(1 4 2 3)

Just to re-iterate, although functional list operations are the default choice, for efficient implementation of some algorithms, you'll need to resort to the ugly destructive ones.

Different Kinds of Lists

We have, thus far, seen the most basic linked list variant — a singly-linked one. It has a number of limitations: for instance, it's impossible to traverse it from the end to the beginning. Yet, there are many algorithms that require accessing the list from both sides or do other things with it that are inefficient or even impossible with the singly-linked one, hence other, more advanced, list variants exist.

But first, let's consider an interesting tweak to the regular singly-linked list — a circular list. It can be created from the normal one by making the last cons cell point to the first. It may seem like a problematic data structure to work with, but all the potential issues with infinite looping while traversing it are solved if we keep a pointer to any node and stop iteration when we encounter this node for the second time. What's the use for such structure? Well, not so many, but there's a prominent one: the ring buffer. A ring or circular buffer is a structure that can hold a predefined number of items and each item is added to the next slot of the current item. This way, when the buffer is completely filled it will wrap around to the first element, which will be overwritten at the next modification. By our buffer-filling algorithm, the element to be overwritten is the one that was written the earliest for the current item set. Using a circular linked list is one of the simplest ways to implement such a buffer. Another approach would be to use an array of a certain size moving the pointer to the next item by incrementing an index into the array. Obviously, when the index reaches array size it should be reset to zero.

A more advanced list variant is a doubly-linked one, in which all the elements have both the next and previous pointers. The following definition, using inheritance, extends our original list-cell with a pointer to the previous element. Thanks to the basic object-oriented capabilities of structs, it will work with the current definition of our-own-list as well, and allow it to function as a doubly-linked list.

(defstruct (list-cell2 (:include list-cell))

Yet, we still haven't shown the implementation of the higher-level operations of adding and removing an element to/from our-own-list. Obviously, they will differ for singly- and doubly-linked lists, and that distinction will require us to differentiate the doubly-linked list types. That, in turn, will demand invocation of a rather heavy OO-machinery, which is beyond the subject of this book. Instead, for now, let's just examine the basic list addition function, for the doubly-linked list:

(defun our-cons2 (data list)
  (when (null list) (:= list (make-our-own-list)))
  (let ((new-head (make-list-cell2
                   :data data 
                   :next (when list @list.head))))
    (:= @list.head.prev new-head)
     :head new-head
     :tail @list.tail
     :size (1+ @list.size))))

The first thing to note is the use of the @ syntactic sugar, from RUTILS, that implements the mainstream dot notation for slot-value access (i.e. @list.head.prev refers to the prev field of the head field of the provided list structure of the assumed our-own-list type, which in a more classically Lispy, although cumbersome, variants may look like one of the following: (our-cons2-prev (our-own-list-head list)) or (slot-value (slot-value list 'head) 'prev)[2]).

More important here is that, unlike for the singly-linked list, this function requires an in-place modification of the head element of the original list: setting its prev pointer. Immediately making doubly-linked lists non-persistent.

Finally, the first line is the protection against trying to access the null list (that will result in a much-feared, especially in Java-land, null-pointer exception class of error).

At first sight, it may seem that doubly-linked lists are more useful than singly-linked ones. But they also have higher overhead so, in practice, they are used quite sporadically. We may see just a couple of use cases on the pages of this book. One of them is presented in the next part — a double-ended queue.

Besides doubly-linked, there are also association lists that serve as a variant of key-value data structures. At least 3 types may be found in Common Lisp code, and we'll briefly discuss them in the chapter on key-value structures. Finally, a skip list is a probabilistic data structure based on singly-linked lists, that allows for faster search, which we'll also discuss in a separate chapter on probabilistic structures. Other more esoteric list variants, such as self-organized list and XOR-list, may also be found in the literature — but very rarely, in practice.


The flexibility of lists allows them to serve as a common choice for implementing a number of popular abstract data structures.


A queue or FIFO has the following interface:

  • enqueue an item at the end
  • dequeue the first element: get it and remove it from the queue

It imposes a first-in-first-out (FIFO) ordering on the elements. A queue can be implemented directly with a singly-linked list like our-own-list. Obviously, it can also be built on top of a dynamic array but will require permanent expansion and contraction of the collection, which, as we already know, isn't the preferred scenario for their usage.

There are numerous uses for the queue structures for processing items in a certain order (some of which we'll see in further chapters of this book).


A stack or LIFO (last-in-first-out) is even simpler than a queue, and it is used even more widely. Its interface is:

  • push an item on top of the stack making it the first element
  • pop an item from the top: get it and remove it from the stack

A simple Lisp list can serve as a stack, and you can see such uses in almost every file with Lisp code. The most common pattern is result accumulation during iteration: using the stack interface, we can rewrite simple-mapcar in an even simpler way (which is idiomatic Lisp):

(defun simple-mapcar (fn list)
  (let ((rez ()))
    (dolist (item list)
      (push (call fn item) rez))
    (reverse rez)))

Stacks hold elements in reverse-chronological order and can thus be used to keep the history of changes to be able to undo them. This feature is used in procedure calling conventions by the compilers: there exists a separate segment of program memory called the Stack segment, and when a function call happens (beginning from the program's entry point called the main function in C) all of its arguments and local variables are put on this stack as well as the return address in the program code segment where the call was initiated. Such an approach allows for the existence of local variables that last only for the duration of the call and are referenced relative to the current stack head and not bound to some absolute position in memory like the global ones. After the procedure call returns, the stack is "unwound" and all the local data is forgotten returning the context to the same state in which it was before the call. Such stack-based history-keeping is a very common and useful pattern that may be utilized in userland code likewise.

Lisp itself also uses this trick to implement global variables with a capability to have context-dependent values through the extent of let blocks: each such variable also has a stack of values associated with it. This is one of the most underappreciated features of the Lisp language used quite often by experienced lispers. Here is a small example with a standard global variable (they are called special in Lisp parlance due to this special property) *standard-output* that stores a reference to the current output stream:

CL-USER> (print 1)
CL-USER> (let ((*standard-output* (make-broadcast-stream)))
           (print 1))

In the first call to print, we see both the printed value and the returned one, while in the second — only the return value of the print function, while it's output is sent, effectively, to /dev/null.

Stacks can be also used to implement queues. We'll need two of them to do that: one will be used for enqueuing the items and another — for dequeuing. Here's the implementation:

(defstruct queue

(defun enqueue (item queue)
  (push item @queue.head))

(defun dequeue (queue)
  ;; Here and in the next condition, we use the property that an empty list
  ;; is also logically false. This is discouraged by many Lisp style-guides,
  ;; but, in many cases, such code is not only more compact but also more clear.
  (unless @queue.tail
    (do ()
        ((null @queue.head))  ; this loop continues until head becomes empty
      (push (pop @queue.head) @queue.tail)))
      ;; By pushing all the items from the head to the tail we reverse
      ;; their order — this is the second reversing that cancels the reversing
      ;; performed when we push the items to the head and it restores the original order.
  (when @queue.tail
    (values (pop @queue.tail)
            t)))  ; this second value is used to indicate that the queue was not empty
CL-USER> (let ((q (make-queue)))
           (print q)
           (enqueue 1 q)
           (enqueue 2 q)
           (enqueue 3 q)
           (print q)
           (print q)
           (dequeue q)
           (print q)
           (enqueue 4 q)
           (print q)
           (dequeue q)
           (print q)
           (dequeue q)
           (print q)
           (dequeue q)
           (print q)
           (dequeue q))
#S(QUEUE :HEAD (3 2 1) :TAIL NIL) 
#S(QUEUE :HEAD (3 2 1) :TAIL NIL) 
#S(QUEUE :HEAD (4) :TAIL (2 3)) 
#S(QUEUE :HEAD (4) :TAIL (3)) 
NIL  ; no second value indicates that the queue is now empty

Such queue implementation still has O(1) operation times for enqueue/dequeue. Each element will experience exactly 4 operations: 2 pushs and 2 pops (for the head and tail).

Another stack-based structure is the stack with a minimum element, i.e. some structure that not only holds elements in LIFO order but also keeps track of the minimum among them. The challenge is that if we just add the min slot that holds the current minimum, when this minimum is popped out of the stack we'll need to examine all the remaining elements to find the new minimum. We can avoid this additional work by adding another stack — a stack of minimums. Now, each push and pop operation requires us to also check the head of this second stack and, in case the added/removed element is the minimum, push it to the stack of minimums or pop it from there, accordingly.

A well-known algorithm that illustrates stack usage is fully-parenthesized arithmetic expressions evaluation:

(defun arith-eval (expr)
  "EXPR is a list of symbols that may include:
   square brackets, arithmetic operations, and numbers."
  (let ((ops ())
        (vals ())
        (op nil))
    (dolist (item expr)
      (case item
        ([ ) ; do nothing
        ((+ - * /) (push item ops))
        (] (:= op (pop ops)
               val (pop vals))
           (case op
             (+ (:+ val (pop vals)))
             (- (:- val (pop vals)))
             (* (:* val (pop vals)))
             (/ (:/ val (pop vals))))
           (push val vals))
        (otherwise (push item vals))))
    (pop vals)))

CL-USER> (arith-eval '([ 1 + [ [ 2 + 3 ] * [ 4 * 5 ] ] ] ]))


A deque is a short name for a double-ended queue, which can be traversed in both orders: FIFO and LIFO. It has 4 operations: push-front and push-back (also called shift), pop-front and pop-back (unshift). This structure may be implemented with a doubly-linked list or likewise a simple queue with 2 stacks. The difference for the 2-stacks implementation is that now items may be pushed back and forth between head and tail depending on the direction we're popping from, which results in worst-case linear complexity of such operations: when there's constant alteration of front and back directions.

The use case for such structure is the algorithm that utilizes both direct and reverse ordering: a classic example being job-stealing algorithms, when the main worker is processing the queue from the front, while other workers, when idle, may steal the lowest priority items from the back (to minimize the chance of a conflict for the same job).

Stacks in Action: SAX Parsing

Custom XML parsing is a common task for those who deal with different datasets, as many of them come in XML form, for example, Wikipedia and other Wikidata resources. There are two main approaches to XML parsing:

  • DOM parsing reads the whole document and creates its tree representation in memory. This technique is handy for small documents, but, for huge ones, such as the dump of Wikipedia, it will quickly fill all available memory. Also, dealing with the deep tree structure, if you want to extract only some specific pieces from it, is not very convenient.
  • SAX parsing is an alternative variant that uses the stream approach. The parser reads the document and, upon completing the processing of a particular part, invokes the relevant callback: what to do when an open tag is read, when a closed one, and with the contents of the current element. These actions happen for each tag, and we can think of the whole process as traversing the document tree utilizing the so-called "visitor pattern": when visiting each node we have a chance to react after the beginning, in the middle, and in the end.

Once you get used to SAX parsing, due to its simplicity, it becomes a tool of choice for processing XML, as well as JSON and other formats that allow for a similar stream parsing approach. Often the simplest parsing pattern is enough: remember the tag we're looking at, and when it matches a set of interesting tags, process its contents. However, sometimes, we need to make decisions based on the broader context. For example, let's say, we have the text marked up into paragraphs, which are split into sentences, which are, in turn, tokenized. To process such a three-level structure, with SAX parsing, we could use the following outline (utilizing CXML library primitives):

(defclass text-sax (sax:sax-parser-mixin)
  ((parags :initform nil :accessor sax-parags)
   (parag :initform nil :accessor sax-parag)
   (sent :initform nil :accessor sax-sent)
   (tag :initform nil :accessor sax-tag)))

(defmethod sax:start-element ((sax text-sax)
                              namespace-uri local-name qname attrs)
  (declare (ignore namespace-uri qname attrs))
  (:= (sax-tag sax) (mkeyw local-name))

(defmethod sax:end-element ((sax text-sax)
                            namespace-uri local-name qname)
  (declare (ignore namespace-uri qname))
  (with-slots (tag parags sent) sax
    (case tag
      (:paragraph (push (reverse parag) parags)
                  (:= parag nil))
      (:sentence (push (reverse sent) parag)
                 (:= sent nil)))))

(defmethod sax:characters ((sax text-sax) text)
  (when (eql :token (sax-tag sax))
    (push text (sax-sent sax)))

(defmethod sax:end-document ((sax text-sax))
  (reverse (sax-parags sax)))

This code will return the accumulated structure of paragraphs from the sax:end-document method. And two stacks: the current sentence and the current paragraph are used to accumulate intermediate data while parsing. In a similar fashion, another stack of encountered tags might have been used to exactly track our position in the document tree if there were such necessity. Overall, the more you'll be using SAX parsing, the more you'll realize that stacks are enough to address 99% of the arising challenges.

Lists as Sets

Another very important abstract data structure is a Set. It is a collection that holds each element only once no matter how many times we add it there. This structure may be used in a variety of cases: when we need to track the items we have already seen and processed, when we want to calculate some relations between groups of elements,s and so forth.

Basically, its interface consists of set-theoretic operations:

  • add/remove an item
  • check whether an item is in the set
  • check whether a set is a subset of another set
  • union, intersection, difference, etc.

Sets have an interesting aspect that an efficient implementation of element-wise operations (add/remove/member) and set-wise (union/...) require the use of different concrete data-structures, so a choice should be made depending on the main use case. One way to implement sets is by using linked lists. Lisp has standard library support for this with the following functions:

  • adjoin to add an item to the list if it's not already there
  • member to check for item presence in the set
  • subsetp for subset relationship query
  • union, intersection, set-difference, and set-exclusive-or for set operations

This approach works well for small sets (up to tens of elements), but it is rather inefficient, in general. Adding an item to the set or checking for membership will require O(n) operations, while, in the hash-set (that we'll discuss in the chapter on key-value structures), these are O(1) operations. A naive implementation of union and other set-theoretic operations will require O(n^2) as we'll have to compare each element from one set with each one from the other. However, if our set lists are in sorted order set-theoretic operations can be implemented efficiently in just O(n) where n is the total number of elements in all sets, by performing a single linear scan over each set in parallel. Using a hash-set will also result in the same complexity.

Here is a simplified implementation of union for sets of numbers built on sorted lists:

(defun sorted-union (s1 s2)
  (let ((rez ()))
    (do ()
        ((and (null s1) (null s2)))
      (let ((i1 (first s1))
            (i2 (first s2)))
        (cond ((null i1) (dolist (i2 s2)
                           (push i2 rez))
              ((null i2) (dolist (i1 s1)
                           (push i1 rez))
              ((= i1 i2) (push i1 rez)
                         (:= s1 (rest s1)
                             s2 (rest s2)))
              ((< i1 i2) (push i1 rez)
                         (:= s1 (rest s1)))
              ;; just T may be used instead
              ;; of the following condition
              ((> i1 i2) (push i2 rez)
                         (:= s2 (rest s2))))))
    (reverse rez)))

CL-USER> (sorted-union '(1 2 3)
                       '(0 1 5 6))
(0 1 2 3 5 6)

This approach may be useful even for unsorted list-based sets as sorting is a merely O(n * log n) operation. Even better though, when the use case requires primarily set-theoretic operations on our sets and the number of changes/membership queries is comparatively low, the most efficient technique may be to keep the lists sorted at all times.

Merge Sort

Speaking about sorting, the algorithms we discussed for array sorting in the previous chapter do not work as efficient for lists for they are based on swap operations, which are O(n), in the list case. Thus, another approach is required, and there exist a number of efficient list sorting algorithms, the most prominent of which is Merge sort. It works by splitting the list into two equal parts until we get to trivial one-element lists and then merging the sorted lists into the bigger sorted ones. The merging procedure for sorted lists is efficient as we've seen in the previous example. A nice feature of such an approach is its stability, i.e. preservation of the original order of the equal elements, given the proper implementation of the merge procedure.

(defun merge-sort (list comp)
  (if (null (rest list))
      (let ((half (floor (length list) 2)))
        (merge-lists (merge-sort (subseq seq 0 half) comp)
                     (merge-sort (subseq seq half) comp)

(defun merge-lists (l1 l2 comp)
  (let ((rez ())
    (do ()
        ((and (null l1) (null l2)))
      (let ((i1 (first l1))
            (i2 (first l2)))
        (cond ((null i1) (dolist (i l2)
                           (push i rez))
              ((null i2) (dolist (i l1)
                           (push i rez))
              ((call comp i1 i2) (push i1 rez)
                                 (:= l1 (rest l1)))
              (t (push i2 rez)
                 (:= l2 (rest l2))))))
    (reverse rez)))

The same complexity analysis as for binary search applies to this algorithm. At each level of the recursion tree, we perform O(n) operations: each element is pushed into the resulting list once, reversed once, and there are at most 4 comparison operations: 3 null checks and 1 call of the comp function. We also need to perform one copy per element in the subseq operation and take the length of the list (although it can be memorized and passed down as the function call argument) on the recursive descent. This totals to not more than 10 operations per element, which is a constant. And the height of the tree is, as we already know, (log n 2). So, the total complexity is O(n * log n).

Let's now measure the real time needed for such sorting, and let's compare it to the time of prod-sort (with optimal array accessors) from the Arrays chapter:

CL-USER> (with ((lst (random-list 10000))
                (vec (make-array 10000 :initial-contents lst)))
           (print-sort-timings "Prod" 'prod-sort vec)
           (print-sort-timings "Merge " 'merge-sort lst))
= Prodsort of random vector =
Evaluation took:
  0.048 seconds of real time
= Prodsort of sorted vector =
Evaluation took:
  0.032 seconds of real time
= Prodsort of reverse sorted vector =
Evaluation took:
  0.044 seconds of real time
= Merge sort of random vector =
Evaluation took:
  0.007 seconds of real time
= Merge sort of sorted vector =
Evaluation took:
  0.007 seconds of real time
= Merge sort of reverse sorted vector =
Evaluation took:
  0.008 seconds of real time

Interestingly enough, Merge sort turned out to be around 5 times faster, although it seems that the number of operations required at each level of recursion is at least 2-3 times bigger than for quicksort. Why we got such result is left as an exercise to the reader: I'd start from profiling the function calls and looking where most of the time is wasted...

It should be apparent that the merge-lists procedure works in a similar way to set-theoretic operations on sorted lists that we've discussed in the previous part. It is, in fact, provided in the Lisp standard library. Using the standard merge, Merge sort may be written in a completely functional and also generic way to support any kind of sequences:

(defun merge-sort (seq comp)
  (if (or (null seq)  ; avoid expensive length calculation for lists
          (<= (length seq) 1))
      (let ((half (floor (length seq) 2)))
        (merge (type-of seq)
               (merge-sort (subseq seq 0 half) comp)
               (merge-sort (subseq seq half) comp)

There's still one substantial difference of Merge sort from the array sorting functions: it is not in-place. So it also requires the O(n * log n) additional space to hold the half sublists that are produced at each iteration. Sorting and merging them in-place is not possible. There are ways to somewhat reduce this extra space usage but not totally eliminate it.

Parallelization of Merge Sort

The extra-space drawback of Merge sort may, however, turn irrelevant if we consider the problem of parallelizing this procedure. The general idea of parallelized implementation of any algorithm is to split the work in a way that allows reducing the runtime proportional to the number of workers performing those jobs. In the ideal case, if we have m workers and are able to spread the work evenly the running time should be reduced by a factor of m. For the Merge sort, it will mean just O(n/m * log n). Such ideal reduction is not always achievable, though, because often there are bottlenecks in the algorithm that require all or some workers to wait for one of them to complete its job.

Here's a trivial parallel Merge sort implementation that uses the eager-future2 library, which adds high-level data parallelism capabilities based on the Lisp implementation's multithreading facilities:

(defun parallel-merge-sort (seq comp)
  (if (or (null seq) (<= (length seq) 1))
      (with ((half (floor (length seq) 2))
             (thread1 (eager-future2:pexec
                       (merge-sort (subseq seq 0 half) comp)))
             (thread2 (eager-future2:pexec
                       (merge-sort (subseq seq half) comp))))
        (merge (type-of seq)
               (eager-future2:yield thread1)
               (eager-future2:yield thread2)               

The eager-future2:pexec procedure submits each merge-sort to the thread pool that manages multiple CPU threads available in the system and continues program execution not waiting for it to return. While eager-future2:yield pauses execution until the thread performing the appropriate merge-sort returns.

When I ran our testing function with both serial and parallel merge sorts on my machine, with 4 CPUs, I got the following result:

CL-USER> (with ((lst1 (random-list 10000))
                (lst2 (copy-list lst1)))
           (print-sort-timings "Merge " 'merge-sort lst1)
           (print-sort-timings "Parallel Merge " 'parallel-merge-sort lst2))
= Merge sort of random vector =
Evaluation took:
  0.007 seconds of real time
  114.29% CPU
= Merge sort of sorted vector =
Evaluation took:
  0.006 seconds of real time
  116.67% CPU
= Merge sort of reverse sorted vector =
Evaluation took:
  0.007 seconds of real time
  114.29% CPU
= Parallel Merge sort of random vector =
Evaluation took:
  0.003 seconds of real time
  266.67% CPU
= Parallel Merge sort of sorted vector =
Evaluation took:
  0.003 seconds of real time
  266.67% CPU
= Parallel Merge sort of reverse sorted vector =
Evaluation took:
  0.005 seconds of real time
  220.00% CPU

A speedup of approximately 2x, which is also reflected by the rise in CPU utilization from around 100% (i.e. 1 CPU) to 250%. These are correct numbers as the merge procedure is still executed serially and remains the bottleneck. There are more sophisticated ways to achieve optimal m times speedup, in Merge sort parallelization, but we won't discuss them here due to their complexity.

Lists and Lisp

Historically, Lisp's name originated as an abbreviation of "List Processing", which points both to the significance that lists played in the language's early development and also to the fact that flexibility (a major feature of lists) was always a cornerstone of its design. Why are lists important to Lisp? Maybe, originally, it was connected with the availability and the good support of this data structure in the language itself. But, quickly, the focus shifted to the fact that, unlike other languages, Lisp code is input in the compiler not in a custom string-based format but in the form of nested lists that directly represent the syntax tree. Coupled with superior support for the list data structure, it opens numerous possibilities for programmatic processing of the code itself, which are manifest in the macro system, code walkers and generators, etc. So, "List Processing" turns out to be not about lists of data, but about lists of code, which perfectly describes the main distinctive feature of this language...


[1] While, in the Lisp machines, cons cells even had special hardware support, and such change would have made it useless.

[2] Although, for structs, it is implementation-dependent if this will work. In the major implementations, it will.

No comments: