2020-03-30

Programming Algorithms: Synchronization

This is a snippet of the "Synchronization" chapter of the book.

This is the final chapter of the book, in which we will discuss optimization of parallel computations: whether concurrently on a single machine in a shared-memory setting or in a distributed shared-nothing environment. This is a huge topic that spans synchronization itself, parallelization, concurrency, distributed computations, and the functional approach, and every senior software developer should be well-versed in it.

Usually, synchronization is studied in the context of systems or distributed programming, but it has a significant algorithmic footprint and is also one of the hottest topics of new algorithm research. In fact, there are whole books that concentrate on it, although they usually attack the problem from other angles, not focusing on the algorithmic part. This chapter will be more algorithm-centered, while also presenting an overview of the problem space, so that, in the end, you'll have a good foundation to explore the topic further if the desire or need arises.

Let's start from the basics. In the previous chapters of the book, we mainly viewed algorithms as single computations running without external interruptions. This approach, obviously, removes unnecessary complexity, but it also isn't totally faithful to reality. Most of the programs we deal with now run in multiprocessing environments (sometimes even distributed ones), and even when they don't utilize these capabilities, those capabilities are still available, they sometimes have their impact, and, besides, the programs might have performed better had those capabilities been utilized. The majority of backend software, which currently consists of services running in datacenters, is multithreaded. There's the notorious "Zawinski's Law" that states: "Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can." Besides being a good joke, it also reflects an important truth about the tendency of all programs to become, over time, network-aware and thus distributed to at least some extent.

There are two principally different types of environments in which the programs that need synchronization run: shared-memory and shared-nothing ones.

In a shared-memory setting, there exists some shared storage (not necessarily RAM) that can be directly accessed by all the threads of the application. Concurrent access to data in this shared memory is the principal source of the synchronization challenges, although not the only one. An example of a shared-memory program is a normal application that uses multithreading provided either directly by the OS or, more frequently, by the language runtime.
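To make this concrete, here is a minimal sketch of shared-memory coordination (written in Go purely for illustration; the names are made up for this example): several threads increment a common counter, and a mutex serializes their access to it.

    package main

    import (
        "fmt"
        "sync"
    )

    // counter is shared by all goroutines; the mutex guards every
    // access, so concurrent increments cannot interleave mid-update.
    type counter struct {
        mu sync.Mutex
        n  int
    }

    func (c *counter) inc() {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.n++
    }

    func main() {
        var c counter
        var wg sync.WaitGroup
        for i := 0; i < 1000; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.inc()
            }()
        }
        wg.Wait()
        fmt.Println(c.n) // always prints 1000 thanks to the lock
    }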

The opposite of shared-memory is a shared-nothing environment, in which the threads don't have any common data storage and can coordinate only by sending messages directly to one another. The contents of the messages have to be copied from the memory of the sender to that of the receiver. In this setting, some of the synchronization problems disappear, but others remain: at the fundamental level, some synchronization or coordination still needs to happen. From a performance standpoint, however, the shared-nothing mode is usually inferior due to the need for additional data copying. So, both paradigms have their place, and the choice of which one to utilize depends on the context of a particular task.
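The same toy computation in a shared-nothing style might look like the following sketch (again in Go; its channels model message passing within one process, while a truly distributed shared-nothing system would send the messages over a network): the worker owns its data exclusively, and every value it receives arrives as a copied message.

    package main

    import "fmt"

    // worker owns its state exclusively; the only way to influence
    // it is to send a message, whose contents are copied on the way.
    func worker(jobs <-chan int, results chan<- int) {
        for j := range jobs {
            results <- j * j
        }
        close(results)
    }

    func main() {
        jobs := make(chan int)
        results := make(chan int)
        go worker(jobs, results)
        go func() {
            for i := 1; i <= 3; i++ {
                jobs <- i // the value is copied into the channel
            }
            close(jobs)
        }()
        for r := range results {
            fmt.Println(r) // 1, 4, 9
        }
    }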

The main goal of synchronization is ensuring program correctness when multiple computations are running in parallel. The other side of the coin is achieving optimal performance, which is also addressed by parallelization, a topic we touched upon in a couple of prior chapters. Prioritizing performance over correctness, although tempting, is one of the primary sources of bugs in concurrent systems. The trivial example would be building a shared-memory program without explicit use of any synchronization mechanisms. It is, definitely, the most performant approach, but non-coordinated access to the shared data will inevitably result in failures like data corruption.
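Here is what such a failure looks like, in a sketch that deliberately omits the lock from the earlier shared-memory example (again, Go is used for illustration only): the unsynchronized increments race with one another, and some updates are lost.

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        n := 0
        var wg sync.WaitGroup
        for i := 0; i < 1000; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                n++ // unsynchronized read-modify-write: a data race
            }()
        }
        wg.Wait()
        // May print less than 1000: concurrent increments overwrite
        // each other; running with `go run -race` flags the race.
        fmt.Println(n)
    }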

More details about the book may be found on its website.
