2019-08-30

Programming Algorithms: Key-Values

This is a snippet of the "Key-Values" chapter of the book.

To conclude the description of essential data structures, we need to discuss key-values (kvs), which are the broadest family of structures one can imagine. Unlike arrays and lists, kvs are not concrete structures. In fact, they span, at least in some capacity, all of the popular concrete ones, as well as some obscure.

The main feature of kvs is efficient access to the values by some kind of keys that they are associated with. In other words, each element of such data structure is a key-value pair that can be easily retrieved if we know the key, and, on the other hand, if we ask for the key that is not in the structure, the null result is also returned efficiently. By "efficiently", we usually mean O(1) or, at least, something sublinear (like O(log n)), although, for some cases, even O(n) retrieval time may be acceptable. See how broad this is! So, a lot of different structures may play the role of key-values.

By the way, there isn't even a single widely-adopted name for such structures. Besides key-values — which isn't such a popular term (I derived it from key-value stores) — in different languages, they are called maps, dictionaries, associative arrays, tables, objects and so on.

In a sense, these are the most basic and essential data structures. They are so essential that some dynamic languages — for example, Lua, explicitly, and JavaScript, without a lot of advertisement — rely on them as the core (sometimes sole) language's data structure. Moreover, key-values are used almost everywhere. Below is a list of some of the most popular scenarios:

  • implementation of the object system in programming languages
  • most of the key-value stores are, for the most part, glorified key-value structures
  • internal tables in the operating system (running process table or file descriptor tables in the Linux kernel), programming language environment or application software
  • all kinds of memoization and caching
  • efficient implementation of sets
  • ad hoc or predefined records for returning aggregated data from function calls
  • representing various dictionaries (in language processing and beyond)

Considering such a wide spread, it may be surprising that, historically, the programming language community only gradually realized the usefulness of key-values. For instance, such languages as C and C++ don't have the built-in support for general kvs (if we don't count structs and arrays, which may be considered significantly limited versions). Lisp, on the contrary, was to some extent pioneering their recognition with the concepts of alists and plists, as well as being one of the first languages to have hash-table support in the standard.

More details about of the book may be found on its website.

No comments: