Chapter 7: Abstracting Concurrency
Up until this point in the book, we explicitly modelled concurrency issues in our code. With promises, we synchronized two or more asynchronous actions. With generators, we created data on-the-fly, avoiding unnecessary memory allocations. Finally, we learned that web workers are the workhorses that leverage multiple CPU cores.
In this chapter, we will take all these ideas and put them into the context of application code. That is, if concurrency is the default, then we will need to make concurrency as unobtrusive as possible. We'll start by exploring various techniques that will help us encapsulate concurrency mechanisms within the components that we use. Then, we will move straight to improving our code from the previous two chapters by using promises to facilitate worker communication.
Once we're able to abstract worker communication using promises, we'll look at implementing lazy workers with the help of generators. We'll also cover worker abstraction using the Parallel.js library, followed by the concept of worker pools.
Writing concurrent code
Concurrent programming is hard to get right. Even with contrived example applications, the bulk of complexity comes from concurrent code. We obviously want our code to be readable while keeping the benefits of concurrency. We want to get the most out of each CPU on the system. We only want to compute what we need, when we need it. We don't want spaghetti code that joins together several asynchronous operations. Focusing on all these aspects of concurrent programming while developing applications detracts from what we should really be focusing on—the features that give our application value.
In this section, we'll look at the approaches that we might use to insulate the rest of our application from tricky concurrency bits. This generally means making concurrency the default mode—even when there's no real concurrency happening under the hood. In the end, we don't want our code to contain 90% concurrency acrobatics and 10% functionality.
Hiding the concurrency mechanism
The difficulty with exposing concurrency mechanisms all throughout our code is that they're all slightly different from one another. This magnifies the callback hell that we may already find ourselves in. For example, not all concurrent operations are network requests that fetch data from some remote resource. Asynchronous data might come from a worker or from some other callback that's asynchronous in itself. Picture a scenario where we have three disparate data sources used to compute a value that we need—all of which are asynchronous.
That value is the thing we care about in our application code. From the perspective of the feature that we're building, we don't care about anything below it. So, our front-end architecture needs to encapsulate the complexities associated with concurrency. There's another complication to consider here in addition to all our asynchronous data sources—what about when the data isn't asynchronous and originates from a local source? What about synchronizing a local data source with an HTTP request? We'll cover this in the following section.
Without concurrency
Even though we're writing a concurrent JavaScript application, not every operation is inherently concurrent. For example, if one component asks another component for data that it already has in memory, then it isn't an asynchronous operation; the value is returned immediately. Our application is likely filled with operations like these, where concurrency simply doesn't make sense. And therein lies the challenge—how do we mix asynchronous operations seamlessly with synchronous operations?
The simple answer is that we make the default assumption of concurrency everywhere. Promises make this problem tractable.
The key aspect of promises is their general ability to abstract synchronization problems away for us. This is applicable not just to network requests, but also to web worker messages, or any other asynchronous operation that relies on callbacks. It requires a bit of a mental adjustment to think of our data as promised values that will get here eventually. But, once we close this gap, concurrency is enabled by default. Concurrency is the default as far as our features are concerned, and what we do behind the curtain isn't disruptive in the slightest.
Let's turn our attention to some code now. We'll create two functions: one asynchronous and one plain old function that simply returns a value. The goal here is to make the code that uses these functions the same, despite the major differences in how the value is generated:
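Here's a minimal sketch of the two functions; the eventual() and immediate() names are placeholders of our own choosing:

```javascript
// An asynchronous value: the promise resolves later, after a
// simulated network delay.
function eventual() {
  return new Promise((resolve) => {
    setTimeout(() => resolve('hello world'), 1000);
  });
}

// A synchronous value: it's already in memory, but we wrap it in
// a promise anyway, so that callers can't tell the difference.
function immediate() {
  return Promise.resolve('hello world');
}

// The calling code is identical in both cases; concurrency is
// the default assumption.
eventual().then(value => console.log('eventual:', value));
immediate().then(value => console.log('immediate:', value));
```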
The trade-off here is the added promise complexity, wrapped around what would otherwise be a simple value returned from a function. But in reality, the complexity is encapsulated within the promise, and if we weren't writing a concurrent application, we obviously wouldn't need to concern ourselves with issues such as these. The benefit is huge. When everything is a promised value, we can safely rule out the inconsistencies that lead to nasty concurrency bugs.
Worker communication with promises
We now have a handle on why treating primitive values as promises benefits our code. It's time to apply this concept to web workers. In the preceding two chapters, our code that synchronized responses coming from web workers started to look a little intractable. This was because we were essentially hand-rolling the boilerplate chores that promises are good at handling. We'll first attempt to solve these problems by creating helper functions that wrap the worker communications for us, returning promises. Then, we'll try another approach that involves extending the web worker interface at a lower level. Lastly, we'll look at some more complex synchronization scenarios that involve multiple workers, such as those from the last chapter.
Helper functions
It would be ideal if we could get web worker responses back in the form of a promise resolution. But, we need to create the promise in the first place—how do we do this? Well, we could manually create the promise, where the message that's sent to the worker is sent from within the promise executor function. But, if we take this approach, we're not much better off than we were before introducing promises.
The trick is to encapsulate both the message posted to the worker and any message received from the worker within a single helper function.
Let's take a look at an example helper function that implements this pattern. First, we'll need a worker that carries out some task—we'll start with this:
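A sketch of what this worker might look like follows; the busy loop inside work() is an assumption, standing in for any slow computation:

```javascript
// worker.js
// Squares a number, slowly. The pointless loop wastes CPU
// cycles so that tasks take a noticeable amount of time.
function work(n) {
  let i = 0;
  while (++i < n * n) {}
  return i;
}

addEventListener('message', (e) => {
  // Echo the id back, so the main thread can reconcile this
  // response with the message that triggered it.
  postMessage({
    id: e.data.id,
    result: work(e.data.value)
  });
});
```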
Here we have a worker that will square any number we pass it. This work() function is intentionally slow so that we can see how our application, as a whole, performs when web workers take longer than usual to complete a task. It also uses an ID as we've seen with our previous web worker examples, so it can reconcile with the code that sent the message. Let's implement the helper function that uses this worker now:
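In this sketch, the resolvers map and the id counter are our own bookkeeping, not part of any built-in API:

```javascript
const worker = new Worker('worker.js');
const resolvers = new Map();
let id = 0;

worker.addEventListener('message', (e) => {
  // Find the resolver that's waiting on this id, resolve the
  // promise with the result, and clean up.
  resolvers.get(e.data.id)(e.data.result);
  resolvers.delete(e.data.id);
});

// The helper: post the message and return a promise, keeping
// the worker itself completely hidden from the caller.
function square(value) {
  return new Promise((resolve) => {
    resolvers.set(++id, resolve);
    worker.postMessage({ id, value });
  });
}

square(10).then(result => console.log('10 squared is', result));
```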
If we focus on the way that the square() function is used, passing a number argument and getting a promise as a return value, we can see that this fits in with our earlier discussion on making code concurrent by default. For example, we can completely remove workers from this scenario and simply change the way the helper function resolves the promise that it returns, and the rest of our code will continue to function unaltered.
The helper function tactic is just one approach to simplify worker communication using promises. Perhaps we can decide that we don't necessarily want to maintain a bunch of helper functions. Next, we'll look at a more granular approach than helper functions.
Extending postMessage()
Rather than amassing vast quantities of helper functions, we can take a more generic route. There's nothing wrong with helper functions; they're direct and to the point. However, if we reach a point where there are literally hundreds of them, their value starts to diminish very quickly. The more generic approach is to keep using worker.postMessage().
So let's see if we can make this method return a promise just like our helper function from the previous section. This way, we keep using the granular postMessage() method, but improve our synchronization semantics. First, here's the worker code:
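Here's a sketch of the worker, following the same id-echoing pattern as before:

```javascript
// worker.js
// The same slow squaring function as in the previous section.
function work(n) {
  let i = 0;
  while (++i < n * n) {}
  return i;
}

addEventListener('message', (e) => {
  // The id travels with the result, so the main thread can
  // match the response to the promise that's waiting for it.
  postMessage({ id: e.data.id, result: work(e.data.value) });
});
```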
This is nothing radically different from what we've seen so far in our web worker code. Now, in the main thread, we have to figure out how to alter the interface of Worker. Let's do this now. Then, we'll try posting some messages to this worker and resolving promises as a response:
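The following is one possible sketch; again, the resolvers map and the id counter are bookkeeping of our own invention:

```javascript
const worker = new Worker('worker.js');
const resolvers = new Map();
let id = 0;

// Keep a reference to the original implementation; we still
// need it to actually post messages to the worker.
const postMessage = worker.postMessage.bind(worker);

// Shadow the postMessage property on this instance with a
// promise-returning version.
worker.postMessage = (value) => new Promise((resolve) => {
  resolvers.set(++id, resolve);
  postMessage({ id, value });
});

worker.addEventListener('message', (e) => {
  resolvers.get(e.data.id)(e.data.result);
  resolvers.delete(e.data.id);
});

// Posting a message now reads like any other promise-based call.
worker.postMessage(10).then(result => console.log('result', result));
worker.postMessage(100).then(result => console.log('result', result));
```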
Well, this is exactly what we need, right? We can post message data directly to the worker, and the response data is sent back to us through the promise resolution. As an added bonus, we can actually wrap helper functions around this new postMessage() implementation if we're so inclined. The main trick involved in making this work is storing a reference to the original postMessage(). Then, we replace the worker's postMessage property with our own promise-returning version, which reuses the original function to post the message and adds the necessary reconciliation and promise goodness.
Synchronizing worker results
The code in the last two sections has adequately reduced our web worker callback hell to a more tolerable level. In fact, now that we've got a handle on how to encapsulate web worker communication by having postMessage() return a promise, we're ready to start simplifying any messy worker code that isn't using this approach. The examples that we've looked at so far have benefited greatly from promises, but they're simple; not having these abstractions wouldn't be the end of the world.
What about the scenario where we map a collection of data and then reduce the mapped collection? We may recall that the map/reduce code got a little hairy in Chapter 6, Practical Parallelism: Mapping and reducing. This is mostly due to all the worker communication boilerplate code entangled with the code that's trying to execute a map/reduce operation. Let's see if we fare any better using our promise technique. First, we'll create a very basic worker:
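A minimal sketch of such a worker:

```javascript
// worker.js
// Maps an array of numbers to an array of squares, echoing the
// id so responses can be reconciled on the main thread.
addEventListener('message', (e) => {
  postMessage({
    id: e.data.id,
    result: e.data.value.map(n => n * n)
  });
});
```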
We can use this worker to pass arrays for mapping. So we'll create two of them and split the workload between the two workers, shown as follows:
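In this sketch, the promisify() helper applies the promise-returning postMessage() technique from the previous section to each worker instance:

```javascript
// Wrap a worker's postMessage() so that it returns a promise,
// just as we did in the previous section.
function promisify(worker) {
  const resolvers = new Map();
  let id = 0;
  const post = worker.postMessage.bind(worker);

  worker.postMessage = (value) => new Promise((resolve) => {
    resolvers.set(++id, resolve);
    post({ id, value });
  });

  worker.addEventListener('message', (e) => {
    resolvers.get(e.data.id)(e.data.result);
    resolvers.delete(e.data.id);
  });

  return worker;
}

const workerA = promisify(new Worker('worker.js'));
const workerB = promisify(new Worker('worker.js'));

const array = new Array(5000).fill(null).map((v, i) => i);
const half = array.length / 2;

// Each worker maps half of the array; Promise.all() takes care
// of synchronizing the two results.
Promise.all([
  workerA.postMessage(array.slice(0, half)),
  workerB.postMessage(array.slice(half))
]).then(([first, second]) => {
  const mapped = first.concat(second);
  console.log('first', mapped[0], 'last', mapped[mapped.length - 1]);
});
```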
When this is all it takes to post data to workers and to synchronize results coming back from two or more of them, we're actually motivated to write concurrent code—it looks the same as the rest of our application code now.
Lazy workers
It's time for us to look at web workers from a different angle. The fundamental reason we're using workers in the first place is that we want to compute more in the same amount of time. Doing this, as we now know, involves messaging intricacies and divide-and-conquer strategies. We have to get data into and out of the worker, usually as an array.
Generators help us compute lazily. That is, we don't want to compute something or allocate data in memory until we really need it. Do web workers make this difficult or impossible to achieve? Or can we leverage generators to compute lazily and in parallel?
In this section, we'll explore ideas related to using generators in web workers. First, we'll look at the overhead issues associated with web workers. Then, we'll write some code that uses generators to pass data in and out of workers. Finally, we'll see if we can lazily pass data through a chain of generators, all residing in web workers.
Reducing overhead
The main thread can offload expensive operations to web workers, running them in another thread. This means the DOM is able to paint pending updates and process pending user events. However, we still face the overhead of allocating large arrays and the time taken to update the UI. Despite parallel processing with web workers, our users could still face a slowdown because there's no update to the UI until the entire data set has been processed.
These overheads are merely to facilitate the worker communication and have very little to do with the application functionality that we're trying to implement.
The overhead with arrays and serialization, required for worker communication, generally isn't a big deal. However, with larger collections, we could be faced with real performance issues, stemming from the very mechanism that we use to improve performance. So looking at worker communication from another perspective doesn't hurt, even if it's not necessary at first.
Instead of allocating and serializing lots of data upfront, individual items could be passed in and out of workers. This would give the UI a chance to update using the data that's already been processed, before all of the processed data arrives.
Generating values in workers
If we want to update the UI as our workers generate results, then the workers can't package the result set as an array and send it back to the main thread only after all the computations are done. While the worker crunches through the entire set, the UI sits there without responding to the user. We want a lazier approach, where values are generated one at a time so that the UI can be updated sooner. Let's build an example that sends input to the web worker and gets results back at a much more granular level than what we've seen so far in this book. First, we'll create a worker; the code for it is as follows:
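A sketch of the worker; each message carries a single number, and each result goes back as soon as it's ready:

```javascript
// worker.js
// The familiar slow squaring function.
function work(n) {
  let i = 0;
  while (++i < n * n) {}
  return i;
}

addEventListener('message', (e) => {
  // One number in, one result out; no arrays, no batching.
  postMessage(work(e.data));
});
```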
There's nothing earth-shattering here. It's the same work() function that we've already used to intentionally slow down our code by inefficiently squaring a number. There's no actual generator used inside the worker. This is because we really don't need one; we'll see why in a moment:
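A sketch of the main thread; the results list element is an assumption about the page's markup:

```javascript
const worker = new Worker('worker.js');
const results = document.getElementById('results');

// Append each result to the page as it arrives, rather than
// waiting for the entire set to be processed.
worker.addEventListener('message', (e) => {
  const item = document.createElement('li');
  item.textContent = e.data;
  results.appendChild(item);
});

// Each number is more expensive to process than the last, but
// the earlier results show up without waiting for the later ones.
[100, 200, 300, 400, 500].forEach(n => worker.postMessage(n));
```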
Each number that's passed to our worker is more expensive to process than the previous number. So overall, processing the entire input array before showing anything to the user would feel as if the application is hanging or broken. But, this is not the case here because although each number is expensive to process, we're posting the results back as they become available.
We perform the same amount of work as we would by passing in an array and getting back an array as output. However, this approach simply changes the order in which things happen. We've introduced cooperative multi-tasking into the picture—compute some data in one task and update the UI in another. The aggregate time taken to complete the work is the same, but to the user, it feels much faster. At the end of the day, the user-perceivable performance of our application is the only performance metric that counts.
We passed in the input as individual messages. We could have passed in the input as an array, posted the results individually, and gotten the same effect. However, this would probably amount to nothing more than unneeded complexity. There's a natural correspondence to the pattern as it is—item in, item out. Don't change it if you don't have to.
Lazy worker chains
As we saw in Chapter 4, Lazy Evaluation with Generators: Lightweight map/reduce we can assemble chains of generators. This is how we implement complex functionality lazily; an item flows through a chain of generator functions that transform the item before yielding it to the next generator until it reaches the caller. Without generators, we would have to allocate a lot of intermediary data structures just for the sake of passing data from one function to the next.
In the section prior to this one, we saw that a pattern similar to generators was possible with web workers. Since we face a similar problem here, we don't want to allocate large data structures. We can avoid doing this by passing in items at a more granular level. This has the added benefit of keeping the UI responsive because we're able to update it before the last item arrives from the worker. Given that we can do this much with workers, could we not build on this idea and assemble more complex chains of worker processing nodes?
For instance, let's say we have a collection of numbers and several transformations. We need to make these transformations in a specific order before we can display them in our UI. Ideally, we would set up a chain of workers where each worker is responsible for performing its designated transformation and then passing the output on to the next worker. Eventually, the main thread gets a value back that it can display in the DOM.
The problem with this goal is the tricky communication that it involves. Since dedicated workers only communicate with the main thread that created them, it's hardly advantageous to send the results back to the main thread, then on to the next worker in the chain, and so on. Well, it turns out that dedicated workers can communicate directly without involving the main thread. We can use something called channel messaging here. The idea is simple; it involves creating a channel, which has two ports—messages are posted on one port and received on the other.
We've been using messaging channels and ports all along. They're baked into web workers. This is where the message event and postMessage() method pattern comes from.
Each channel uses two messaging ports. The first port is used to post messages, whereas the second is used to receive message events. The only time the main thread is used is when the processing chain is first kicked off by posting a message to the first channel and when the message is received from the last channel.
Instead of letting this complex worker communication intimidate us, let's write some code; maybe it'll look a little more approachable there. First, we'll create the workers used in the chain. Actually, they're two instances of the same worker. Here's the code:
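Here's a sketch of the worker; the only message it expects from the main thread is the one carrying its two ports:

```javascript
// worker.js
// The familiar slow squaring function.
function work(n) {
  let i = 0;
  while (++i < n * n) {}
  return i;
}

addEventListener('message', (e) => {
  // The first port receives input; the second sends output.
  const [input, output] = e.ports;

  input.addEventListener('message', (msg) => {
    output.postMessage(work(msg.data));
  });

  // Ports used with addEventListener() must be started explicitly.
  input.start();
});
```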
This is interesting. In this worker, we have message ports to work with. The first port is used to receive input, and the second port is used to send output. The work() function simply squares the given number using our now familiar approach of wasting CPU cycles to see how workers behave. What we want to do in our main thread is to create two instances of this worker so that we can pass the first instance a number to square. Then, without passing the result back to the main thread, it passes the result to the next worker, and the number is squared again. Let's look at some code that connects workers using messaging channels:
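A sketch of the wiring; three channels are needed: one into the first worker, one between the two workers, and one back out to the main thread:

```javascript
const firstWorker = new Worker('worker.js');
const secondWorker = new Worker('worker.js');

const input = new MessageChannel();   // main thread to first worker
const middle = new MessageChannel();  // first worker to second worker
const output = new MessageChannel();  // second worker to main thread

// Hand each worker its input and output ports. The second
// argument transfers ownership of the ports to the worker.
firstWorker.postMessage(null, [input.port2, middle.port1]);
secondWorker.postMessage(null, [middle.port2, output.port1]);

// The main thread only sees the final value: the number squared
// by the first worker, then squared again by the second.
output.port2.addEventListener('message', (e) => {
  console.log('result', e.data);
});
output.port2.start();

// Kick off the chain.
input.port1.postMessage(10);
```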
In addition to the data that we want to send to the worker, we can also send a list of message ports that we want to transfer to the worker context. This is what we do with the first two messages sent to the worker. The message data is null because we're not doing anything with it. In fact, these are the only messages we're sending directly to the worker. The rest of the communication happens through the message channels that we've created. The expensive computation happens in the worker because that's where the message handler resides.
Using Parallel.js
The aim of the Parallel.js library is to make interacting with web workers as seamless as possible. In fact, it handles one of the key goals of this book—it hides the concurrency mechanism and allows us to focus on the application that we're building.
In this section, we'll look at the approach taken by Parallel.js for worker communication and the general approach of passing code to workers. Then, we'll walk through some code that uses Parallel.js to spawn new worker processes. Lastly, we'll explore the built-in map/reduce capabilities that the library has to offer.
How it works
All the workers that we've used so far in this book have been our own creation. We implemented message event handling in our workers that computed some value, then posted a response. With Parallel.js, we don't implement workers. Instead, we implement functions, which are then passed to workers that are managed by the library.
This takes care of a few headaches for us. All our code is implemented in the main thread, which means that it's easier to use the functions we've already written, because we don't need to import them into web workers using importScripts(). We also don't need to manually start web workers by creating them with a script path. Instead, we let Parallel.js spawn new workers for us, and then, we can tell the workers what to do by passing functions and data to them. So, how does this work, exactly?
Workers need a script argument. Without a valid script, workers simply do not work. Parallel.js has a straightforward eval script. This is what's passed to any worker that the library creates. Then, the API within the main thread assembles code that's to be evaluated within the worker and sends it over whenever we need to communicate with workers.
This is feasible because Parallel.js doesn't aim to expose a plethora of functionality backed by workers. Instead, the aim is to make the worker communication mechanism as seamless as possible while providing minimal functionality. This makes it easy to build only the concurrency functionality that's relevant to our application and not a host of other functions that we'll never use.
Spawning workers
The Parallel.js library has the notion of a job. The primary input to a job is the data that the job is going to process. The creation of a job isn't directly tied to the creation of a background worker. Workers are distinct from Parallel.js jobs; we don't interact directly with workers when using the library. Once we have our job instance, and it's supplied with our data, we use a job method to invoke workers.
The most basic method is spawn(), which takes a function as an argument and runs it in a web worker. The function that we pass to it can return results, and these are then resolved as a thenable object that's returned by spawn(). Let's look at some code that uses Parallel.js to spawn a new job backed by a web worker:
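A sketch of spawning a job; the timing code is just there so we can compare approaches later:

```javascript
const array = new Array(2500).fill(null).map((v, i) => i);

// A job holds the data; no worker exists yet.
const job = new Parallel(array);
const start = performance.now();

// The function passed to spawn() is run inside a web worker,
// and whatever it returns resolves the thenable that spawn()
// hands back to us.
job.spawn(data => data.map(n => n * n)).then((results) => {
  console.log('last result', results[results.length - 1]);
  console.log('took', (performance.now() - start).toFixed(2), 'ms');
});
```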
Well now, that's pretty cool; we don't have to worry about any of the monotonous web worker life-cycle tasks. We have some data and a function that we want to apply to it, and we want to run it in parallel with other work taking place on the page. The cherry on top is the familiar thenable that's returned from the spawn() method. It fits right into our concurrent application, where everything else is treated as a promise.
We log how long it takes for our function to process the input data we give it. We only spawn a single web worker for this task, so the result is reached in the same amount of time as it would have been, were it computed in the main thread. Aside from freeing up the main thread to handle DOM events and repainting, there's no objective performance gain. We'll see if we can use a different method to up the concurrency level.
The worker created by spawn() is immediately terminated when we're done with it. This frees up memory for us. However, there's no concurrency level governing the use of spawn(); we can call it 100 times in a row if we like.
Mapping and reducing
In the last section, we spawned a worker thread using the spawn() method. Parallel.js also has a map() method and a reduce() method. The idea is to make things easier for us. By passing map() a function, the library will automatically apply it to each item in the job data. Similar semantics apply with the reduce() method. Let's take a look at how this works by writing some code:
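A sketch of the naive approach, squaring the numbers with map() and summing them with reduce(); note that the reduce callback receives a pair of values at a time:

```javascript
const array = new Array(2500).fill(null).map((v, i) => i);
const job = new Parallel(array);
const start = performance.now();

job
  // Each item in the array is posted to a worker individually.
  .map(n => n * n)
  // reduce() works pairwise: the callback combines two values
  // at a time until a single result remains.
  .reduce(pair => pair[0] + pair[1])
  .then((result) => {
    console.log('sum of squares', result);
    console.log('took', (performance.now() - start).toFixed(2), 'ms');
  });
```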
Ouch! This is quite the performance hit—what's going on here? What we're seeing here is a phenomenon called parallel slowdown. This slowdown takes place when there's too much parallel communication overhead. It's happening in this particular example because of the way Parallel.js processes arrays in map(). Each array item goes through a worker. This doesn't mean that there are 2500 workers created—one for each element in the array. The number of created workers maxes out at four or the navigator.hardwareConcurrency value—similar semantics to those we looked at earlier in this book.
The real overhead comes from messages sent to and received from the workers—5000 messages! This is obviously not optimal, as evidenced by the timer in the code. Let's see if we can make a drastic improvement on these numbers while keeping roughly the same code structure:
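A sketch of the chunked approach; the chunk size of 500 is an arbitrary assumption, and the reducer has to handle both raw chunks and already-reduced sums:

```javascript
const array = new Array(2500).fill(null).map((v, i) => i);

// Slice the input into chunks; the chunks, not the individual
// numbers, are what get posted to the workers.
const chunks = [];
for (let i = 0; i < array.length; i += 500) {
  chunks.push(array.slice(i, i + 500));
}

const job = new Parallel(chunks);
const start = performance.now();

job
  // Each worker message now maps a whole chunk to its squares.
  .map(chunk => chunk.map(n => n * n))
  // The reducer may see raw arrays of squares or numbers that
  // were already reduced, so it normalizes both before adding.
  .reduce(pair => {
    const sum = x => Array.isArray(x)
      ? x.reduce((a, b) => a + b, 0)
      : x;
    return sum(pair[0]) + sum(pair[1]);
  })
  .then((result) => {
    console.log('sum of squares', result);
    console.log('took', (performance.now() - start).toFixed(2), 'ms');
  });
```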
Here, we can see that the same results are generated, and much faster. The difference is that we start things off by slicing the array into smaller chunks. These chunks are the items that get passed to the workers, instead of individual numbers. The mapping job has to change slightly as well: instead of squaring a number, it maps a smaller array to an array of squares. The reduce logic is slightly more complex, but overall, our approach is still the same. Most importantly, we've removed the heavy message-passing bottleneck that was causing unacceptable performance in the first implementation.
Just like the spawn() method cleans up the worker when it returns, so do the map() and reduce() Parallel.js methods. The downside to freeing workers is that they need to be recreated whenever these methods are called. We'll address this challenge in the next section.
Worker pools
The final section of this chapter covers the concept of worker pools. In the preceding section on Parallel.js, we ran up against an issue where workers were frequently created and terminated. This is a lot of overhead. If we know the level of concurrency we're capable of operating at, then why not allocate a statically-sized pool of workers that can take on work?
The first design task for creating a worker pool is to allocate the workers. The next step is to schedule the jobs as they come in by distributing them to available workers in the pool. Lastly, we'll need to account for busy states when all the workers are busy. Let's do this.
Allocating pools
Before we think about allocating pools of worker threads, we need to look at the overarching worker pool abstraction. How do we want it to look and behave? Ideally, we want the pool abstraction to look and behave like a plain dedicated worker. We can post a message to the pool and get a promise in response. So while we can't directly extend the Worker prototype, we can create a new abstraction that closely resembles the Worker API.
Let's look at some code now. Here's the initial abstraction that we'll use:
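A sketch of the constructor; defaulting the pool size to navigator.hardwareConcurrency is our assumption, mirroring the semantics used earlier in the book:

```javascript
class WorkerPool {
  constructor(script, size = navigator.hardwareConcurrency) {
    // Messages that arrive while every worker is busy wait here.
    this.queue = [];

    // Worker instances are the map keys; the values are the
    // pending resolver functions, or null when a worker is idle.
    this.workers = new Map(
      new Array(size).fill(null).map(() => [new Worker(script), null])
    );

    for (const worker of this.workers.keys()) {
      worker.addEventListener('message', (e) => {
        // Resolve the promise that's waiting on this worker,
        // then mark the worker as idle again.
        this.workers.get(worker)(e.data);
        this.workers.set(worker, null);

        // A freed-up worker takes the next queued message,
        // if there is one.
        if (this.queue.length) {
          const [data, resolve] = this.queue.shift();
          this.workers.set(worker, resolve);
          worker.postMessage(data);
        }
      });
    }
  }
}
```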
When a new WorkerPool is created, the given script is used to spawn all the workers within the pool. The workers property is a Map instance, and the worker instances themselves are the keys. The reason we store the workers as map keys is so that we can easily look up the appropriate resolver function to call.
When a given worker responds, the message event handler that we've added to each worker is called, and this is where we find the resolver function that's waiting to be called. There's no chance of us calling the wrong resolver because a given worker doesn't take on new work until it's finished with its current task.
Scheduling jobs
Now we'll implement the postMessage() method. This is what the caller will use to post a message to one of the workers in the pool. The caller doesn't know which worker fulfills their request, nor do they care. They get a promise as a return value, and it's resolved with the worker response as the value:
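A sketch of the two scheduling methods, which belong in the WorkerPool class from the previous listing:

```javascript
class WorkerPool {
  // ...constructor from the previous listing...

  // Finds the first idle worker: one whose resolver slot in the
  // map is null. Returns undefined when every worker is busy.
  getWorker() {
    for (const [worker, resolver] of this.workers) {
      if (resolver === null) {
        return worker;
      }
    }
  }

  postMessage(data) {
    return new Promise((resolve) => {
      const worker = this.getWorker();

      if (worker) {
        // An idle worker is available: record the resolver
        // against it and hand over the message.
        this.workers.set(worker, resolve);
        worker.postMessage(data);
      } else {
        // Every worker is busy: queue the message. The message
        // event handler drains this queue as workers free up.
        this.queue.push([data, resolve]);
      }
    });
  }
}
```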
It's the promise executor function that takes care of actually finding the first available worker and posting our message there. When an available worker is found, we also set the worker's resolver function in our workers map. If there are no available workers in the pool, the posted message goes into the queue. This queue is emptied in the message event handler. This is because when a worker comes back with a message, it means the worker is free to take on more work, and it checks if there's anything queued before returning to an idle state.
The getWorker() method is a simple helper that finds the next available worker for us. We know a worker is available to take on a task if its resolver function is set to null in the workers map. Lastly, let's see this worker pool in action:
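A sketch of one possible usage; the input and button elements, and the worker.js script with its slow work() function, are assumptions carried over from earlier examples:

```javascript
const pool = new WorkerPool('worker.js', 4);

const input = document.getElementById('number');
const button = document.getElementById('post');

button.addEventListener('click', () => {
  // Purely for troubleshooting: warn when every worker is busy.
  // The message isn't lost; it's queued by the pool.
  if (!pool.getWorker()) {
    console.warn('pool is busy, queuing message');
  }

  pool.postMessage(+input.value).then((result) => {
    console.log('result', result);
  });
});
```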
In this usage scenario, we have a couple of form controls that send parameterized work to the worker. The larger the number, the longer the work will take; it uses our standard work() function that slowly squares numbers. If we use a large number and frequently click the button that posts the message to the pool, then eventually we'll exhaust the pool. If that's the case, we will display a warning. However, this is just for troubleshooting purposes—the posted messages aren't lost when the pool is busy, they're simply queued.