A look at the Parallel class, .Net 4.0's new functionality to simplify multi-threading.

.Net 4.0 introduces new libraries that greatly simplify multi-threaded programming. This is a welcome addition to .Net for two reasons. Firstly, writing multi-threaded code tends to be complex. There are all sorts of difficulties when working with threads, such as race conditions and deadlocks, that simply don't exist for single-threaded applications. Add to this an environment where multiple programmers are working on different parts of the code base, or where a maintenance programmer with a weak understanding of the code base needs to make adjustments years later, and you have a recipe for difficult-to-debug code, with errors possibly making their way into a production environment. Secondly, computers are becoming vastly more parallel. Consumer machines are now multi-core: 4 cores are common, and over the next couple of years we will see 6 and possibly 8 core machines in the hands of everyday users. Writing your code in a multi-threaded manner has never been more important.

The new additions make it far easier to take existing code and make portions of it multi-threaded, or to build new code that is optimized for parallel processors. Let’s have a look at the Parallel class first.

Parallel.For

Firstly let’s set up some skeleton code:

Here we have a test object we will be manipulating, a method to generate random strings (used in our test object) and a method to print lists of our test objects.
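A minimal sketch of what such skeleton code could look like is given here. The names TestObject, Value, RandomString, CreateObjects and PrintList are assumptions made for illustration, not the original listing.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical test object: the class name and its single property are assumptions.
public class TestObject
{
    public string Value { get; set; }
}

public static class Skeleton
{
    // Random is not thread-safe, so it is only used from the main thread here.
    private static readonly Random _random = new Random();

    // Builds a random string of upper-case letters of the requested length.
    public static string RandomString(int length)
    {
        var builder = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            builder.Append((char)('A' + _random.Next(26)));
        }
        return builder.ToString();
    }

    // Creates a list of test objects, each holding a random string.
    public static List<TestObject> CreateObjects(int count, int length)
    {
        var objects = new List<TestObject>(count);
        for (int i = 0; i < count; i++)
        {
            objects.Add(new TestObject { Value = RandomString(length) });
        }
        return objects;
    }

    // Prints every test object in the list, one per line.
    public static void PrintList(IEnumerable<TestObject> objects)
    {
        foreach (TestObject obj in objects)
        {
            Console.WriteLine(obj.Value);
        }
    }

    public static void Main()
    {
        PrintList(CreateObjects(5, 8));
    }
}
```

The strings are generated sequentially on purpose, since a shared Random instance should not be called from several threads at once.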

Parallel.For is the multi-threaded version of a standard for loop. Rather than being a keyword it is a method, meaning our code will unfortunately look a little clunkier. It also has the restriction that the loop counter can only increment by 1 (i++), unlike the standard for loop where you can specify any kind of step (i+=2, i=i+7, i=myFunc(i), etc). To invoke it you specify an inclusive initial value, an exclusive final value (rather than a test condition), and an Action that executes your desired code. Let’s look at an example:

In this trivial example we reverse the string in our test object and append a number to it. I've used three different ways of invoking the same code to provide some insight as to what the code will look like. We always have a single integer parameter (here named i) which increments from the initial value (here 0) up to, but not including, the final value (here _objects.Count()).
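The sketch below shows roughly how such a loop can be written. It assumes a hypothetical TestObject class with a Value property and a fixed three-element list in place of random strings, so that the result is predictable. The three styles of passing the Action are assumed to be a named method, an anonymous delegate and a lambda.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical test object, assumed for illustration.
public class TestObject
{
    public string Value { get; set; }
}

public static class ParallelForDemo
{
    private static List<TestObject> _objects;

    // Restores the fixed sample data so each style starts from the same input.
    private static void ResetObjects()
    {
        _objects = new List<TestObject>
        {
            new TestObject { Value = "abc" },
            new TestObject { Value = "hello" },
            new TestObject { Value = "xyz" }
        };
    }

    // Reverses the string of the object at index i and appends the index.
    // Each iteration touches only its own element, so no locking is needed.
    private static void ReverseAndNumber(int i)
    {
        char[] chars = _objects[i].Value.ToCharArray();
        Array.Reverse(chars);
        _objects[i].Value = new string(chars) + i;
    }

    // Returns the current strings in list order.
    public static string[] Values()
    {
        return _objects.Select(o => o.Value).ToArray();
    }

    public static void Main()
    {
        // 1. Pass a named method.
        ResetObjects();
        Parallel.For(0, _objects.Count, ReverseAndNumber);
        Console.WriteLine(string.Join(" ", Values()));

        // 2. Pass an anonymous delegate.
        ResetObjects();
        Parallel.For(0, _objects.Count, delegate(int i) { ReverseAndNumber(i); });
        Console.WriteLine(string.Join(" ", Values()));

        // 3. Pass a lambda expression.
        ResetObjects();
        Parallel.For(0, _objects.Count, i => ReverseAndNumber(i));
        Console.WriteLine(string.Join(" ", Values()));
    }
}
```

Each of the three calls prints cba0 olleh1 zyx2. The iterations may run in any order, but the result is the same because every iteration writes only to its own list element and the list is printed after the loop has finished.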

As you can see the code is very simple. We can control the degree of parallelism by using an overload that takes a ParallelOptions object. ParallelOptions exposes three properties: a CancellationToken which allows external code to cancel the loop early, an int MaxDegreeOfParallelism that caps how many operations may run concurrently, and a TaskScheduler that overrides the default TaskScheduler, allowing you to control the scheduling. Without this overload .Net takes care of these details based on the number of cores in the computer along with current usage etc. There are also overloads that use a long rather than an int for the loop counter.
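As a rough illustration of the ParallelOptions overload, the sketch below caps the loop at two concurrent operations and wires in a cancellation token. The Squares helper and its parameters are hypothetical names chosen for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelOptionsDemo
{
    // Hypothetical helper: squares the numbers 0..count-1, running at most
    // maxParallelism iterations at the same time.
    public static long[] Squares(int count, int maxParallelism, CancellationToken token)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallelism,
            CancellationToken = token
        };

        var results = new long[count];
        // Every iteration writes to its own slot, so the array needs no lock.
        Parallel.For(0, count, options, i => { results[i] = (long)i * i; });
        return results;
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            long[] squares = Squares(10, 2, cts.Token);
            Console.WriteLine(string.Join(", ", squares));

            // Once the token is cancelled, the loop throws instead of running.
            cts.Cancel();
            try
            {
                Squares(10, 2, cts.Token);
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine("Loop was cancelled.");
            }
        }
    }
}
```

The first call prints 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 and the second prints Loop was cancelled. The TaskScheduler property is left at its default here.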

While this is definitely a step up from the standard threading libraries, care must still be taken that you use Parallel.For in such a way that side effects from different threads don't interfere with each other. Order can no longer be guaranteed, so if, say, you are adding a value to a list inside each execution then the order in which items are added will be somewhat random. Still, for many tasks this is a great addition. Tests by others seem to be showing significant speed improvements when used on multi-core machines for non-trivial examples. As always, an optimization needs to be measured both before and after to confirm that you actually gained speed.
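To make the list scenario concrete, here is a small sketch, with a hypothetical CollectSquares helper, that adds to a shared List from inside the loop. List<T> is not thread-safe, so this version guards each Add with a lock. The lock keeps the list intact but does nothing about ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SharedStateDemo
{
    // Hypothetical helper: collects the squares of 0..count-1 into one shared list.
    public static List<int> CollectSquares(int count)
    {
        var results = new List<int>();
        var gate = new object();

        Parallel.For(0, count, i =>
        {
            int square = i * i;
            // Without this lock, concurrent Add calls could corrupt the list.
            lock (gate)
            {
                results.Add(square);
            }
        });

        return results;
    }

    public static void Main()
    {
        List<int> squares = CollectSquares(10);
        // The arrival order can differ from run to run.
        Console.WriteLine("Arrival order: " + string.Join(", ", squares));

        // Sorting afterwards is one way to restore a predictable order.
        squares.Sort();
        Console.WriteLine("Sorted:        " + string.Join(", ", squares));
    }
}
```

When the position of each result matters, writing to a pre-sized array at index i avoids both the lock and the sort.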

Parallel.ForEach

Parallel.ForEach works in much the same way as Parallel.For. Let’s look at some code that does the same thing as our for example:

In our example we pass a list to the Parallel.ForEach method, along with an Action to be executed. The parameter to the action in this case is the current element from the IEnumerable. We have overloads taking ParallelOptions just like Parallel.For, and we have the same caveats to using this method with regard to threads interfering with each other.
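A minimal sketch of such a call follows, again assuming a hypothetical TestObject class and fixed sample data. Because the single-parameter Action receives only the element and no index, this version just reverses each string and leaves off the number.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical test object, assumed for illustration.
public class TestObject
{
    public string Value { get; set; }
}

public static class ParallelForEachDemo
{
    // Reverses the string held by every object in the sequence.
    public static void ReverseAll(IEnumerable<TestObject> objects)
    {
        // The lambda parameter is the current element, not an index.
        Parallel.ForEach(objects, obj =>
        {
            char[] chars = obj.Value.ToCharArray();
            Array.Reverse(chars);
            obj.Value = new string(chars);
        });
    }

    public static void Main()
    {
        var objects = new List<TestObject>
        {
            new TestObject { Value = "abc" },
            new TestObject { Value = "hello" },
            new TestObject { Value = "xyz" }
        };

        ReverseAll(objects);
        Console.WriteLine(string.Join(" ", objects.Select(o => o.Value)));
    }
}
```

This prints cba olleh zyx. Each element is modified only by the iteration that receives it, so no synchronization is required.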

Parallel.Invoke

Finally, let us look at Parallel.Invoke. Parallel.Invoke is used for firing off a bunch of discrete pieces of code that have no requirement on order of execution or completion. Parallel.Invoke takes an array of Actions and starts them off in parallel. Just like the previous methods it also has an overload that takes a ParallelOptions object. Let’s see a sample:

Again, for demonstrative purposes, three different ways of passing in an Action parameter are used. If you are running this code yourself you can easily change the parameter passed to the Thread.Sleep calls to change the order in which the words are printed out, thus demonstrating the non-sequential nature of this code.
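A sample along these lines might look like the sketch below. The words, the delays and the Say helper are assumptions made for illustration. The three Actions are passed as a named method, an anonymous delegate and a lambda.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelInvokeDemo
{
    // Records each word as its action finishes. ConcurrentQueue is safe to
    // use from several threads at once.
    private static ConcurrentQueue<string> _finished;

    // Hypothetical helper: waits, prints the word and records it.
    private static void Say(string word, int delayMs)
    {
        Thread.Sleep(delayMs);
        Console.WriteLine(word);
        _finished.Enqueue(word);
    }

    private static void SayHello()
    {
        Say("Hello", 300);
    }

    // Runs the three actions and returns the words in completion order.
    public static string[] Run()
    {
        _finished = new ConcurrentQueue<string>();

        // Parallel.Invoke returns only after every action has completed.
        Parallel.Invoke(
            SayHello,                           // named method
            delegate { Say("parallel", 200); }, // anonymous delegate
            () => Say("world", 100));           // lambda expression

        return _finished.ToArray();
    }

    public static void Main()
    {
        string[] order = Run();
        Console.WriteLine("Finished " + order.Length + " actions.");
    }
}
```

The order of the three words depends on the delays and on how many threads the scheduler makes available, so it is not guaranteed. The final line always reports three finished actions.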

All in all, the Parallel class makes it easy to kick off multi-threaded code, though care must still be taken to avoid the traditional multi-threading pitfalls.