TPL Performance Improvements in .NET 4.5

Task.WaitAll and Task.WaitAny

Task’s waiting logic has been changed in .NET 4.5. The performance gain from this change is most apparent when waiting on multiple Tasks, such as when using Task.WaitAll and Task.WaitAny.

Let’s explore the extent of this performance boost with this benchmark code for Task.WaitAll:

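// Requires: using System; using System.Diagnostics; using System.Threading.Tasks;
public static Tuple<long, long> TestWaitAll(int ntasks)
{
    Task[] tasks = new Task[ntasks];
    Action action = () => { };
    // The tasks are created but never started, so WaitAll can only time out.
    for (int i = 0; i < ntasks; i++) tasks[i] = new Task(action);
    Stopwatch sw = new Stopwatch();
    long startBytes = GC.GetTotalMemory(true);
    sw.Start();
    Task.WaitAll(tasks, 1); // 1 ms timeout
    sw.Stop();
    long endBytes = GC.GetTotalMemory(true);
    GC.KeepAlive(tasks);
    // Item1: elapsed milliseconds; Item2: change in managed heap size, in bytes.
    return Tuple.Create(sw.ElapsedMilliseconds, endBytes - startBytes);
}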

The code above times the overhead of setting up a WaitAll for ntasks uncompleted Tasks, plus a one-millisecond timeout. This test is admittedly imprecise, as the actual time before the WaitAll call times out could be anywhere from 1 millisecond to the scheduler quantum of the underlying operating system. Nevertheless, the test results still shed some light on the performance differences between .NET 4 and .NET 4.5 for this scenario:

Task Creation Performance in .NET 4.5

In this post, I will compare the Task creation performance in .NET 4 and .NET 4.5.

I will measure both time and memory consumption associated with Task creation:

public static Tuple<long, long> CreateTasks(int ntasks)
{
    Task[] tasks = new Task[ntasks];
    Stopwatch sw = new Stopwatch();
    Action action = () => { };
    long startBytes = GC.GetTotalMemory(true);
    sw.Start();
    for (int i = 0; i < ntasks; i++) tasks[i] = new Task(action);
    sw.Stop();
    long endBytes = GC.GetTotalMemory(true);
    GC.KeepAlive(tasks);
    return Tuple.Create(sw.ElapsedMilliseconds, endBytes - startBytes);
}

The results on my test machine are as follows:

The benchmark results do indeed show the smaller footprint of a Task in .NET 4.5, in addition to the decreased amount of time that it takes to create Tasks.

File Content and Directory Search using Directory.GetFiles and PLINQ

Array of File Names

Starting with .NET 4, you can use PLINQ queries to parallelize operations on file directories. The following code snippet shows how to write a query that uses the GetFiles method to populate an array with the names of the files in a directory and all of its subdirectories. GetFiles does not return until the entire array is populated, so it can introduce latency at the beginning of the operation. Once the array is populated, however, PLINQ can search all files with a given extension in that directory tree for a specific word in parallel. To measure the performance, you can create a folder called CLOBS and create 8 large text files (1GB each).
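
The approach described above can be sketched roughly as follows. This is a minimal illustration rather than the original listing: the class name GetFilesSearch, the method CountMatches, and its parameters are hypothetical, and the search is a plain case-sensitive substring match on each line.

```csharp
using System;
using System.IO;
using System.Linq;

public static class GetFilesSearch
{
    // Hypothetical helper: counts the lines that contain `word`
    // in every file under `root` whose name matches `pattern`.
    public static int CountMatches(string root, string pattern, string word)
    {
        // GetFiles blocks until the complete array of file names is built.
        string[] files = Directory.GetFiles(root, pattern, SearchOption.AllDirectories);

        return files
            .AsParallel()                              // hand the array to PLINQ
            .SelectMany(file => File.ReadLines(file))  // stream each file line by line
            .Count(line => line.Contains(word));       // ordinal, case-sensitive match
    }
}
```

A call such as GetFilesSearch.CountMatches(@"C:\CLOBS", "*.txt", "word") would search the test folder; the path and the search word here are assumptions. Wrapping the call in a Stopwatch gives the elapsed time.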

After running the project, CPU usage rises, as shown in the following figure:

Finding all matches in 8 large text files (1GB each) takes 407.03 seconds, as shown in the output window:

File Content and Directory Search using Directory.EnumerateFiles and PLINQ

Enumerable Collection of File Names

Starting with .NET 4, you can enumerate directories and files by using methods that return an enumerable collection of their names as strings. In previous versions of the .NET Framework, you could only obtain these names as arrays. Enumerable collections can perform better than arrays for large directory trees, because you can start processing the first names before the whole collection has been produced.

Parallel LINQ (PLINQ)

In .NET 4, you can use Parallel LINQ (PLINQ) for queries that contain computationally expensive operations on every element over all the files in a specified directory tree.
The following code snippet shows how to parallelize operations on file directories. The PLINQ query uses the Directory.EnumerateFiles method to search all files with a given extension in a given directory for a specific word. To measure the performance, you can create a folder called CLOBS and create 8 large text files (1GB each).
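
A query of this kind might look like the sketch below. It is not the original listing: the class name EnumerateFilesPlinqSearch, the method CountMatches, and its parameters are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

public static class EnumerateFilesPlinqSearch
{
    // Hypothetical helper: counts the lines that contain `word`
    // in every file under `root` whose name matches `pattern`.
    public static int CountMatches(string root, string pattern, string word)
    {
        // EnumerateFiles yields names lazily, so PLINQ can start reading
        // files before the directory walk has finished.
        return Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
            .AsParallel()
            .SelectMany(file => File.ReadLines(file))
            .Count(line => line.Contains(word));
    }
}
```

The only difference from an array-based version is the source of the file names; the rest of the query is unchanged.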

After running the project, CPU usage rises, as shown in the following figure:

Finding all matches in 8 large text files (1GB each) takes 402.596 seconds, as shown in the output window:

File Content and Directory Search using Directory.EnumerateFiles and LINQ

LINQ Query

Language-Integrated Query (LINQ) is the name for a set of technologies based on the integration of query capabilities directly into the C# language. With LINQ, a query is a first-class language construct, just like classes, methods, events and so on. The following example shows how to use the Directory.EnumerateFiles method and a LINQ query to search all files with a given extension in a given directory for a specific word. To measure the performance, you can create a folder called CLOBS and create 8 large text files (1GB each).
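
A sequential version of such a search could be written along these lines. This is an illustrative sketch, not the original listing; the class name EnumerateFilesLinqSearch, the method CountMatches, and its parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;

public static class EnumerateFilesLinqSearch
{
    // Hypothetical helper: counts the lines that contain `word`
    // in every file under `root` whose name matches `pattern`.
    public static int CountMatches(string root, string pattern, string word)
    {
        // Query syntax; everything runs lazily on the calling thread.
        var matches =
            from file in Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
            from line in File.ReadLines(file)
            where line.Contains(word)
            select line;

        return matches.Count();
    }
}
```

Because there is no AsParallel call, the files are read one after another on a single thread.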

After running the project, CPU usage rises, as shown in the following figure:

Finding all matches in 8 large text files (1GB each) takes 144.06 seconds, as shown in the output window:

Performance of PLINQ Queries

Parallel LINQ (PLINQ)
The main goal of Parallel LINQ (PLINQ) is to execute LINQ to Objects queries in parallel, realizing the benefits of multithreading. Using PLINQ is simple if you have to perform the same task on each element in a sequence and those tasks are independent. If you need the result of one calculation step in order to find the next, PLINQ is not for you, but many CPU-intensive tasks can in fact be done in parallel. To opt a query into PLINQ, you just need to call AsParallel on the source sequence and let PLINQ handle the threading.
The following samples demonstrate the performance of PLINQ queries for different scenarios:
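
As one possible scenario, the sketch below counts prime numbers with and without AsParallel. It is an assumed example, not one of the original samples; the class PlinqPrimes and its methods are hypothetical names.

```csharp
using System;
using System.Linq;

public static class PlinqPrimes
{
    // Deliberately naive trial division, so each element costs CPU time.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // LINQ to Objects: runs on the calling thread only.
    public static int CountSequential(int max)
    {
        return Enumerable.Range(1, max).Count(IsPrime);
    }

    // PLINQ: the same query, partitioned across worker threads.
    public static int CountParallel(int max)
    {
        return Enumerable.Range(1, max).AsParallel().Count(IsPrime);
    }
}
```

Timing both methods with a Stopwatch for a large max (for example, several million) should show the parallel version finishing sooner on a multi-core machine, although the exact speedup depends on the hardware and on how expensive each element is.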

The result is shown below:

Without the AsParallel call, the query would use only a single thread. Please note that unless you specify that you want the results in the same order as the original sequence, PLINQ will assume you don’t mind getting results as soon as they’re available, even if results from earlier elements haven’t been returned yet. To preserve the source order, use AsParallel().AsOrdered().
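
The ordering behavior can be illustrated with a short sketch. The class OrderedPlinq and its method are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;

public static class OrderedPlinq
{
    // Squares 0..count-1 in parallel, but returns the results
    // in the same order as the source sequence.
    public static int[] SquaresInSourceOrder(int count)
    {
        return Enumerable.Range(0, count)
            .AsParallel()
            .AsOrdered()          // without this call, result order is not guaranteed
            .Select(i => i * i)
            .ToArray();
    }
}
```

Ordering has a cost, because PLINQ must buffer and reassemble results, so it is worth requesting only when the consumer actually depends on the order.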

When to Use PLINQ

It’s tempting to search your existing applications for LINQ queries and experiment with parallelizing them. This is usually unproductive, because most problems for which LINQ is obviously the best solution tend to execute very quickly and so don’t benefit from parallelization. A better approach is to find a CPU-intensive bottleneck and then consider, “Can this be expressed as a LINQ query?”
PLINQ is well suited to embarrassingly parallel problems. It also works well for structured blocking tasks, such as calling several web services at once. PLINQ can be a poor choice for imaging, because collating millions of pixels into an output sequence creates a bottleneck. Instead, it’s better to write pixels directly to an array or unmanaged memory block and use the Parallel class or task parallelism to manage the multi-threading.
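
For the imaging case, writing directly into an array with the Parallel class might look like the sketch below. The per-pixel computation is a made-up gradient, and the class ParallelPixels and its method are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelPixels
{
    // Fills a width x height grayscale buffer, one row per loop iteration.
    public static byte[] RenderGradient(int width, int height)
    {
        byte[] pixels = new byte[width * height];

        // Each row writes to its own slice of the array, so no locking
        // or collation of an output sequence is needed.
        Parallel.For(0, height, y =>
        {
            for (int x = 0; x < width; x++)
            {
                pixels[y * width + x] = (byte)((x + y) % 256);
            }
        });

        return pixels;
    }
}
```

Parallelizing over rows rather than individual pixels keeps the work per iteration large enough to outweigh the scheduling overhead.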

Concurrency Visualizer SDK

What is new in Concurrency Visualizer SDK?

The Concurrency Visualizer displays rich data related to CPU thread behavior, DirectX activity, and disk I/O, among other things. This information can be incredibly valuable when investigating application behavior, but sometimes it is difficult to quickly understand how the data displayed in the Concurrency Visualizer maps to application behavior. The new Concurrency Visualizer SDK, which you can use in Visual Studio 11 Developer Preview, allows you to instrument your code in order to augment the visualizations displayed in the Threads View of the Concurrency Visualizer. These visualizations, referred to as “Markers”, make the Threads View data more semantically meaningful because they represent specific phases and events in your application.
For those of you who have used Scenario Markers with the Visual Studio 2010 Concurrency Visualizer, you’ll find that the Concurrency Visualizer SDK is conceptually similar, but provides much more control and flexibility.
The SDK exposes three visual primitives: Span, Flag, and Message.
A span represents an interval of time in your application, such as an application phase. A flag represents a single point in time (e.g. the point where some value reached a threshold or when an exception was thrown). A message also represents a single point in time, but is meant as a visual analog to classic event-style tracing. So what might have previously been dumped to a log file can now be wrapped in a message call. This will yield visualizations in the Threads View and you’ll have the ability (via the UI) to export the data into a CSV file.

Copyright © All Rights Reserved - C# Learners