File Content and Directory Search using Directory.EnumerateFiles and PLINQ

Enumerable Collection of File Names

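Starting with .NET Framework 4, you can enumerate directories and files by using methods that return an enumerable collection of their names as strings. In previous versions of the .NET Framework, you could only obtain these names as arrays. Enumerable collections can perform better than arrays because you can start processing the names before the whole collection has been returned, whereas an array must be fully populated before the method returns.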
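To make the difference concrete, here is a minimal sketch that contrasts Directory.GetFiles (array) with Directory.EnumerateFiles (lazy sequence). The class name EnumerateFilesSketch, the helper FirstFiles, the "*.txt" pattern and the default root directory are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class EnumerateFilesSketch
{
    // Hypothetical helper: returns at most `count` matching file paths.
    public static List<string> FirstFiles(string root, string pattern, int count)
    {
        // EnumerateFiles is lazy, so Take(count) stops walking the
        // directory tree as soon as enough names have been produced.
        return Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
                        .Take(count)
                        .ToList();
    }

    public static void Main(string[] args)
    {
        // Assumption: search the current directory unless a path is given.
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        // GetFiles builds the complete array before it returns.
        string[] all = Directory.GetFiles(root, "*.txt", SearchOption.AllDirectories);
        Console.WriteLine("GetFiles returned {0} names", all.Length);

        // EnumerateFiles lets us stop after the first five names.
        foreach (string path in FirstFiles(root, "*.txt", 5))
        {
            Console.WriteLine(path);
        }
    }
}
```

Both calls can throw UnauthorizedAccessException if the tree contains folders the process cannot read, so a real tool would likely need error handling around them.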

Parallel LINQ (PLINQ)

In .NET Framework 4, you can use Parallel LINQ (PLINQ) to run queries that perform a computationally expensive operation on every file in a specified directory tree.
The following code snippet shows how to parallelize operations on file directories. The PLINQ query uses the Directory.EnumerateFiles method to find all files with a given extension in a given directory and then searches their contents for a specific word. To measure the performance, you can create a folder called CLOBS containing 8 large text files (1 GB each).
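The original listing is not available here, so what follows is only a sketch of how such a PLINQ search could look, not the author's exact code. The class name PlinqFileSearch, the Search helper, the C:\CLOBS location, the "*.txt" extension and the search word "error" are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class PlinqFileSearch
{
    // Hypothetical helper: returns (file, line number, line text)
    // for every line that contains `word`.
    public static List<Tuple<string, int, string>> Search(
        string root, string pattern, string word)
    {
        return Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
            .AsParallel() // files are distributed across worker threads
            .SelectMany(file => File.ReadLines(file) // reads each file lazily
                .Select((line, index) => Tuple.Create(file, index + 1, line))
                .Where(t => t.Item3.Contains(word)))
            .ToList();
    }

    public static void Main(string[] args)
    {
        // Assumptions: folder location and search word are placeholders.
        string root = args.Length > 0 ? args[0] : @"C:\CLOBS";
        string word = args.Length > 1 ? args[1] : "error";

        Stopwatch watch = Stopwatch.StartNew();
        List<Tuple<string, int, string>> matches = Search(root, "*.txt", word);
        watch.Stop();

        Console.WriteLine("{0} matches in {1:F3} seconds",
            matches.Count, watch.Elapsed.TotalSeconds);
    }
}
```

In this sketch the parallelism is per file: each worker reads one file line by line. PLINQ does not preserve the source order of results unless AsOrdered is added to the query.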


After running the project, CPU usage rises, as shown in the following figure:

Finding all matches in the 8 large text files (1 GB each) takes 402.596 seconds, as shown in the output window:

File Content and Directory Search using Directory.EnumerateFiles and LINQ

LINQ Query

Language-Integrated Query (LINQ) is the name for a set of technologies that integrate query capabilities directly into the C# language. With LINQ, a query is a first-class language construct, just like classes, methods, events and so on. The following example shows how to use the Directory.EnumerateFiles method and a LINQ query to find all files with a given extension in a given directory and search their contents for a specific word. To measure the performance, you can create a folder called CLOBS containing 8 large text files (1 GB each).
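The original listing is not available here, so the following is only a sketch of how a sequential LINQ search could be written, not the author's exact code. The class name LinqFileSearch, the Search helper, the C:\CLOBS location, the "*.txt" extension and the search word "error" are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class LinqFileSearch
{
    // Hypothetical helper: returns (file path, matching line) pairs.
    public static List<KeyValuePair<string, string>> Search(
        string root, string pattern, string word)
    {
        // Query syntax: the outer `from` walks the files lazily,
        // the inner `from` streams each file's lines one at a time.
        IEnumerable<KeyValuePair<string, string>> query =
            from file in Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
            from line in File.ReadLines(file)
            where line.Contains(word)
            select new KeyValuePair<string, string>(file, line);

        // ToList forces execution; until then nothing has been read.
        return query.ToList();
    }

    public static void Main(string[] args)
    {
        // Assumptions: folder location and search word are placeholders.
        string root = args.Length > 0 ? args[0] : @"C:\CLOBS";
        string word = args.Length > 1 ? args[1] : "error";

        Stopwatch watch = Stopwatch.StartNew();
        List<KeyValuePair<string, string>> matches = Search(root, "*.txt", word);
        watch.Stop();

        Console.WriteLine("{0} matches in {1:F3} seconds",
            matches.Count, watch.Elapsed.TotalSeconds);
    }
}
```

Because the query runs on a single thread, results come back in file order and then line order.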


After running the project, CPU usage rises, as shown in the following figure:

Finding all matches in the 8 large text files (1 GB each) takes 144.06 seconds, as shown in the output window:

Copyright © All Rights Reserved - C# Learners