Different data structures in .NET. Practical aspects.

Oleg Kikhtov
5 min readSep 30, 2021

--

I don’t want to retell here theory part of what type of data structures exist, what is big O notation and so on. The assumption is that you already know that or you can find in in google if you have some gaps.

The purpose of this article is to concentrate on practical aspects and show how and when to use data structures correctly and prove that with concrete examples, metrics, profiling reports and deep analysis.

Lets get the ball rolling! (Links to github with code samples are attached in the end of this article)

Plan

Compare such collections to know which one to use in specific use cases:

  1. Array
  2. List
  3. LinkedList
  4. Dictionary

* We skip other data structures such as Queue, Stack, SortedList as those have specific use cases to use them.

In the end of this investigation we would like to find answers to such questions

  • Why do we have Array and List in C# as they are very similar? Are there any use cases to use Array but not List?
  • We usually use Dictionary for obvious key-value use cases. Are there any other options where we can benefit from hash table?
  • Why do we need LinkedList and when to use it?

Some notes

  1. All metrics were collected from Release build
  2. All metrics are isolated (everything else is commented to make sure memory allocation and execution time is relevant only for specific case and is not impacted with something else)
  3. Total execution time could be small. But it could impact our system when we have nested loops, a lot of loops in a call stack, more entities or strict SLA.

An example of how we collect our metrics for List

First type of exercise is to add 2 000 000 entities to List. We will calculate total time and memory usage.

*everything else is commented to make sure we get clean results.

Result:

Start!
Completed. Added 2000000 elements to list with no capacity. Total time = 00:00:00.3428810
****************************
Done!

Now lets go to Performance Profiling

Lets start with Memory Usage

Result:

Used Memory: 163.9 MB

Now lets uncomment FindUser method to calculate time for retrieving data.

*we are going to search for last entity (worst use case) with LINQ.

Result:

And the same I did for Array, List with default capacity, LinkedList and Dictionary. Here is the summery:

Add 2 000 000 objects to collection

Result analysis

If we compare “AVG total time” and “Memory Usage” we can see that both “List with default capacity” and “Array” have good results.

LinkedList takes x1.65 times longer to do that operation and requires much more memory. Dictionary requires x2 more in comparison to List with default capacity and more memory as well.

So List is a winner here. As it has more functionality than Array and better time and memory usage than LinkedList and Dictionary.

But don’t forget to set up default capacity when create List as that will save you 12% of memory in comparison to List without default capacity.

Retrieve an element from collection

Result analysis

Lets take a look on “AVG total time”. This is not a surprise that hash table is a best data structure for this, no doubts. Its also interesting to observe that getting entity by LINQ is 41 times slower than by index (as getting by index is O(1) and by LINQ O(n)). Getting value by LINQ from LinkedList is slower than from List.

So if that is possible its better to use Dictionary. For example, when we are going to get entities by id or guid, we can add those entities to dictionary and use id as key. In other cases its better to use List.

Now lets answer question which we asked in the beginning

  • Why do we have Array and List in C# as they are very similar? Are there any use cases to use Array but not List? List is a good replacement for array, and we can use it instead of array in all places, but we need to use default capacity when create an instance of List (if we know it beforehand, of course). That will save 12% of memory.
  • We usually use Dictionary for obvious key-value use cases. Are there any other options where we can benefit from hash table? We often use List with LINQ to get entities only by id. That could be replaced with Dictionary in some cases.
  • Why do we need LinkedList and when to use it? Use List in most cases. Use LinkedList — if we need to insert some element in the middle.

Links

My github repo

Use the following table to reference how various mutable collection types compare in algorithmic complexity

--

--