dateo. Coding Blog

Be optimistic about concurrency in Entity Framework

Dennis Frühauff on February 7th, 2024

Concurrency is one of the trickiest problems we developers have to deal with. A conflict might be rare for any single operation, but in distributed, massively scaled applications it will happen often enough. And if it is not handled well, it causes once-in-a-day issues that are particularly hard to track down. Let's take a look at what we can do about concurrency issues in Entity Framework.

Concurrency (and its issues) is all around us in programming, down to the level of every CPU operation that executes while I am typing this text. While there are plenty of great resources out there, I want to give you a quick overview of concurrency when it comes to database access, specifically with Entity Framework in .NET. Even though problems can be rare, there are two different approaches to tackling them: optimistic and pessimistic concurrency. I will explain both briefly, but focus on optimistic concurrency.

Let's dive right in. You can find all the demos in this article here on GitHub

Introduction and Last-One-Wins

The common scenario when it comes to data access concurrency is that two different threads (or tasks, processes, applications) retrieve the same data or entity, manipulate it, and write it back to the data storage. The two reads do not have to happen at the same point in time. What matters is that, between one thread reading the data and writing it back, the underlying database row has already been changed by a different thread.

This is not always a bad thing; it heavily depends on the actual data that is manipulated.
By default, Entity Framework will not notify you that there is an issue. It will silently apply last-one-wins: whichever task writes its version of the data to the database last determines the values that are persisted. If your specific application is fine with that, I strongly advise against adding any concurrency handling to it.

To demonstrate what will happen in these cases, you can find demos here that you can execute and play with yourself.

Problems with last-one-wins scenarios typically arise when you are updating multiple properties in one go in your code.
Let's look at an example:
Starting out with a product {Title: "Cucumber", Price: 0.99}, we launch two tasks that make the updates {Title: "Cucumber", Price: 1.99} and {Title: "Cucumber-1", Price: 0.99}. That is, each task explicitly sets both properties on the entity. If you run that test, you will notice that the final state of the product is {Title: "Cucumber-1", Price: 1.99}, which is neither of the two states we intended. We now have corrupt data in our database.

This is usually the type of issue that "happens only once a week", the kind where "a customer complained that their updates were not saved". It is very hard to track down and to observe in the debugger.

So, how did that happen? EF's default behavior is to send only updates to the database that are actually necessary. Or in other words: If a property did not change, it will not be included in the actual UPDATE statement in the SQL that is sent to the database. This is an optimization that is performed by EF's ChangeTracker instance. And usually, this is a very good idea, because it limits the amount of traffic that needs to be sent across.
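To make the mechanism concrete, here is a minimal in-memory sketch. It does not use EF at all; it only imitates the idea that each writer sends just the columns that differ from what it originally read. The names `PartialUpdateDemo`, `ChangedColumns`, and `ApplyUpdate` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartialUpdateDemo
{
    // Returns only the columns whose values differ from the snapshot taken at read time.
    // This is a strongly simplified stand-in for what a change tracker works out.
    public static Dictionary<string, object> ChangedColumns(
        IReadOnlyDictionary<string, object> snapshot,
        IReadOnlyDictionary<string, object> edited)
    {
        return edited
            .Where(kv => !object.Equals(snapshot[kv.Key], kv.Value))
            .ToDictionary(kv => kv.Key, kv => kv.Value);
    }

    // Applies a partial update to the "row", like UPDATE ... SET <changed columns only>.
    public static void ApplyUpdate(
        Dictionary<string, object> row,
        Dictionary<string, object> changes)
    {
        foreach (var kv in changes)
        {
            row[kv.Key] = kv.Value;
        }
    }

    public static Dictionary<string, object> Run()
    {
        var row = new Dictionary<string, object> { ["Title"] = "Cucumber", ["Price"] = 0.99m };

        // Both tasks read the same original state before either one writes.
        var snapshotA = new Dictionary<string, object>(row);
        var snapshotB = new Dictionary<string, object>(row);

        var editedA = new Dictionary<string, object> { ["Title"] = "Cucumber", ["Price"] = 1.99m };
        var editedB = new Dictionary<string, object> { ["Title"] = "Cucumber-1", ["Price"] = 0.99m };

        ApplyUpdate(row, ChangedColumns(snapshotA, editedA)); // sends only Price
        ApplyUpdate(row, ChangedColumns(snapshotB, editedB)); // sends only Title

        return row; // a mix of both writers: Title from B, Price from A
    }
}
```

Neither writer ever touches the column it considers unchanged, so the two partial updates merge into a state that nobody asked for.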

But as we see now, it can lead to inconsistent data; so please be aware that this optimization exists. An easy fix for this could be to explicitly tell the ChangeTracker instance to update all properties, as demonstrated in this test. This can be a viable solution if you do not particularly care about how much data is transferred and if your entities do not have too many properties.
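One possible way to do this, sketched here under the assumption that you are using EF Core, is to set the state of the tracked entry to `Modified`, which marks all of its non-key properties as changed. The extension method name `MarkAllPropertiesModified` is hypothetical, and the linked test may achieve the same thing differently.

```csharp
using Microsoft.EntityFrameworkCore;

public static class ChangeTrackingExtensions
{
    // Hypothetical helper: forces the next SaveChanges() to write every
    // non-key property of the entity, not only the ones that were changed.
    public static void MarkAllPropertiesModified<TEntity>(this DbContext context, TEntity entity)
        where TEntity : class
    {
        context.Entry(entity).State = EntityState.Modified;
    }
}
```

Calling `context.MarkAllPropertiesModified(product)` before `SaveChanges()` turns the result into a true last-one-wins: the final row is exactly what one of the writers intended, although the other writer's changes are still lost.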

But let us now assume that you do not want to mess around manually with tracking entity changes and at the same time handle concurrency issues gracefully in your application - what can you do? Two general approaches are distinguished:

Pessimistic Concurrency

Pessimistic concurrency assumes that the frequency of issues due to concurrent data manipulation is high (which is pessimistic, right?). In this case, there is only one brute-force method: Lock the specific row in question until the changes have been committed, so that other threads cannot get access to it. While very simple, this will also be the approach with the worst performance in general. Also, in this article, I'd like to focus on EF's capabilities when it comes to the alternative approach (mind the title).

Optimistic Concurrency

Optimistic concurrency, on the other hand, assumes that issues due to concurrent data manipulation happen very rarely and only need to be handled if they appear, hence the term optimistic.
Let's focus on what we can do about that in EF.

Introducing concurrency tokens

As we've seen and as you can verify from the tests so far, by default, Entity Framework will silently swallow all information about possible concurrency issues. So, the first thing we want to achieve is to actually get notified when concurrency issues appear while saving changes.

In order to do this, we have to tell EF which of our model's properties can serve as concurrency tokens, that is, as a means of detecting that a row was changed in the database while we were trying to update the same entity in another thread.

A concurrency token is nothing but an additional condition in the WHERE clause of the final SQL statement that is sent to the database. For example, if the simple update statement looked like this:

```sql
UPDATE Products
   SET Price = 1.99
   WHERE Id = 1;
```

after telling EF to use a concurrency token, it might look something like this:

```sql
UPDATE Products
   SET Price = 1.99
   WHERE Id = 1
   AND RowVersion = 42;
```

In the latter case, if a different process had updated RowVersion to 43 without us knowing about it, executing this exact statement would affect zero rows, because no row would match this set of constraints anymore. Entity Framework notices that nothing was updated and throws a DbUpdateConcurrencyException for us to handle gracefully in our code.

Database-generated concurrency tokens

There are two different ways to specify how properties should be used as concurrency tokens.
The first method is using the database provider's built-in methods to auto-generate concurrency tokens:

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder
        .Entity<ConcurrentProduct>()
        .ToTable(nameof(ConcurrentProduct));
    // ...
    modelBuilder
        .Entity<ConcurrentProduct>()
        .Property(o => o.VersionNumber)
        .IsRowVersion();
    // ...
}
```

The IsRowVersion() method expects you to have a column in the database that matches the specific provider's implementation of concurrency tokens. In the case of SQL Server, this is the type rowversion (which is also where the name of this method originates). For Postgres, however, this property should be a uint. Note also that the non-fluent way of modeling this is the [Timestamp] attribute in your model classes.

Calling this method also implicitly applies ValueGeneratedOnAddOrUpdate(), which tells EF that the database generates a new value for this property whenever a row is inserted or updated through SaveChanges().

Some database providers do not support auto-generated concurrency tokens (SQLite, for example). If you are dealing with those, you can still fall back to application-managed concurrency tokens, which I am demonstrating in these tests.
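As a rough sketch of the attribute-based variant, assuming SQL Server (where a rowversion column is usually mapped to a `byte[]` property), the model class could look like this. The `Id`, `Title`, and `Price` properties are assumptions based on the product example above, not taken from the actual repository.

```csharp
using System;
using System.ComponentModel.DataAnnotations;

public class ConcurrentProduct
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public decimal Price { get; set; }

    // Assumption: SQL Server provider, where rowversion maps to byte[].
    // [Timestamp] is the attribute counterpart of IsRowVersion().
    [Timestamp]
    public byte[] VersionNumber { get; set; } = Array.Empty<byte>();
}
```

With this in place, the application never writes to `VersionNumber` itself; the database assigns a new value on every insert and update.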

The entity configuration looks like this:

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder
        .Entity<ConcurrentProduct>()
        .ToTable(nameof(ConcurrentProduct));
    // ...
    modelBuilder
        .Entity<ConcurrentProduct>()
        .Property(o => o.VersionNumber)
        .IsConcurrencyToken();
    // ...
}
```

or, without the fluent API, the [ConcurrencyCheck] attribute on the property.

With this, we are still telling EF to append the above-mentioned additional where clauses to our statements, but we have to provide them ourselves. The simplest solution to this could be an integer property that is just being updated whenever we make changes to our entity. This is demonstrated in the tests.
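A minimal sketch of such an integer token could look like the following. The `ChangePrice` method is a hypothetical helper that bumps the version together with the actual change; the tests in the repository may organize this differently.

```csharp
using System.ComponentModel.DataAnnotations;

public class ConcurrentProduct
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public decimal Price { get; set; }

    // Application-managed token: EF only compares the originally read value
    // in the WHERE clause, so our own code has to change it on every update.
    [ConcurrencyCheck]
    public int VersionNumber { get; set; }

    // Hypothetical helper that changes the data and the token in one step.
    public void ChangePrice(decimal newPrice)
    {
        Price = newPrice;
        VersionNumber++;
    }
}
```

Because the setters are public in this sketch, nothing stops other code from changing `Price` without bumping the version. In a real model you would probably want to funnel all changes through methods like this one.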

Since concurrency tokens just add additional conditions to the WHERE clause, it is also possible to declare multiple properties as concurrency tokens. This can be useful if you don't want to change your existing database table, but would rather use a combination of existing properties to detect that a row has changed.

Handling `DbUpdateConcurrencyException`

If you take a look at Microsoft's own documentation about handling concurrency issues in EF, you will see that they resort to digging deep into the change tracker's entries and adjusting the property values to your needs.
However, in the above tests, I am demonstrating a simpler and oftentimes sufficient way of doing it: retrying. Now, I am not saying that "this is how you should do it". Please consider it a simple demonstration, stripped down to its basics. The main idea is to completely retry the process of fetching, updating, and saving the entities. We thereby ensure that we take a fresh instance of the DbContext and get the latest state from the database again. If you intend to use this, I'd certainly recommend using an existing retry framework like Polly.
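Stripped down to its basics, such a retry could be sketched as follows, assuming EF Core. The class `ConcurrencyRetry`, its parameters, and the default of three attempts are all made up for this illustration and are not the code from the repository.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

public static class ConcurrencyRetry
{
    // Runs fetch + modify + save against a fresh context on every attempt.
    // createContext and work are supplied by the caller (hypothetical names).
    public static void Execute<TContext>(
        Func<TContext> createContext,
        Action<TContext> work,
        int maxAttempts = 3)
        where TContext : DbContext
    {
        for (var attempt = 1; ; attempt++)
        {
            // A new context per attempt, so no stale tracked entities survive.
            using var context = createContext();
            try
            {
                work(context);          // load the entity again and apply the change
                context.SaveChanges();  // throws if the concurrency token no longer matches
                return;
            }
            catch (DbUpdateConcurrencyException) when (attempt < maxAttempts)
            {
                // Somebody else changed the row in the meantime: discard this
                // context and start over with the latest state from the database.
            }
        }
    }
}
```

On the last attempt the exception filter no longer matches, so the DbUpdateConcurrencyException propagates to the caller instead of being swallowed. A library like Polly gives you the same loop plus backoff and logging without writing it by hand.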

Conclusion

I gave you a quick overview of concurrency issues that can occur whenever multiple processes try to update the same row on a single database and also hinted at possibilities to mitigate the effects.

Obviously, the examples used in my repository for you to try are very simple and will in no way reflect what might happen in a bigger application. But if you want to start learning about concurrency in Entity Framework you have to start somewhere, right?

Still, the general recommendation should be that if last-one-wins is fine for you, you should not worry about handling those issues differently. But if your application suffers from corrupt data or missing updates, I hope I could get you started.

This talk is inspired by a hands-on presentation in the .NET Community Standup. Please check those out, both for additional introduction as well as some more advanced scenarios.
