
The Rationale

It’s generally believed that by using the Repository pattern, you can (in summary) “decouple” your data access from your domain and “expose the data in a consistent way”.

If you look at any of the implementations of a Repository working with a UnitOfWork (EF) – you’ll see there’s not all that much “decoupling”:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Data;
using ContosoUniversity.Models;

namespace ContosoUniversity.DAL
{
    public class StudentRepository : IStudentRepository, IDisposable
    {
        private SchoolContext context;

        public StudentRepository(SchoolContext context)
        {
            this.context = context;
        }

        public IEnumerable<Student> GetStudents()
        {
            return context.Students.ToList();
        }

        public Student GetStudentByID(int id)
        {
            return context.Students.Find(id);
        }

        //<snip>

        public void Save()
        {
            context.SaveChanges();
        }

    }
}

This class can’t exist without the SchoolContext – so what exactly did we decouple here? Nothing.

In this code, from MSDN, what we have is a reimplementation of LINQ, with the classic problem of the “ever-spiraling Repository API”. By “spiraling API” I mean fun things like “GetStudentByEmail, GetStudentByBirthday, GetStudentByOrderNumber” etc.
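To make the spiraling concrete, here is a minimal sketch. It assumes a hypothetical Student with Email and Birthday properties (the MSDN sample shows neither) and uses an in-memory list as a stand-in for context.Students. Every GetStudentByX method turns out to be a one-line predicate the caller could already write in LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: Email and Birthday are assumptions for illustration.
public class Student
{
    public int ID { get; set; }
    public string Email { get; set; }
    public DateTime Birthday { get; set; }
}

public static class SpiralDemo
{
    // In-memory stand-in for context.Students; a DbSet<Student> is also an IQueryable<Student>.
    public static IQueryable<Student> Students()
    {
        return new List<Student>
        {
            new Student { ID = 1, Email = "ann@example.com", Birthday = new DateTime(1990, 5, 1) },
            new Student { ID = 2, Email = "bob@example.com", Birthday = new DateTime(1992, 8, 17) }
        }.AsQueryable();
    }

    // The repository way: one new method for every question somebody asks.
    public static Student GetStudentByEmail(IQueryable<Student> students, string email)
    {
        return students.FirstOrDefault(s => s.Email == email);
    }

    public static Student GetStudentByBirthday(IQueryable<Student> students, DateTime birthday)
    {
        return students.FirstOrDefault(s => s.Birthday == birthday);
    }

    // The LINQ way: the caller composes the question on the spot, no new method needed.
    public static List<string> AdHocQuery()
    {
        return Students()
            .Where(s => s.Birthday.Year > 1991)
            .Select(s => s.Email)
            .ToList();
    }
}
```

Nothing in the two GetStudentBy methods adds behavior beyond the lambda they wrap, which is the sense in which the repository is reimplementing LINQ.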

But that’s not the primary problem here. The primary problem is the Save() routine. It saves a Student… I think. What else does it save? Can you tell? I sure can’t… more on this below.

UnitOfWork is Transactional

A Unit of Work, as its name implies, is there to do a thing. That thing could be as simple as retrieving records to display, or as complex as processing a new Order. When you’re using EntityFramework and you instantiate your DbContext – you’re creating a new UnitOfWork.

With EntityFramework you can “flush and reset” the UnitOfWork by using SaveChanges() – this kicks off the change tracker comparison bits – adding new records, updating and deleting as you’ve specified. Again, all in a transaction.

A Repository Is Not a Unit of Work

Each method in a Repository is supposed to be an atomic operation – once again either pulling stuff out, or putting it back in. You could have a SalesRepository that pulls catalog information, and that transacts an order.

The downside to using a Repository is that it tends to spiral, and pretty soon you have one repository having to reference the other because you didn’t think the SalesRepository needed to reference the ReportsRepository (or something like that).

This can quickly become a mess – and it’s why people started using UnitOfWork. UnitOfWork is an “atomic operation on the fly” so-to-speak.

The Only Thing You Could Do Worse: Repository<T>

This pattern is maddening. It’s not a Repository – it’s an abstraction of an abstraction. Here’s one that’s quite popular for some reason:

public class CustomerRepository : Repository<Customer> {

  public CustomerRepository(DbContext context){
    //a property on the base class
    this.DB = context;
  }

  //base class has Add/Save/Remove/Get/Fetch

}

On the face of it: what’s wrong with this? It’s encapsulating things and the Repository base class can use the context so… what’s the problem?

The problems are Legion. Let’s take a look…

Do You Know Where That DbContext Has Been?

No, you don’t. It’s getting injected and you have no idea which method opened it, nor for what reason. The idea behind Repository is code “reuse” so you’ll probably be calling it from a Registration routine, maybe a new order transaction, or from an API call – who knows? Certainly not your Repository – and this is the main selling point of this pattern!

The name says it all: UnitOfWork. When you inject it like this you don’t know where it came from.

“I Needed The New Customer ID”

Consider the code above in our CustomerRepository – it will add a customer to the database. But what about the new CustomerID? You’ll need that back for creating a log file, so what do you do? Here are your choices:

  • Run SaveChanges() right in your Controller so the changes get pushed and you can access the new CustomerID
  • Open up your CustomerRepository and override the base class Add() method – so it runs SaveChanges() before returning. This is the solution that the MSDN site came up with, and it’s a bomb waiting to go off.
  • Decide that all Add/Remove/Save commands in your repo should call SaveChanges()

Do you see the problem here? The problem is in the implementation itself. Consider why you need the new CustomerID – it’s likely to do something else such as pop it onto a new Order object or a new ActivityLog.

What if we wanted to use the StudentRepository above to create a new student when they bought books from our book store? If you pass in your data context and save that new student… uh oh. Your entire transaction was just flushed.
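Here is a minimal sketch of that failure. FakeContext is a hypothetical stand-in for a DbContext that only tracks pending inserts, and InsertStudent is assumed to exist in the snipped part of the repository. The point is only that Save() commits everything the shared context is tracking, not just the student:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DbContext: it tracks pending inserts and
// commits all of them together on SaveChanges(), as a unit of work does.
public class FakeContext
{
    private readonly List<object> _pending = new List<object>();
    public List<object> Committed { get; } = new List<object>();

    public void Add(object entity) { _pending.Add(entity); }

    public int SaveChanges()
    {
        int count = _pending.Count;
        Committed.AddRange(_pending);
        _pending.Clear();
        return count;
    }
}

public class Student { public string Name { get; set; } }
public class Order { public string Item { get; set; } }

// Same shape as the StudentRepository above: Save() flushes the injected context.
public class StudentRepository
{
    private readonly FakeContext _context;
    public StudentRepository(FakeContext context) { _context = context; }

    public void InsertStudent(Student student) { _context.Add(student); }
    public void Save() { _context.SaveChanges(); }
}

public static class BookStoreDemo
{
    // Returns whatever had been committed by the time the repository's Save() returned.
    public static List<object> BuyBooks()
    {
        var context = new FakeContext();
        context.Add(new Order { Item = "Textbook" }); // the order is still being built

        var students = new StudentRepository(context);
        students.InsertStudent(new Student { Name = "Ann" });
        students.Save(); // meant to save the student only

        return context.Committed; // the half-built Order went along for the ride
    }
}
```

Calling BookStoreDemo.BuyBooks() returns both the Order and the Student, even though the calling code never asked for the order to be saved.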

Your choice now is to a) not use the StudentRepository (using OrderRepository or something else) or b) remove SaveChanges() and have lots of fun bugs creep into your code.

If you decide to not use the StudentRepo – you now have duplicate code…

But Rob! EF does this for you transactionally – you don’t need to call SaveChanges() just to return the new ID – EF does it in the scope of the transaction already!

That. Is. Correct. And it’s also my point – which I’ll come back to.

Repository Methods Are Supposed To Be Atomic

That’s the theory anyway. What we have in Repository<T> is not a repository at all – it’s a CRUD abstraction that doesn’t do anything business-related. Repositories are supposed to be focused on specific operations – this one isn’t.

If you’re not using Repository<T> then you know it’s almost impossible to avoid having “Repository Overlap Insanity” – losing all transactionality (and sanity) as your app grows.

OK Smart Guy – What’s the Answer?

There are two ways to stop this over-abstraction silliness. The first is Command/Query separation which at first might look a bit odd – but you don’t need to go Full CQRS – just enjoy the simplicity of doing what’s needed and no more…

Command/Query Objects

Jimmy Bogard wrote a great post on this and I’ve tweaked his example a bit to use properties, but basically you can use a Query or Command object:

public class TransactOrderCommand {
  public Customer NewCustomer {get;set;}
  public Customer ExistingCustomer {get;set;}
  public List<Product> Cart {get;set;}
  //all the parameters we need, as properties...
  //...

  //our UnitOfWork
  StoreContext _context;
  public TransactOrderCommand(StoreContext context){
    //allow it to be injected - though that's only for testing
    _context = context;
  }

  public Order Execute(){
    //allow for mocking and passing in... otherwise new it up
    _context = _context ?? new StoreContext();

    //add products to a new order, assign the customer, etc
    var newOrder = new Order();
    //then...
    _context.SaveChanges();

    return newOrder;
  }
}

You can do the same thing with a QueryObject – read Jimmy’s post for more on this but the idea is that a query as well as a command has a specific reason for existence – you can change as needed and mock as needed.
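As a rough sketch of what such a query object might look like (this is my illustration, not the example from Jimmy’s post), assume a hypothetical Customer with an Email property and a stand-in StoreContext that exposes an IQueryable<Customer>. A real context would expose a DbSet<Customer> instead:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for illustration.
public class Customer
{
    public int ID { get; set; }
    public string Email { get; set; }
}

// Hypothetical stand-in for StoreContext so the sketch runs without a database.
public class StoreContext
{
    public IQueryable<Customer> Customers { get; set; } = new List<Customer>().AsQueryable();
}

public class CustomersByEmailDomainQuery
{
    //the parameters we need, as properties
    public string Domain { get; set; }

    //our UnitOfWork
    StoreContext _context;
    public CustomersByEmailDomainQuery(StoreContext context)
    {
        //allow it to be injected, mirroring the command above
        _context = context;
    }

    public List<Customer> Execute()
    {
        //otherwise new it up
        _context = _context ?? new StoreContext();

        //one query, one reason to exist: customers at a given email domain, ordered by ID
        string suffix = "@" + Domain;
        return _context.Customers
            .Where(c => c.Email.EndsWith(suffix))
            .OrderBy(c => c.ID)
            .ToList();
    }
}
```

Usage would be something like new CustomersByEmailDomainQuery(context) { Domain = "example.com" }.Execute(). If the question changes, you change this one class rather than growing a repository interface.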

Embrace Your DataContext

This is an idea that Ayende came up with and I absolutely love it: wrap what you need in a filter, or use a Base Controller (assuming you’re using a web app):

using System;
using System.Web.Mvc;

namespace Web.Controllers
{
  public class DataController : Controller
  {
    protected StoreContext _context;

    protected override void OnActionExecuting(ActionExecutingContext filterContext)
    {
      //make sure your DB context is globally accessible
      MyApp.StoreDB = new StoreContext();
    }

    protected override void OnActionExecuted(ActionExecutedContext filterContext)
    {
      MyApp.StoreDB.SaveChanges();
    }
  }
}

This will allow you to work with the same DataContext in the scope of a single request – you just need to be sure to inherit from DataController. This means that each request to your app is considered a UnitOfWork… which is quite appropriate really. In some cases it may not be – but you can fix that with QueryObjects above.

Neat Ideas – But I Don’t See What We’ve Gained

We’ve gained a number of things:

  • Explicit Transactions. We know exactly where our DbContext has come from, and what Unit of Work we’re executing in each case. This is helpful both now and into the future.

  • Less Abstraction == Clarity. We’ve lost our Repositories which didn’t have a reason to exist other than to abstract an existing abstraction. Our Command/QueryObject approach is cleaner and the intent of each one is clearer.

  • Less Chance of Bugs. The Repository overlap (and worse yet: Repository<T>) increases the chance that we could have partially-executed transactions and screwed up data.

So there it is. Repositories and UnitOfWork don’t mix and hopefully you’ve found this helpful!

Rob Conery

I am the Co-founder of Tekpub.com, Creator of This Developer's Life, an Author, Speaker, and Audio/Video enthusiast. I care for many Open Source projects such as Massive and Subsonic.


  1. Matt says:

    This is an interesting idea, and I’ve certainly felt the pain of bloated repositories with methods such as “GetProcedureForType” etc. I’ve recently been struggling with how to use EF6 in a recent project and started to go down the path of GenericRepository… and then I came across this article.

    In your command object example, it almost seems to mimic a service layer that performs a few related actions on multiple different objects. If one were to utilize the pattern you’ve described here, would you suggest creating separate class files for all of your commands and queries?

    1. Rob Conery says:

      Sorry for the late reply! A command can be anything really. I’m using this pattern in a Node app I’m building now and I have a registration “process” class – or “service” – which has the sole responsibility of validating things and popping a user into the DB.

      I also have an “add_new_user” command which does the data work – such as adding login credentials to the user, logging things, preparing a mailer, creating an email validation code, etc. This is “data” work (to me) so I stick it in a single command.

      In other words – there are a set of transactions that typically go off for a given process. Those are your commands :).

      1. Matt says:

        Thanks for the reply, Rob. Lately I’ve been doing my validations in a “service” class that is called when the user attempts to store something (i.e. user posts data, it’s validated with the rules in the service, then if all is well it proceeds, otherwise return errors).

        Given your example about adding a new user, I wonder if another developer might just stick an “add_new_user” or a “signup” method into his UserRepository. I suppose in this case, we could just ditch the repository altogether and in its place have a series of UserCommands that do specific things. What are your thoughts?

      2. Rob Conery says:

        Yep. I’ve lately been using the term “process”, it’s clearer. But that’s where I stick validations and let the command do its thing only when ready.

  2. Matt says:

    Thanks. I’ll be giving this pattern a shot in the near future.

  3. Adrian says:

    This looks very promising, Rob.

    I just don’t see where we should place business logic and validation.

    Example: We have a requirement to CRUD cities (Entity: City) into the database. For this, an ASP.NET controller called CityController is in place for CRUD of cities. Business logic must do its work and data must be validated before save. So we use a CityCommand object.

    Then we have an AddressController for CRUD Addresses (with an AddressCommand in place, right?). But an Address contains a City. Based on your Command-Pattern, how can we reuse the business logic we just implemented in the CityCommand when working in the AddressController (AddressCommand)?

    Is the Command object the right place at all to have business logic and validation logic?

  4. Simon says:

    Yup. Every time I’ve seen code trying to use the Repository pattern with an ORM it just doesn’t feel right.

    In my current project using Entity Framework, I’ve ended up just using C# extension methods on the DbContext, returning IQueryable which then allows further composition/projection of the query. This works surprisingly well for Read operations. It doesn’t seem so bad to be going context.GetFooById(id). It’s testable too.

    Initially I felt bad about polluting the world with more damn extension methods. But I got over that and it seems to be working well a year down the track.

    Write operations are a different matter however and inevitably end up just being whacked into the Controller method.

  5. Soren says:

    Hi,

    in the second solution you suggest (DataController) – how will you handle errors that arise when the context is being saved? I think it is bad practice to let a framework do the SaveChanges and not issue the command itself.

    If you control when the method is invoked you are better prepared to handle any errors that might arise.

    Fx.

    In an OrderController there are two methods.

    CreateOrder

    CreateReturnOrder

    If you decide to use the OnActionExecuted – how can you decide which one of the two were invoked and return a correct error message to the user?

    If you want further reading on the Command object solution I will recommend this excellent article. https://www.cuttingedge.it/blogs/steven/pivot/entry.php?id=91

    1. Rob Conery says:

      Hi Soren – I don’t think I follow your question. Are you asking how you would handle errors on save? I can think of a number of ways – the first would be to try/catch right there in the DataController itself, but what would you do? In my mind if a transaction fails it fails – execution should probably stop… but if you don’t want it to then I suppose handle that case?

      I don’t follow what you mean by:

      I think it is bad practice to let a framework do the SaveChanges and not issue the command itself

      Does “Fx.” mean “for example”?

      Why do I care which method was fired on the Controller?

      1. Soren says:

        Hi Rob,

        if this line ‘MyApp.StoreDB.SubmitChanges();’ returns an exception – how can you know which of your methods generated the exception?

  6. Rob Conery says:

    I think you’re asking me what I’d do if SubmitChanges() threw an Exception, yes? If I cared about it and needed to catch it, I would catch it and probably do something with the ViewState right there. You can access all the information you need in the ActionFilterContext if that’s important to you.

    Sorry if I’m being dense, I don’t see what problem you’re seeing.

