Factory and Repository in the Domain

I’m a big fan of the book Domain Driven Design, and for some time I’ve been pushing its principles and patterns in my workplace. I’ve never managed to get ‘into’ the more theoretical parts of the book, and I chuckle each time one of my colleagues refers to the “contours of the domain”. Even so, it’s up there with the books that have caused a substantial shift in my thinking.

When I’m evangelising the use of Entities, Aggregates, Value Objects, and Repositories, one question that comes up regularly is “what’s the difference between a Repository and a plain old DAO?”. It’s a very good question with the right level of skepticism.

I really enjoyed reading Christian Bauer’s post “Repository Pattern vs. Transparent Persistence” – particularly the title which implies that there is a choice to be made between two approaches. In actual fact transparent persistence as implemented in Hibernate makes DDD and the use of Repositories extremely powerful. I’ll backtrack a little…

A lot of the web application code I come across these days, written to the ‘best practice’ of just a few years ago, follows a rough outline like this. DAOs (or Data Mappers) provide find/save/update/delete methods for each object in your domain. The DAO is a basic abstraction around database persistence, often JDBC.

Business logic is usually either written directly within Transaction Scripts (recently I’ve become used to inheriting hundreds of lines of code in Struts actions… *sigh*) or within a basic Service Layer. The script or service is the only thing that can hold references to the DAOs, so of course you HAVE to put the business logic there – in most web apps there is a lot of creation and persistence of objects. No problem, we’ll just divide and conquer – from our scripts we’ll extract services, then we’ll divide our services into more services and delegate between them.

Because the application does a lot of operations requiring persistence, there’s a natural force preventing any business logic being moved anywhere but scripts or service layers. So what are our domain objects? Nothing but structures carrying data, with maybe some rudimentary logic that only affects local state. The client code asks the domain to provide it with data, and then the client code makes decisions based on that data.

The transaction script and service layer approach is simple to begin with, but as the application becomes more complex it leads to a bunch of problems, especially as developers try to avoid duplication (cut and paste is bad, kiddies). Ever picked up a hierarchy of Struts actions seven layers deep with multiple Template Methods to allow overrides for different sub-classes? Bloody nightmare.

What I’ve achieved by applying the DDD patterns is the elimination of the business logic from those transaction scripts, and of (most) services. What remains of the transaction script (e.g. a Struts action in a web application, if you’re so inclined) is simply responsible for finding an appropriate entry-point into the domain, then telling that domain class to do some work.

The consequence is that business logic can be expressed more deeply in your domain, you are free to find the right abstractions that will allow you to make your code understandable, and to create small classes with single responsibilities.  This also means that the creation and persistence of objects will be deep in the bowels of the domain, and even better – without the calling client code or script HAVING to be aware of the object creation.  How can this be?  Domain classes aren’t allowed to hold references to DAOs – that would break our traditional view of layering.

But… when I apply DDD, Repository and Factory interfaces are part of the domain.  This is a fundamental change: a domain object deep in an aggregate can be constructed with a reference to a Repository in order to look up persistent data.  A domain object can also use a Factory to construct objects, hiding the detail of construction (and dependency injection).  None of this work has to be done in a transaction script or service – move that responsibility into a place in your domain where it makes sense.
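To make that concrete, here is a minimal sketch of a domain object that is constructed with a Repository and a Factory. Apart from PurchaseOrderRepository, every name in it (Customer, placeOrder, PurchaseOrderFactory, the limit of three orders, the in-memory implementation) is an assumption made up for illustration, not code from a real project.

```java
import java.util.ArrayList;
import java.util.List;

// Domain-layer abstractions: the domain owns these interfaces.
interface PurchaseOrderRepository {
    void add(PurchaseOrder order);
    List<PurchaseOrder> findByCustomer(String customerId);
}

interface PurchaseOrderFactory {
    PurchaseOrder create(String customerId, String description);
}

final class PurchaseOrder {
    private final String customerId;
    private final String description;

    PurchaseOrder(String customerId, String description) {
        this.customerId = customerId;
        this.description = description;
    }

    String customerId() { return customerId; }
    String description() { return description; }
}

// A domain entity constructed with references to the domain abstractions.
final class Customer {
    private static final int MAX_ORDERS = 3; // hypothetical business rule

    private final String id;
    private final PurchaseOrderFactory orderFactory;
    private final PurchaseOrderRepository orderRepository;

    Customer(String id, PurchaseOrderFactory orderFactory,
             PurchaseOrderRepository orderRepository) {
        this.id = id;
        this.orderFactory = orderFactory;
        this.orderRepository = orderRepository;
    }

    // The rule, the creation and the persistence all happen inside the domain;
    // the calling script only says "place an order".
    PurchaseOrder placeOrder(String description) {
        if (orderRepository.findByCustomer(id).size() >= MAX_ORDERS) {
            throw new IllegalStateException("order limit reached for customer " + id);
        }
        PurchaseOrder order = orderFactory.create(id, description);
        orderRepository.add(order);
        return order;
    }
}

// Stand-in for an infrastructure implementation (JDBC, Hibernate, ...).
final class InMemoryPurchaseOrderRepository implements PurchaseOrderRepository {
    private final List<PurchaseOrder> store = new ArrayList<>();

    @Override
    public void add(PurchaseOrder order) { store.add(order); }

    @Override
    public List<PurchaseOrder> findByCustomer(String customerId) {
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : store) {
            if (order.customerId().equals(customerId)) {
                result.add(order);
            }
        }
        return result;
    }
}
```

The client code shrinks to wiring and one call, something like new Customer("c1", PurchaseOrder::new, repository).placeOrder("widgets"). In a real application the wiring would more likely be done by whatever loads the Customer than by the script itself.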

Repositories can be used to perform lazy instantiation of relationships in a persistent graph of domain objects.  E.g. retrieve a PurchaseOrder from the PurchaseOrderRepository; when the PurchaseOrder requires its line items, it asks the LineItemRepository for matching line items.  A consequence is that a complex model will end up with a bunch of Repositories.  It’s simpler (less code) if you can use a tool like Hibernate to manage persistence of the entire graph, and configure it to automatically instantiate collection relationships lazily.  Nothing about Repository prevents you from doing that – you would just eliminate the use of Repository at that point and use the collection directly.  Hibernate is not a solution that fits all problems, however, and we commonly find places where explicit use of a Repository is cleaner.
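The PurchaseOrder example might look roughly like the sketch below. The method names (lineItems, findByOrderId, totalQuantity) and the counting in-memory repository are assumptions chosen to keep the example self-contained; a real implementation would sit on JDBC or similar.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LineItem {
    private final String sku;
    private final int quantity;

    LineItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }

    String sku() { return sku; }
    int quantity() { return quantity; }
}

// Domain-layer interface.
interface LineItemRepository {
    List<LineItem> findByOrderId(long orderId);
}

final class PurchaseOrder {
    private final long id;
    private final LineItemRepository lineItemRepository;
    private List<LineItem> lineItems; // stays null until first requested

    PurchaseOrder(long id, LineItemRepository lineItemRepository) {
        this.id = id;
        this.lineItemRepository = lineItemRepository;
    }

    // Lazy instantiation: the repository is only consulted on first use.
    List<LineItem> lineItems() {
        if (lineItems == null) {
            lineItems = lineItemRepository.findByOrderId(id);
        }
        return Collections.unmodifiableList(lineItems);
    }

    int totalQuantity() {
        int total = 0;
        for (LineItem item : lineItems()) {
            total += item.quantity();
        }
        return total;
    }
}

// In-memory stand-in that counts lookups so the laziness is observable.
final class InMemoryLineItemRepository implements LineItemRepository {
    private final Map<Long, List<LineItem>> itemsByOrder = new HashMap<>();
    int lookups = 0;

    void put(long orderId, LineItem item) {
        itemsByOrder.computeIfAbsent(orderId, key -> new ArrayList<>()).add(item);
    }

    @Override
    public List<LineItem> findByOrderId(long orderId) {
        lookups++;
        return new ArrayList<>(itemsByOrder.getOrDefault(orderId, Collections.emptyList()));
    }
}
```

Constructing a PurchaseOrder costs no lookup at all; the first call to lineItems() or totalQuantity() triggers exactly one, and later calls reuse the cached list. This is the hand-written version of what Hibernate’s lazy collections would otherwise do for you.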

I also deal with a lot of ‘legacy’ code that needs to be renovated into submission – often a safe path with JDBC code is to refactor towards the use of Repositories, then move the use of those Repositories into the domain to make it more understandable.  Often the JDBC code inside the Repository implementation does not have to change much.  When it does later on, you have the opportunity to make a shift towards a technology like Hibernate.

Transparent persistence and the Hibernate Session are really important tools for implementing a rich domain model – if we make our client code (transaction script, service) as thin as possible, it is no longer responsible for remembering which objects have been updated and saving them back to the database.  Instead we use the Hibernate Session as a Unit of Work – our client code tells the domain to go and do something and the domain can go ahead and perform calculations, validate data, and update fields as need be.  When the work is complete we tell the Session to commit and it works out what needs to be updated in the database.  Brilliant – now if we add more complexity to the domain we don’t have to add matching complexity to our client code, and certainly not in multiple places.
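The Unit of Work idea can be illustrated without Hibernate at all. The sketch below is a deliberately tiny, hand-rolled stand-in, not Hibernate’s API: UnitOfWork, track, commit, Account and the writes list are all hypothetical names, and the “database” is just a list of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Account {
    private final String id;
    private long balanceCents;

    Account(String id, long balanceCents) {
        this.id = id;
        this.balanceCents = balanceCents;
    }

    String id() { return id; }
    long balanceCents() { return balanceCents; }

    // Domain behaviour: the object validates and updates its own state.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }
}

// Toy unit of work: snapshots state when an object is tracked and
// works out at commit time which objects actually changed.
final class UnitOfWork {
    private final Map<Account, Long> snapshots = new LinkedHashMap<>();
    final List<String> writes = new ArrayList<>(); // stands in for SQL updates

    Account track(Account account) {
        snapshots.put(account, account.balanceCents());
        return account;
    }

    void commit() {
        for (Map.Entry<Account, Long> entry : snapshots.entrySet()) {
            Account account = entry.getKey();
            if (account.balanceCents() != entry.getValue()) {
                writes.add("UPDATE account " + account.id()
                        + " balance=" + account.balanceCents());
                entry.setValue(account.balanceCents()); // new baseline
            }
        }
    }
}
```

The client code never says which accounts to save. It tracks what it loads, tells the domain to do its work, and calls commit(); only the accounts whose state changed produce a write.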

Christian is very concerned with adding additional layers into our code – a Repository wrapping a DAO.  I personally do away with the DAO idea altogether – the Repository implementation does the same job.  The important thing is that the Repository interface (e.g. PurchaseOrderRepository) lives in the domain, and the implementation (e.g. JdbcPurchaseOrderRepository or HibernatePurchaseOrderRepository) lives outside the domain in infrastructure.

Conclusion (otherwise I’ll go on all day):  let go of the idea that domain classes must not interact with a database (or a message queue or a file system) – free them to carry out their responsibilities, just make sure they do so through an appropriate domain abstraction – Repository and Factory are good examples.

Oh – and stop using Struts (1.x anyway).

8 thoughts on “Factory and Repository in the Domain”

  1. Glad to see someone denouncing the dominant design of putting all the business logic in the transaction scripts – get it back on the domain objects where it belongs.

    You say ‘The transaction script (e.g. struts action in a web application if you’re so inclined) is simply responsible for finding an appropriate entry-point into the domain, then telling that domain class to do some work.’

    Are you aware of Naked Objects – we’ve taken this to the extreme, whereby you don’t have any (developer-written) controllers at all. The framework provides you with direct access to the domain objects and their public methods. I think you’d love it. And to get you to the domain objects, we simply provide the user with direct access to (a subset of methods on) the Repositories, which we use in exactly the same way that you do.

    I’d be interested to hear your reactions to this idea.

    Richard

  2. Yes, Yes, Yes. We need this said more often.

    It’s good to see the tide turning against the flawed idea that you can have a domain model that’s both behaviorally rich and yet never initiates interaction with the world around it.

    Other than domain -> repository access, the other archetypal case of useful domain -> service access is services that listen to domain events. E.g. you might want to send an email to Prod Support when a BankAccount becomes illegally overdrawn.

    In my Spring/Hibernate-flavored enterprise world, support for injecting into domain objects is still a practical sticking point. My colleagues oppose the AOP-based weaving model provided by Spring 2.0 as too much “techno-magic”.

  3. Who says that objects of the domain layer aren’t allowed to hold references to DAOs? In every layered architecture I’ve seen, DAOs were in the infrastructure layer. Domain objects are in the domain layer. The infrastructure layer is below the domain layer. In Eric Evans’ DDD there is even an example where an association between two model objects is expressed by a getCollection() method that executes a query directly.

  4. A year ago, on my first attempt to do DDD (the idea sounded good and strangely simple), I made the mistake of creating too many packages and was afraid of giving access to Repositories from the Domain Model.
    I think there is indeed a lot of misunderstanding, in that some people understand the Domain Model as being “the” Domain.
    In simplifying DDD by doing so we can only end up with a more anemic model than one might expect.
    My guess is that asynchronous operations should be dealt with in a different manner, probably by using Domain messaging, which is simple in itself to implement.

    Daniel Fernandes

  5. As far as I understand, you would define one repository per aggregate root, and defining the aggregate boundaries in a model will cause a dramatic reduction in the number of repositories rather than an over-complication. I don’t see how datamapper and repository are mutually exclusive choices. A repository is supposed to behave like an in-memory collection of objects, those objects being aggregates. The repository need only expose queries for the aggregates, and we navigate down the graph from the root. This view promotes the enriching of the model because it forces you to use the relationships, or a query object, to retrieve children of interest from the aggregate root. A datamapper typically deals with a class or family of classes. In the case of a value object that is perhaps used in more than one aggregate, it would have its own datamapper. It can… if you enforce the unidirectional relationship (which Evans promotes) between the value object and its parent, because a bi-directional one may not make sense. So you could use the same datamapper in more than one repository. Here I see the repository as the lasting concept and the datamapper as a transient one. Even though I use datamapper, my UnitOfWork only knows about an IPersistence interface when we are ready to commit object changes.

  6. Can you give me an example? I think it would be easier to understand with some code.
    I am confused about using the repository and factory patterns together. If I create an object using a factory, must I then add that object to the repository? Should the repository be a listener for the factory?
