I’m a big fan of the book Domain Driven Design, and for some time I’ve been pushing its principles and patterns in my workplace. I’ve never managed to get ‘into’ the more theoretical parts of the book, and I chuckle each time one of my colleagues refers to the “contours of the domain”. Even so, it’s up there with the books that have caused a substantial shift in my thinking.
When I’m evangelising the use of Entities, Aggregates, Value Objects, and Repositories, one question that comes up regularly is “what’s the difference between a Repository and a plain old DAO?”. It’s a very good question with the right level of skepticism.
I really enjoyed reading Christian Bauer’s post “Repository Pattern vs. Transparent Persistence” – particularly the title which implies that there is a choice to be made between two approaches. In actual fact transparent persistence as implemented in Hibernate makes DDD and the use of Repositories extremely powerful. I’ll backtrack a little…
A lot of the web application code I come across these days, written to the ‘best practice’ of just a few years ago, follows a rough outline like this. DAOs (or Data Mappers) provide find/save/update/delete methods for each object in your domain. The DAO is a basic abstraction around database persistence, often JDBC.
Business logic is usually either written directly within Transaction Scripts (recently I’ve become used to inheriting hundreds of lines of code in Struts actions… *sigh*) or within a basic Service Layer. The script or service is the only thing that can hold references to the DAOs, so of course you HAVE to put the business logic there – in most web apps there is a lot of creation and persistence of objects. No problem, we’ll just divide and conquer – from our scripts we’ll extract services, then we’ll divide our services into more services and delegate between them.
Because the application does a lot of operations requiring persistence, there’s a natural force preventing any business logic being moved anywhere but scripts or service layers. So what are our domain objects? Nothing but structures carrying data, with maybe some rudimentary logic that only affects local state. The client code asks the domain to provide it with data, and then the client code makes decisions based on that data.
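To make that shape concrete, here is a minimal sketch of the style I mean. Every name in it (PurchaseOrderData, PurchaseOrderDao, ApprovalService, the approval rule itself) is made up for illustration, and the in-memory DAO is only a stand-in for the usual JDBC one so that the example runs on its own.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Anemic domain object: a structure carrying data, no behaviour.
class PurchaseOrderData {
    long id;
    List<BigDecimal> linePrices = new ArrayList<>();
    BigDecimal total = BigDecimal.ZERO;
    String status = "NEW";
}

// Hypothetical DAO: plain persistence operations for one type.
interface PurchaseOrderDao {
    PurchaseOrderData find(long id);
    void update(PurchaseOrderData order);
}

// In-memory stand-in for a JDBC-backed DAO (an assumption, to keep this runnable).
class InMemoryPurchaseOrderDao implements PurchaseOrderDao {
    private final Map<Long, PurchaseOrderData> rows = new HashMap<>();
    int updateCount = 0;

    void insert(PurchaseOrderData order) {
        rows.put(order.id, order);
    }

    @Override
    public PurchaseOrderData find(long id) {
        return rows.get(id);
    }

    @Override
    public void update(PurchaseOrderData order) {
        rows.put(order.id, order);
        updateCount++;
    }
}

// The service holds the DAO, so every decision ends up living here.
class ApprovalService {
    private final PurchaseOrderDao dao;

    ApprovalService(PurchaseOrderDao dao) {
        this.dao = dao;
    }

    void approve(long orderId, BigDecimal limit) {
        PurchaseOrderData order = dao.find(orderId);
        // The service asks the "domain" for data and makes the decision itself.
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal price : order.linePrices) {
            total = total.add(price);
        }
        order.total = total;
        order.status = total.compareTo(limit) <= 0 ? "APPROVED" : "REJECTED";
        // The service also has to remember to save what it changed.
        dao.update(order);
    }
}
```

Notice that PurchaseOrderData could not approve itself even if we wanted it to: it has no way to reach persistence, so the rule stays in the service.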
The transaction script and service layer approach is simple to begin with, but as the application becomes more complex it leads to a bunch of problems, especially as developers try to avoid duplication (cut and paste is bad, kiddies). Ever picked up a hierarchy of Struts actions seven layers deep with multiple Template Methods to allow overrides for different sub-classes? Bloody nightmare.
What I’ve achieved by applying the DDD patterns is the elimination of those fat transaction scripts and (most) services. The transaction script that remains (e.g. a Struts action in a web application, if you’re so inclined) is simply responsible for finding an appropriate entry-point into the domain, then telling that domain class to do some work.
The consequence is that business logic can be expressed more deeply in your domain, you are free to find the right abstractions that will allow you to make your code understandable, and to create small classes with single responsibilities. This also means that the creation and persistence of objects will be deep in the bowels of the domain, and even better – without the calling client code or script HAVING to be aware of the object creation. How can this be? Domain classes aren’t allowed to hold references to DAOs – that would break our traditional view of layering.
But… when I apply DDD, Repository and Factory interfaces are part of the domain. This is a fundamental change: a domain object deep in an aggregate can be constructed with a reference to a Repository in order to look up persistent data, and it can also use a Factory to construct objects, hiding the detail of construction (and dependency injection). None of this work has to be done in a transaction script or service – move that responsibility into a place in your domain where it makes sense.
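A rough sketch of what that can look like follows. PurchaseOrderRepository is the interface I keep referring to; the Buyer class, the PurchaseOrderFactory interface, the method names, and the open-order limit rule are all invented for the example, and the in-memory repository is only there so the sketch runs without a database.

```java
import java.util.ArrayList;
import java.util.List;

class PurchaseOrder {
    private final String buyer;
    private final String item;

    PurchaseOrder(String buyer, String item) {
        this.buyer = buyer;
        this.item = item;
    }

    String buyer() { return buyer; }
    String item() { return item; }
}

// Domain interfaces: expressed in domain terms, no persistence technology visible.
interface PurchaseOrderRepository {
    void add(PurchaseOrder order);
    List<PurchaseOrder> findByBuyer(String buyer);
}

interface PurchaseOrderFactory {
    PurchaseOrder create(String buyer, String item);
}

// Hypothetical domain object that owns the rule AND the creation/persistence.
class Buyer {
    private final String name;
    private final int openOrderLimit;
    private final PurchaseOrderRepository orders;
    private final PurchaseOrderFactory factory;

    Buyer(String name, int openOrderLimit,
          PurchaseOrderRepository orders, PurchaseOrderFactory factory) {
        this.name = name;
        this.openOrderLimit = openOrderLimit;
        this.orders = orders;
        this.factory = factory;
    }

    PurchaseOrder order(String item) {
        // Look up persistent data through the domain abstraction.
        if (orders.findByBuyer(name).size() >= openOrderLimit) {
            throw new IllegalStateException("order limit reached for " + name);
        }
        // Construction detail is hidden behind the Factory.
        PurchaseOrder order = factory.create(name, item);
        orders.add(order);
        return order;
    }
}

// Infrastructure-side stand-in; a JDBC or Hibernate version would replace it.
class InMemoryPurchaseOrderRepository implements PurchaseOrderRepository {
    private final List<PurchaseOrder> store = new ArrayList<>();

    @Override
    public void add(PurchaseOrder order) {
        store.add(order);
    }

    @Override
    public List<PurchaseOrder> findByBuyer(String buyer) {
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : store) {
            if (order.buyer().equals(buyer)) {
                result.add(order);
            }
        }
        return result;
    }
}
```

With this in place the calling script shrinks to something like `buyer.order("stapler")`. It never sees the factory or the repository, and it does not need to know that a PurchaseOrder was created and stored along the way.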
Repositories can be used to perform lazy instantiation of relationships in a persistent graph of domain objects. E.g. retrieve a PurchaseOrder from the PurchaseOrderRepository; when the PurchaseOrder requires its line items, it asks the LineItemRepository for matching line items. A consequence is that a complex model ends up with a bunch of Repositories. It’s simpler (less code) if you can use a tool like Hibernate to manage persistence of the entire graph, and configure it to automatically instantiate collection relationships lazily. Nothing about Repository prevents you from doing that – you would just eliminate the use of Repository at that point and use the collection directly. Hibernate is no one-size-fits-all solution however, and we commonly find places where explicit use of a Repository is cleaner.
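The hand-rolled version of that lazy lookup might look roughly like this. PurchaseOrder and LineItemRepository are the names from the example above; the fields, the findByOrderId and totalQuantity methods, and the in-memory repository with its lookup counter are assumptions made for the sake of a runnable sketch.

```java
import java.util.ArrayList;
import java.util.List;

class LineItem {
    final long orderId;
    final String sku;
    final int quantity;

    LineItem(long orderId, String sku, int quantity) {
        this.orderId = orderId;
        this.sku = sku;
        this.quantity = quantity;
    }
}

// Domain interface; the implementation belongs in infrastructure.
interface LineItemRepository {
    List<LineItem> findByOrderId(long orderId);
}

class PurchaseOrder {
    private final long id;
    private final LineItemRepository lineItemRepository;
    private List<LineItem> lineItems; // stays null until first needed

    PurchaseOrder(long id, LineItemRepository lineItemRepository) {
        this.id = id;
        this.lineItemRepository = lineItemRepository;
    }

    List<LineItem> lineItems() {
        if (lineItems == null) {
            // First access: ask the Repository, then keep the result.
            lineItems = lineItemRepository.findByOrderId(id);
        }
        return lineItems;
    }

    int totalQuantity() {
        int total = 0;
        for (LineItem item : lineItems()) {
            total += item.quantity;
        }
        return total;
    }
}

// In-memory stand-in that counts lookups so the laziness is observable.
class InMemoryLineItemRepository implements LineItemRepository {
    private final List<LineItem> rows = new ArrayList<>();
    int lookups = 0;

    void insert(LineItem item) {
        rows.add(item);
    }

    @Override
    public List<LineItem> findByOrderId(long orderId) {
        lookups++;
        List<LineItem> matches = new ArrayList<>();
        for (LineItem item : rows) {
            if (item.orderId == orderId) {
                matches.add(item);
            }
        }
        return matches;
    }
}
```

Constructing the PurchaseOrder costs no lookup at all; the first call to totalQuantity() triggers exactly one, and later calls reuse the loaded list. This is the bookkeeping a lazily loaded Hibernate collection would do for you.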
I also deal with a lot of ‘legacy’ code that needs to be renovated into submission – often a safe path with JDBC code is to refactor towards the use of Repositories and to move the use of those Repositories into the domain to make it more understandable. Often the JDBC code inside the Repository implementation does not have to change much. When it does have to change later on, you have the opportunity to shift towards a technology like Hibernate.
Transparent persistence and the Hibernate Session are really important tools for implementing a rich domain model – if we make our client code (transaction script, service) as thin as possible, it is no longer responsible for remembering which objects have been updated and saving them back to the database. Instead we use the Hibernate Session as a Unit of Work – our client code tells the domain to go and do something and the domain can go ahead and perform calculations, validate data, and update fields as need be. When the work is complete we tell the Session to commit and it works out what needs to be updated in the database. Brilliant – now if we add more complexity to the domain we don’t have to add matching complexity to our client code, let alone in multiple places.
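To show the idea without dragging in a database, here is a toy Unit of Work. It is emphatically not Hibernate's API: ToyUnitOfWork, track, commit, and the Account class are all made up, and the "SQL" is just a string appended to a list. It only imitates the general trick of remembering what an object looked like when it was loaded and comparing at commit time.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain object: validates and updates its own state.
class Account {
    private final String id;
    private long balanceCents;

    Account(String id, long balanceCents) {
        this.id = id;
        this.balanceCents = balanceCents;
    }

    String id() { return id; }
    long balanceCents() { return balanceCents; }

    void withdraw(long cents) {
        if (cents > balanceCents) {
            throw new IllegalArgumentException("insufficient funds");
        }
        balanceCents -= cents;
    }
}

// Toy stand-in for a session: snapshot on load, compare on commit.
class ToyUnitOfWork {
    private final Map<Account, Long> snapshots = new IdentityHashMap<>();
    final List<String> writes = new ArrayList<>(); // pretend database log

    Account track(Account account) {
        snapshots.put(account, account.balanceCents());
        return account;
    }

    void commit() {
        for (Map.Entry<Account, Long> entry : snapshots.entrySet()) {
            Account account = entry.getKey();
            if (account.balanceCents() != entry.getValue()) {
                // Only objects that actually changed produce a write.
                writes.add("UPDATE account " + account.id()
                        + " balance=" + account.balanceCents());
                entry.setValue(account.balanceCents());
            }
        }
    }
}
```

The client code here would be three lines: track the account, call withdraw, call commit. If withdraw later grows to touch five other objects, the client code stays the same, because it never had the job of remembering what changed.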
Christian is very concerned with adding additional layers into our code – a Repository wrapping a DAO. I personally do away with the DAO idea altogether – the Repository implementation does the same job. The important thing is that the Repository interface (e.g. PurchaseOrderRepository) lives in the domain, and the implementation (e.g. JdbcPurchaseOrderRepository or HibernatePurchaseOrderRepository) lives outside the domain in infrastructure.
Conclusion (otherwise I’ll go on all day): let go of the idea that domain classes must not interact with a database (or a message queue or a file system) – free them to do their responsibility, just make sure they do so through an appropriate domain abstraction – Repository and Factory are good examples.
Oh – and stop using Struts (1.x anyway).