Introduction to JPA Sessions and Object Graphs

What You'll Learn

That there are some gotchas with JPA that need careful design
That Spring Boot makes a design choice that can be justified, but which has a number of critics.

In this branch of code, we have taken a very pure design approach. We want to keep the core of our application as independent of the framework and of other concerns as possible. That means we keep the domain objects pure.

In order to persist our objects, we need to call an interface that takes domain objects and then does the work to persist those objects.

Previously, this interface was implemented by a repository that used JDBC and SQL to map the objects into database queries.

With JPA, the SQL is created for us by Spring Data JPA (which in turn uses Hibernate as the implementation of JPA). JPA allows us to work with objects (entities) which are then "persisted" for us. However, these entities are very much tied into the database mapping so a "pure" architecture (aka "Clean" or "Hexagonal" architecture) will adapt our domain objects into entities. This adaptation is done in an adaptor. The adaptor implements the "repository" interface and uses the JPA repository methods.

Let's look at some code...

Starting with tag 044-1.

In CharityRepoAdaptorJPA.java

//...
@Repository
@Slf4j
public class CharityRepoAdaptorJPA implements CharityRepository {

    private CharityRepositoryJPA charityRepositoryJPA;

    public CharityRepoAdaptorJPA(CharityRepositoryJPA repo) {
        charityRepositoryJPA = repo;
    }

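    public List<Charity> findAll() {
        // Fetch the JPA entities, then convert each one into a pure domain object.
        return charityRepositoryJPA.findAll()
                .stream()
                // Debug output: this calls toString() on each entity.
                .peek(c -> System.out.println(c))
                .map(c -> c.toDomain())
                .collect(Collectors.toList());
    }
    //...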

This is the adaptor. It is annotated as a @Repository and it implements the CharityRepository interface.

The findAll() method is then an example of what needs to be done. The JPA repository returns a list of Charity entities and these are then converted into domain objects. We've chosen to implement this by adding a toDomain() method onto the entity class (since it is really just part of our adaptor code).
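The conversion step can be sketched in plain Java without any JPA dependency. This is only an illustration: CharityEntity, its fields and the ToDomainSketch helper are assumed names, and the real entity would carry the JPA mapping annotations.

```java
import java.util.List;
import java.util.stream.Collectors;

// Pure domain object: no persistence annotations, no framework types.
record Charity(String registrationNumber, String name) {}

// Stand-in for the JPA entity (hypothetical fields; the real class is mapped with JPA annotations).
class CharityEntity {
    private final long id;
    private final String registrationNumber;
    private final String name;

    CharityEntity(long id, String registrationNumber, String name) {
        this.id = id;
        this.registrationNumber = registrationNumber;
        this.name = name;
    }

    // Adaptor code that lives on the entity: copies persistent state into a domain object.
    // The database id is deliberately left behind in this sketch.
    Charity toDomain() {
        return new Charity(registrationNumber, name);
    }
}

class ToDomainSketch {
    // Same shape as findAll() in the adaptor: entities in, domain objects out.
    static List<Charity> toDomain(List<CharityEntity> entities) {
        return entities.stream()
                .map(CharityEntity::toDomain)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<CharityEntity> rows = List.of(
                new CharityEntity(1L, "12345", "Nurture Collective"),
                new CharityEntity(2L, "67890", "Food Share"));
        System.out.println(toDomain(rows));
    }
}
```

Nothing outside the adaptor needs to see CharityEntity, which is what keeps the domain independent of the database mapping.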

What about the trustees?

This is where we can have a problem. In our current design, we keep charities and trustees together in the domain, but they are handled separately in the DTO layer. When we create domain objects from entities, we need to consider whether we want the charities to have their trustees linked to them (i.e. do we want all those objects in memory?).

You may be thinking that, of course, we want those objects, but do we? If we bring back a list of charity objects, do we want all of the trustee objects as well? 100 charities would imply circa 300 trustee objects (all of which consume some memory). Now multiply that by 100 users who all have their own copies of the objects. Now consider that, down the line, our charities will have donations made to them. Each charity could have millions of donations made to it over time.
See the problem?

However, if we don't get them straight away, what are we going to do?

Well, these are "magic" JPA objects. Through the magic of proxies, we can still access those other objects when we need them. That's true - as long as the subsequent calls are within the same JPA session.

The way JPA works (at a high level of abstraction) is that it:

Starts a session
Creates and invokes SQL
Maps the returned data into proxy objects (the objects are now 'attached')
Allows calls on the proxies, which may involve more SQL calls
Closes the session (the objects are now 'detached')
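The attached/detached behaviour can be imitated in plain Java. The sketch below is a toy model, not Hibernate: ToySession and LazyList are invented names, the supplier plays the role of the extra SQL call, and a plain IllegalStateException stands in for Hibernate's LazyInitializationException.

```java
import java.util.List;
import java.util.function.Supplier;

// Toy stand-in for a JPA session: it is either open or closed.
class ToySession {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

// Toy stand-in for a lazy proxy: data is loaded on first access, and only while the session is open.
class LazyList<T> {
    private final ToySession session;
    private final Supplier<List<T>> loader; // plays the role of the extra SQL call
    private List<T> loaded;

    LazyList(ToySession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!session.isOpen()) {
                // Hibernate would raise LazyInitializationException here; this sketch uses a plain exception.
                throw new IllegalStateException("could not initialize proxy - no session");
            }
            loaded = loader.get();
        }
        return loaded;
    }
}

class SessionLifecycleSketch {
    public static void main(String[] args) {
        // Accessed while attached: the data is loaded and stays usable after the session closes.
        ToySession first = new ToySession();
        LazyList<String> trustees = new LazyList<>(first, () -> List.of("Alice", "Bob"));
        System.out.println(trustees.get());
        first.close();
        System.out.println(trustees.get());

        // Never accessed while attached: the first access after close fails.
        ToySession second = new ToySession();
        LazyList<String> untouched = new LazyList<>(second, () -> List.of("Carol"));
        second.close();
        try {
            untouched.get();
        } catch (IllegalStateException e) {
            System.out.println("detached: " + e.getMessage());
        }
    }
}
```

The point to take from the toy model is that the failure depends on when the data is first touched, not on where the object was created.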

If we run the tests at tag 044-1, the tests pass, but at 044-2 one test fails. At 044-3, the tests are passing again.

The difference can be explained by when the session is started and ended. In the two working tags, the service class is annotated with @Transactional. When this is removed, one test fails.

In test > CharitySearches.java

    @Test
    public void shouldGet11Charities() throws Exception {
        List<CharityDTO> charityDTOList = charityService.findAll();
        assertEquals(11, charityDTOList.size());
    }

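Why does this test fail and the others pass? By default, Spring Boot enables a pattern called "Open Session in View" (controlled by the spring.jpa.open-in-view property). This means that the session is opened when a web request enters the MVC layer and stays open until the response has been rendered. All of the other tests go through the MVC layer, but the test above calls the service directly, so no session has been opened for it.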

This test only passes when @Transactional is on the service layer which, essentially, starts the session when the calls reach the service layer.

When the session isn't managed like this, it is closed as soon as the repository method returns, so the later calls to get the trustee details fail with a LazyInitializationException.

In tag 044-5, we use another JPA concept, namely an 'Entity Graph'. An entity graph allows the developer to determine how much data to return in one call. The graph defines the network of objects that will be available to callers once the repository call completes.

We'll illustrate this by coding two methods that, on the surface, return the same thing but which work differently.

In CharityRepositoryJPA.java

    @EntityGraph(attributePaths = {"trustees"})
    @Query("select c from Charity c")
    List<Charity> findAllWithTrustees();

    @Query("select c from Charity c")
    List<Charity> findAllWithoutTrustees();

We've implemented two methods. Both return all charities, but one is annotated with @EntityGraph and tells JPA that the objects on the path trustees should be included in the entity graph.

Note: we have to use @Query here because Spring Data JPA would otherwise try to derive the query from the method name, and it isn't clever enough to work out that some parts of the name are merely descriptive (despite some documentation implying that it is).

The JPQL query is very simple - it's just a select.

We've switched the service so that it calls the method with the entity graph. The test will pass.

In tag 044-6, we switch to the method without the entity graph, and the test will fail.

We could try to handle the problem manually. In tag 044-7, we've changed the toDomain() method so that it can be configured to get the trustees or not. However, the test still fails.

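This is because we're calling the toString() method on the charity object (for example, through the println in the adaptor's findAll() shown earlier). The Lombok-generated implementation includes all fields, so it tries to print the trustees (using their toString() method), and that touches the lazy collection after the session has closed.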
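The effect of an all-fields toString() can be reproduced without JPA or Lombok. In this sketch CharityRow is an invented class, the supplier stands in for a lazy proxy, and toStringAllFields() only approximates what a generated toString() that includes every field would do.

```java
import java.util.List;
import java.util.function.Supplier;

// Toy entity: trustees are reached through a supplier that stands in for a lazy proxy.
class CharityRow {
    private final String name;
    private final Supplier<List<String>> trustees;

    CharityRow(String name, Supplier<List<String>> trustees) {
        this.name = name;
        this.trustees = trustees;
    }

    List<String> getTrustees() {
        return trustees.get();
    }

    // Approximates a generated toString() that includes every field: it touches the lazy association.
    String toStringAllFields() {
        return "CharityRow(name=" + name + ", trustees=" + getTrustees() + ")";
    }

    // Hand-written toString() that leaves the association out, so printing never triggers a load.
    @Override
    public String toString() {
        return "CharityRow(name=" + name + ")";
    }
}

class ToStringSketch {
    public static void main(String[] args) {
        // The supplier fails the way a detached proxy would.
        CharityRow row = new CharityRow("Nurture Collective", () -> {
            throw new IllegalStateException("no session");
        });

        System.out.println(row); // safe: the trustees are never touched

        try {
            System.out.println(row.toStringAllFields());
        } catch (IllegalStateException e) {
            System.out.println("printing all fields failed: " + e.getMessage());
        }
    }
}
```

With Lombok, a similar result is usually obtained by marking the association field with @ToString.Exclude, so that logging an entity cannot trigger a load.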

This shows how we really need to think about the design of the data layer.

Many modern systems are actually interconnected systems of systems. Businesses will invest in COTS (Commercial Off-The-Shelf) packages for business domains that are not differentiating. It is common for these and other systems to expose APIs that allow developers to integrate with them. It is rare for them to allow direct access to their databases.

APIs provide data as JSON or XML

Your application may have to deal with data that it gets as JSON or XML. A common approach in APIs is to provide a set of data and then provide the API calls required to get related data (see HATEOAS).

This puts us in a similar position, but there may be differences in who controls the interfaces. In our application, we have control over all of the code. An external API, by contrast, may be designed for access by many clients, so the adaptation required may vary according to what each client system is trying to do.

Software that fits in my head

A phrase (often used by Dan North) is to write software so that it fits in your head. When integrating your application with others, it can be advantageous to write the code your way. Treat that as one problem. Then, approach the problem of mapping a foreign data source into your model as a separate problem.

Doing this may lead to more code, but it will also lead to clean boundaries. Your brain only needs to work within the constraints of those boundaries, which makes the code easier to reason about.
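As a rough sketch of treating the two problems separately, the example below keeps a small domain model on one side and confines knowledge of a foreign payload to one adaptor. Every name here is hypothetical (Trustee, TrusteeDirectory, ForeignApiTrusteeAdaptor, and the "forename"/"surname" keys), and the list of maps merely stands in for a JSON response that has already been parsed.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Our model, written our way: it knows nothing about the foreign system.
record Trustee(String fullName) {}

// The interface the rest of the application depends on (hypothetical name).
interface TrusteeDirectory {
    List<Trustee> findTrustees();
}

// The adaptor is the only place that knows the foreign field names.
class ForeignApiTrusteeAdaptor implements TrusteeDirectory {
    // Stands in for a JSON response that has already been parsed into maps.
    private final List<Map<String, String>> payload;

    ForeignApiTrusteeAdaptor(List<Map<String, String>> payload) {
        this.payload = payload;
    }

    @Override
    public List<Trustee> findTrustees() {
        return payload.stream()
                .map(ForeignApiTrusteeAdaptor::toDomain)
                .collect(Collectors.toList());
    }

    // Translates the foreign shape (hypothetical keys) into our domain object.
    private static Trustee toDomain(Map<String, String> row) {
        return new Trustee(row.get("forename") + " " + row.get("surname"));
    }
}

class BoundarySketch {
    public static void main(String[] args) {
        TrusteeDirectory directory = new ForeignApiTrusteeAdaptor(List.of(
                Map.of("forename", "Ada", "surname", "Lovelace")));
        System.out.println(directory.findTrustees());
    }
}
```

If the foreign system changes its field names, only toDomain() has to change; code written against TrusteeDirectory is unaffected.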

In this tutorial, we've exposed some of the problems that can occur when using JPA. We've also discussed how these problems are not limited to JPA, but are common in application development, especially when that development involves integration with other systems. In essence, we've almost treated our database as another system that provides a JPA interface and into which we've integrated our own application.

In the next tutorial, we'll make some wide-ranging changes to both our design and code which will simplify the coding, but which will reduce the independence of our data access layer.