Schema-based multi-tenancy with Spring Data, Hibernate and Flyway
Multi-tenancy is an architectural pattern in which multiple tenants share a single instance of the software.
Each tenant gets a dedicated share of the instance, while the data belonging to each tenant is kept isolated from the others.
In this tutorial, we are going to look at how to implement schema-based multi-tenancy in a Spring Boot application.
1. Project
We will start by creating a simple RESTful web service, protected by Spring Security, that will use Spring Data JPA to persist data in the embedded H2 database.
1.1. Maven Dependencies
First, we add the necessary dependencies to pom.xml:
1.2. Entities
Next, we create entity classes that represent our domain objects:
1.3. Repositories
Then we define CRUD repositories for the entities:
1.4. Services
Now we can implement some logic in the service layer:
1.5. Controllers
Finally, we can expose our web service through a REST API:
1.6. Security
The web service is ready and we can run it, but all of our API endpoints are publicly available.
To protect some of them, we need to configure HTTP security in a configuration class that extends WebSecurityConfigurerAdapter.
We will make the user registration endpoint public and also open access to the H2 console, which can be enabled by setting the spring.h2.console.enabled property to true.
All other endpoints will require authentication. To avoid complicating the example with token-based authentication, we simply enable HTTP basic authentication for them:
Note that we use the UserService to provide the authentication manager with user details.
For this purpose, we made it implement UserDetailsService, and we made our User class implement UserDetails.
2. Multi-tenancy
Typically, a tenant would be a group of users, such as an organization, but in our simplified implementation, each user will be a separate tenant.
You have probably already noticed that, although we can create notes on behalf of different users, all of them are saved in one table, and all users have access to all notes.
The obvious solution would be to record the owner of the notes in the database and retrieve the notes for each user using this data.
This is the most common multi-tenancy option, but since we want more isolation between tenant data, this option does not suit us.
We would get the greatest isolation when each tenant uses a separate database, but this complicates and increases the cost of our infrastructure.
There is, however, a third option that is cheaper and still provides partial isolation: schema-based multi-tenancy.
2.1. Hibernate configuration
We are going to use the multi-tenancy features of Hibernate, which is the default JPA provider in Spring Boot's Spring Data JPA starter.
All we need to do is provide implementations of the CurrentTenantIdentifierResolver and MultiTenantConnectionProvider interfaces
and add them to the JPA properties along with the multi-tenancy strategy.
CurrentTenantIdentifierResolver - resolves the tenant identifier to use.
In our implementation, it gets the authentication data from the security context and uses the username as the identifier of the tenant;
if authentication is anonymous or missing, it falls back to the default tenant identifier.
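The resolution rule itself is small enough to sketch without any framework classes. The snippet below is only an illustration under stated assumptions: the default tenant is assumed to be H2's PUBLIC schema, the anonymous principal name is assumed to be Spring Security's default "anonymousUser", and the class name TenantIdentifierSketch is hypothetical. In the real application this logic would sit inside CurrentTenantIdentifierResolver.resolveCurrentTenantIdentifier() and read the authentication from the security context.

```java
import java.security.Principal;

/** Framework-free sketch of the tenant resolution rule (names are illustrative). */
final class TenantIdentifierSketch {

    // Assumption: the shared schema for unauthenticated requests is H2's PUBLIC schema.
    static final String DEFAULT_TENANT = "PUBLIC";

    // Assumption: the anonymous principal keeps Spring Security's default name.
    static final String ANONYMOUS_USER = "anonymousUser";

    private TenantIdentifierSketch() {
    }

    /** Returns the username as the tenant identifier, or the default tenant. */
    static String resolve(Principal authentication) {
        if (authentication == null) {
            return DEFAULT_TENANT; // nothing in the security context
        }
        String username = authentication.getName();
        if (username == null || username.isEmpty() || ANONYMOUS_USER.equals(username)) {
            return DEFAULT_TENANT; // anonymous request
        }
        return username; // each user is a separate tenant
    }
}
```

Because Spring Security's Authentication type extends java.security.Principal, an authentication object taken from the security context could be passed to resolve() directly.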
MultiTenantConnectionProvider - provides connections based on the tenant identifier.
Our implementation reuses a single JDBC connection pool to serve all tenants, but before handing out a Connection,
it issues a SET SCHEMA command so that the connection points at the schema named after the tenant identifier.
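The schema switch can be sketched with plain JDBC. This is a minimal illustration, not the tutorial's actual provider: the class and method names are hypothetical, and the identifier check is an added assumption, included because the tenant identifier comes from a username and is concatenated into SQL. In a Hibernate provider, a helper like this would typically be called from getConnection(tenantIdentifier), with the schema reset before the connection goes back to the pool.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.regex.Pattern;

/** Sketch: point a pooled connection at a tenant's schema (names are illustrative). */
final class SchemaSwitchSketch {

    // Only plain identifiers are accepted, so the tenant name can be
    // concatenated into the statement without allowing SQL injection.
    private static final Pattern SAFE_IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private SchemaSwitchSketch() {
    }

    /** Builds the SET SCHEMA statement for a tenant, rejecting unsafe names. */
    static String setSchemaSql(String tenantIdentifier) {
        if (tenantIdentifier == null || !SAFE_IDENTIFIER.matcher(tenantIdentifier).matches()) {
            throw new IllegalArgumentException("Invalid tenant identifier: " + tenantIdentifier);
        }
        return "SET SCHEMA " + tenantIdentifier;
    }

    /** Switches an already borrowed connection to the tenant's schema and returns it. */
    static Connection useTenantSchema(Connection connection, String tenantIdentifier) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(setSchemaSql(tenantIdentifier));
        }
        return connection;
    }
}
```

Since the connection is shared through the pool, the schema has to be set on every checkout; otherwise a connection could still point at the schema of the previous tenant.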
All that remains is to set the necessary properties in the configuration:
2.2. Flyway configuration
The tables in the shared schema for unauthenticated users and in the tenant schemas will be different,
so we need a way to perform different migrations in these schemas, and we will use Flyway for this.
We configure it to start migrations for the default schema from the db/migration/default directory,
and then iterate over all tenants and start migrations for each of them from the db/migration/tenants directory.
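The order of the migration runs can be sketched independently of Flyway. In the snippet below, SchemaMigrator is a hypothetical stand-in for one Flyway run against a single schema; with Flyway's fluent API such a run would usually look something like Flyway.configure().dataSource(...).schemas(schema).locations(location).load().migrate(), but the exact wiring in this project is an assumption.

```java
import java.util.List;

/** Sketch of the migration order: default schema first, then each tenant. */
final class MigrationPlanSketch {

    /** Hypothetical stand-in for a single Flyway run against one schema. */
    interface SchemaMigrator {
        void migrate(String schema, String location);
    }

    // Migration directories named in the text above.
    static final String DEFAULT_LOCATION = "db/migration/default";
    static final String TENANT_LOCATION = "db/migration/tenants";

    private MigrationPlanSketch() {
    }

    /** Migrates the default schema, then every tenant schema in turn. */
    static void migrateAll(String defaultSchema, List<String> tenants, SchemaMigrator migrator) {
        migrator.migrate(defaultSchema, DEFAULT_LOCATION);
        for (String tenant : tenants) {
            migrator.migrate(tenant, TENANT_LOCATION);
        }
    }
}
```

Running the default schema first matters here because the list of tenants is derived from the users stored in that shared schema.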
Next, we will create the necessary migrations in the appropriate directories:
We also create a service that programmatically creates a schema for each new tenant and runs all the relevant migrations:
Finally, we update our user service and make it initiate the creation of a schema for new users:
3. Test
Now we can run our application and test it. We start by creating multiple users:
We create a note using John’s credentials:
Then we also create a note using Jane’s credentials:
Notice that the generated note ID is the same in both cases, because the notes are stored in separate tables in tenant-specific schemas.
Finally, we will request all notes using the credentials of one of the users and make sure that we only get notes belonging to this user:
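The requests in this section can be issued with any HTTP client. As a rough sketch, the snippet below builds such a request with the JDK's java.net.http API (Java 11 or later) and HTTP basic authentication. The /notes path, the class name, and the credentials used in the usage note are assumptions made for illustration; substitute the values from your own setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch: build an authenticated request that lists a user's notes. */
final class NotesRequestSketch {

    private NotesRequestSketch() {
    }

    /** Encodes credentials as an HTTP basic Authorization header value. */
    static String basicAuth(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds a GET request for the notes endpoint (the /notes path is hypothetical). */
    static HttpRequest listNotes(String baseUrl, String username, String password) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/notes"))
                .header("Authorization", basicAuth(username, password))
                .GET()
                .build();
    }
}
```

A request built with listNotes("http://localhost:8080", "jane", "secret") could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the response body should contain only Jane's notes.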
To double-check that everything works as intended,
we can also open the H2 console at http://localhost:8080/h2-console,
which we made accessible earlier, and inspect the schemas directly.
4. Conclusion
Schema-based multi-tenancy strikes a good balance between performance, tenant isolation, development complexity, and infrastructure cost.
However, when choosing a multi-tenancy strategy, you should always consider the level of security that your tenants will require
and state in the Service-Level Agreement (SLA) which level of data isolation you provide: full or partial.
Besides, sharing resources in a schema-based approach can make SLA compliance difficult,
as one tenant performing resource-intensive tasks can cause latency spikes for everyone else.