Creating a Multitenant Health Platform with Directus

This is an edited script of the talk I gave at the Directus Berlin User Group in July 2023. You can check out the talk here or download the presentation slides at the end of the article.

Today we're going to talk about multitenancy in Directus; in other words, we're going to see how you can make a single Directus instance serve multiple users. We'll start by going over the definition of multitenancy and the reasons why we may need it. Then, we'll look at an implementation example based on our own experiences with multitenancy, some of the challenges we came across, and what we learned. Hopefully, by the end of this article you'll have a clearer idea of how you can set up a basic multitenant configuration in your own projects!

What is multitenancy?

Multitenancy is an architecture where a single instance of your application serves multiple users, which can be individuals or groups. For example, let's say you set up a CMS for a group of schools so they can each run their own blog. With multitenancy, you could have all schools using the same instance to manage their content while ensuring that each school only has access to content belonging to them.

Use cases

But why would you want to do that? Why not just set up individual instances for each school that would like to use your CMS? Well, maybe you want to lower infrastructure costs. In some situations (and I want to stress "some"), it may be cheaper to just have one single instance running, especially if you don't expect heavy traffic. However, scaling up one instance can be more expensive than just having multiple instances, so this benefit is very much context-dependent.

Another possible benefit is that it can lower the entry barrier for your product. For example, if a potential client doesn't want to handle infrastructure, but you also don't have the capacity to handle multiple deployments, a good compromise could be to bear the costs of a single instance that can support multiple tenants.

And finally, it can also make it easier to aggregate data; for example, if you want to understand which features of the CMS are used the most across tenants.

DIRECT - a case study

Let's say these benefits convinced you to try out a multitenancy approach: how would you start? Going over a real-life example can give you a solid foundation, and that's why today we're going to look at one of the products developed by Hybrid Heroes. DIRECT is a platform we're currently developing for Freie Universität Berlin focused on the management of psychological studies. It's essentially a one-stop shop for study coordinators (e.g. other universities, humanitarian organisations) to set up the contents of a study, to distribute that content to participants, to manage these interventions, and to analyse the outcomes. One possible example of a study could be to analyse the psychological impact of the immigration process on an asylum seeker. You can learn more about this project here and here.

Where does multitenancy come into play? The main goal of our clients was to be able to offer the platform to other partners (e.g. other universities or NGOs) without having to ask them to set up their own infrastructure and without increasing their own infrastructure costs. These reasons match up nicely with the benefits listed above, and so we opted for a multitenancy approach.

Multitenancy beyond Directus

Before going any further, I'd like to make a small disclaimer. I'll be focusing on adding multitenancy support to Directus. However, it's likely that your own use case has more moving parts, and more places where you also have to implement support for multitenancy.

DIRECT is composed of multiple services, which all need to support multitenancy as well.

DIRECT itself is composed of several services, all of which needed to be modified in order for multitenancy support to be consistent across the platform. This is to say that it's likely that you will need a more comprehensive solution that also takes into consideration other services outside of Directus. What I aim to get across during this article is a starting point that can hopefully make your multitenancy implementation a little bit less overwhelming.

Getting started!

When you google "directus multitenancy", there are not a lot of results to go around. A couple of GitHub issues and a glossary entry in the Directus documentation make up the top results, so let's start by taking a look at the glossary entry.

Multitenancy is an architecture that allows multiple tenants (e.g., customers) to be managed by the platform. There are two main ways to achieve multitenancy: project scoping [...] and role scoping.

The glossary entry provides a very similar definition of multitenancy to the one we looked at earlier on, but it also adds some Directus-specific details. Namely, the two main ways you can achieve it in Directus: either through project-scoping, or role-scoping. We opted for a role scoping approach because we self-host and it wasn't viable to deploy multiple projects.

In this article, I'll showcase a slightly modified approach and assign a tenant to each user, as opposed to assigning a tenant to each role. We found this approach to be a bit simpler, and Directus itself presents it in one of their latest videos on the topic. Regardless of whether you assign tenants to roles or to users, the main concept to keep in mind is permissions.

Implementing multitenancy

In our approach, tenant-based permissions are the core of the tenancy system. These permissions allow us to filter which items a user has access to based on the tenant assigned to that user and to the item in question. Let's take a look at how the data is structured.

Relations between the Tenants, Users, and other custom Collections.

In this approach, we have a Tenants collection. In a real-life scenario, you may want to restrict access to this collection to super-admins, for example. After we create the Tenants collection, we need a way to assign tenants to users and to the other collections that need multitenancy support. To do this, we create a tenant field in each multitenant collection, which establishes a relation between that collection and Tenants: many-to-one (M2O) from the collection's side, one-to-many (O2M) from the Tenants side. So now we can assign tenants to things, but we still don't have data segregation, meaning that users can still see content belonging to all tenants. To get that, we need to set up the permissions accordingly.
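The structure described above can be sketched with plain TypeScript types. This is only an illustration: apart from the tenant field, the names used here (Product, title, the sample school names) are hypothetical, and in Directus these shapes would be collections with relational fields rather than interfaces.

```typescript
// Hypothetical shapes for illustration; in Directus these are collections.
interface Tenant {
  id: number;
  name: string;
}

// A user with the extra tenant field described above.
interface User {
  id: string;
  tenant: number; // references Tenant.id
}

// Any multitenant collection carries the same tenant field.
interface Product {
  id: number;
  title: string;
  tenant: number; // references Tenant.id
}

const tenants: Tenant[] = [
  { id: 1, name: "School A" },
  { id: 2, name: "School B" },
];

const products: Product[] = [
  { id: 10, title: "Item for A", tenant: 1 },
  { id: 11, title: "Item for B", tenant: 2 },
  { id: 12, title: "Another item for A", tenant: 1 },
];

const currentUser: User = { id: "user-1", tenant: 1 };

// Assigning a tenant does not restrict access by itself. This grouping only
// shows which items each tenant is meant to see once permissions are in place.
function groupByTenant(items: Product[]): Map<number, Product[]> {
  const groups = new Map<number, Product[]>();
  for (const item of items) {
    const bucket = groups.get(item.tenant) ?? [];
    bucket.push(item);
    groups.set(item.tenant, bucket);
  }
  return groups;
}

const byTenant = groupByTenant(products);
const tenantName = tenants.find((t) => t.id === currentUser.tenant)?.name;
const visibleProducts = byTenant.get(currentUser.tenant) ?? [];
```

Here the user and the products point at the same Tenants entries, which is what later lets a permission rule compare the two.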

Relations between the Users, Roles, Permissions, and other custom Collections.

So, how do the permissions work? Users belong to Roles, which in turn can have many Permissions that dictate how Roles can interact with Collections. For example, let's say we have a Customer role and an Admin role; we may want to prevent the Customer role from creating Tenants, and assign full CRUD permissions to the Admin role. You may have noticed that the Permissions collection doesn't have a relation with the Tenants collection. So how do we create tenant-based permissions?

Tying the data structure together.

To put it all together, we need the last piece of the puzzle: permission rules. Rules allow you to make permissions more specific. In this case, we're adding a rule that states that this permission in particular (let's say this is a permission for the Customer role to read from the Products collection) is only granted if the Tenant of the item in the Products collection matches the tenant of the User that is trying to read a Product.
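To make the rule concrete, here is a simplified model of how such a check behaves. The rule object has the same shape as the one used in the automation code later in this article; the evaluator itself (resolveVariable, isAllowed) is a hypothetical sketch written for this example and is not how Directus implements permission filtering internally.

```typescript
// Simplified model of a tenant rule; Directus evaluates real rules server-side.
type TenantRule = { tenant: { id: { _eq: string } } };

interface CurrentUser {
  id: string;
  tenant: number;
}

interface TenantItem {
  id: number;
  tenant: number;
}

// Same shape as the rule stored in the permission's "permissions" field.
const tenantRule: TenantRule = {
  tenant: { id: { _eq: "$CURRENT_USER.tenant" } },
};

// Replace the dynamic variable with the value taken from the requesting user.
function resolveVariable(value: string, user: CurrentUser): string | number {
  return value === "$CURRENT_USER.tenant" ? user.tenant : value;
}

// The permission is granted only when the item's tenant matches the user's.
function isAllowed(rule: TenantRule, item: TenantItem, user: CurrentUser): boolean {
  return item.tenant === resolveVariable(rule.tenant.id._eq, user);
}

const customer: CurrentUser = { id: "user-1", tenant: 1 };
const catalogue: TenantItem[] = [
  { id: 10, tenant: 1 },
  { id: 11, tenant: 2 },
];

// Only the item that belongs to tenant 1 survives the filter.
const readable = catalogue.filter((item) => isAllowed(tenantRule, item, customer));
```

The important property is that the rule never names a specific tenant: the same permission row works for every user because the comparison value is resolved per request.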

And just like that, we have a multi-tenant CMS! There are a lot more topics that we can cover, and I will briefly mention them in the following sections, but this is the gist of implementing multitenancy in Directus. There is one more topic which we think can be helpful in your own implementation, and that is automation.

Automating permissions

As you may be aware, adding permissions manually through the Directus UI can be a bit tedious and error-prone. Forget to add CRUD permissions to a newly-added collection and you'll soon be scratching your head, wondering why you're getting no results when you request items in that collection. Because we have a lot of collections, which are regularly set up in environments like review apps, we integrated automated permission handling into the bootstrapping of our Directus deployment.

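// Read every field definition, keep the ones named "tenant",
// and collect the names of the collections they belong to.
const tenantCollections = (await directus.fields.readAll())
  .filter((item) => item.field === 'tenant')
  .map((item) => item.collection);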

Using the Directus SDK, we first fetch all fields and keep the collections that have a tenant field, that is, the collections which support multi-tenancy. Then we update the directus_permissions collection to include permissions for all the relevant CRUD operations we want to support, ensuring that the permissions field specifies the rule that we looked at above, so that users can only perform the specified action on items that belong to their tenant.

const permissions = [];

// ...


tenantCollections.forEach((collectionName: string) => {
  // Create: there is no existing item to filter on, so no rule is set.
  // The preset stamps the new item with the tenant of the current user.
  permissions.push({
    role: userRole,
    collection: collectionName,
    action: 'create',
    permissions: null,
    validation: null,
    presets: { tenant: { id: "$CURRENT_USER.tenant" } },
    fields: ['*']
  });
  // ...
  // Update: only allowed on items whose tenant matches the user's tenant.
  permissions.push({
    role: userRole,
    collection: collectionName,
    action: 'update',
    permissions: { _and: [{ tenant: { id: { _eq: "$CURRENT_USER.tenant" } } }] },
    validation: null,
    presets: { tenant: { id: "$CURRENT_USER.tenant" } },
    fields: ['*']
  });
});

// ...
// Make sure you're not creating duplicate permissions
// ...

await directus.permissions.createMany(permissions);
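The duplicate check hinted at in the comment above could look something like the following sketch. It assumes that a permission is identified by its (role, collection, action) triple and that the existing permissions have already been fetched beforehand; both the helper names and that identity rule are assumptions made for this example, so adapt them to how your own setup distinguishes permissions.

```typescript
// Minimal shape needed to tell two permissions apart (an assumption here).
interface PermissionKeyFields {
  role: string | null;
  collection: string;
  action: string;
}

// Build a collision-free key from the identifying fields.
function permissionKey(p: PermissionKeyFields): string {
  return JSON.stringify([p.role, p.collection, p.action]);
}

// Drop candidates that already exist, and candidates repeated within the batch.
function withoutDuplicates<T extends PermissionKeyFields>(
  existing: PermissionKeyFields[],
  candidates: T[],
): T[] {
  const seen = new Set(existing.map(permissionKey));
  const result: T[] = [];
  for (const candidate of candidates) {
    const key = permissionKey(candidate);
    if (seen.has(key)) continue;
    seen.add(key);
    result.push(candidate);
  }
  return result;
}

// Example with hypothetical data: one permission already exists.
const alreadyStored: PermissionKeyFields[] = [
  { role: "customer", collection: "products", action: "create" },
];
const wanted = [
  { role: "customer", collection: "products", action: "create", fields: ["*"] },
  { role: "customer", collection: "products", action: "update", fields: ["*"] },
  { role: "customer", collection: "products", action: "update", fields: ["*"] },
];
const toCreate = withoutDuplicates(alreadyStored, wanted);
```

Only the filtered list would then be passed on for creation, which keeps the bootstrap script safe to run repeatedly.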

In addition to our own collections, we can also set up permissions for system collections like directus_files and directus_folders in a similar fashion.

const permissions = [];

// ...

// directus_folders
permissions.push({
  role: userRole,
  collection: 'directus_folders',
  action: 'create',
  permissions: null,
  validation: null,
  presets: { tenant: { id: "$CURRENT_USER.tenant" } },
  fields: ['*']
});
// Update: restricted to folders that belong to the current user's tenant.
permissions.push({
  role: userRole,
  collection: 'directus_folders',
  action: 'update',
  permissions: { _and: [{ tenant: { id: { _eq: "$CURRENT_USER.tenant" } } }] },
  validation: null,
  presets: { tenant: { id: "$CURRENT_USER.tenant" } },
  fields: ['*']
});

// ...
// Make sure you're not creating duplicate permissions
// ...

await directus.permissions.createMany(permissions);

With the basic data structure in place, as well as some automation to make our lives easier, we have a simple multitenancy setup ready to go.

Downsides

As useful as we have found multitenancy to be for our use case, the approach we just described does come with some downsides. I would highlight two: complexity and data security.

Setting up a multi-tenant system is inherently more difficult because you have to set up additional systems to ensure data segregation in all the services you offer. Plus, in multi-tenancy systems, you often need to set up even more systems to allow for tenant-based customisation. Think, for example, of a way to set up custom UI colors on a per-tenant basis.

Secondly, because data from multiple tenants is colocated, even more attention needs to be paid to the security of that data. We're placing a lot of trust in the multitenancy system to ensure that the data remains isolated from other tenants, which means more checks and possibly more services to make sure that is happening.

Multitenancy has its tradeoffs, and there isn't one clear-cut rule that can tell you whether you should or should not use it. For us, it made sense at the time; it may or may not make sense for your project, and hopefully now you're better equipped to make this decision!

Other topics

This guide is meant as a starting point to multitenancy, so there are a lot of topics which I couldn't cover in this article. For example, how do you handle uniqueness across tenants? Let's say you have a Books collection, and in the schema of that collection, you have an ISBN field that is set to unique. Two of your tenants happen to have the same book in stock. The first tenant that adds the Book to the collection won't have any problem. However, the second tenant will try to add the Book and Directus will throw an error because there is already an entry with that same ISBN.

Maybe you can set up composite indices so that unique constraints apply to the <ISBN, tenant> pair and not just to the ISBN. But currently that isn't natively supported in Directus, so you need to implement some workarounds. Or you can remove the unique constraint, if you believe that won't have a negative impact on the integrity of your data schema.
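One possible workaround, sketched here under assumptions, is to drop the database-level unique constraint and enforce uniqueness per tenant in application code. The Book shape and the isUniqueWithinTenant helper are hypothetical names for this example; where the check runs (for instance, before an item is created) is left open, and a check like this does not protect against two concurrent inserts the way a real database constraint would.

```typescript
// Hypothetical item shape for the Books example.
interface Book {
  isbn: string;
  tenant: number;
}

// A candidate is acceptable when no book of the same tenant has its ISBN.
function isUniqueWithinTenant(existing: Book[], candidate: Book): boolean {
  return !existing.some(
    (book) => book.tenant === candidate.tenant && book.isbn === candidate.isbn,
  );
}

const stock: Book[] = [{ isbn: "978-3-16-148410-0", tenant: 1 }];

// A second tenant may add the same ISBN...
const otherTenantOk = isUniqueWithinTenant(stock, { isbn: "978-3-16-148410-0", tenant: 2 });
// ...but the first tenant may not add it twice.
const sameTenantOk = isUniqueWithinTenant(stock, { isbn: "978-3-16-148410-0", tenant: 1 });
```

This mirrors what a composite index on the pair would guarantee, only at the application level.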

What about tenancy outside of Directus; how do you sync up the Tenants collection with external services? You can set up scripts that run periodically and sync the necessary information, or you can make use of Directus hooks. And how do we make sure that permissions are up to date if we add a new collection? For our use case, automation wasn't worth the effort, but you can adopt a similar approach to the tenant sync I mentioned above.
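As a rough sketch of the periodic sync idea, the core of such a script is a comparison between the tenants known to Directus and the tenants known to the external service. The TenantRecord shape and the planTenantSync helper below are hypothetical; fetching both lists and applying the resulting plan depend entirely on the services involved and are not shown.

```typescript
interface TenantRecord {
  id: number;
  name: string;
}

interface SyncPlan {
  toCreate: TenantRecord[];
  toUpdate: TenantRecord[];
  toDelete: number[];
}

// Compare the source of truth (e.g. the Tenants collection) with a target
// service and work out what has to change on the target.
function planTenantSync(source: TenantRecord[], target: TenantRecord[]): SyncPlan {
  const targetById = new Map<number, TenantRecord>();
  for (const tenant of target) targetById.set(tenant.id, tenant);
  const sourceIds = new Set(source.map((tenant) => tenant.id));

  const plan: SyncPlan = { toCreate: [], toUpdate: [], toDelete: [] };
  for (const tenant of source) {
    const existing = targetById.get(tenant.id);
    if (existing === undefined) {
      plan.toCreate.push(tenant); // missing on the target
    } else if (existing.name !== tenant.name) {
      plan.toUpdate.push(tenant); // present but out of date
    }
  }
  for (const tenant of target) {
    if (!sourceIds.has(tenant.id)) plan.toDelete.push(tenant.id); // no longer in the source
  }
  return plan;
}

const plan = planTenantSync(
  [{ id: 1, name: "A" }, { id: 2, name: "B (renamed)" }, { id: 3, name: "C" }],
  [{ id: 2, name: "B" }, { id: 4, name: "D" }],
);
```

Keeping the comparison separate from the network calls makes the sync easy to test and to reuse from either a scheduled script or a hook.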

Like I said previously, here we followed a user-based tenancy as opposed to a role-based one. But both are viable options; I've linked some discussions I've found that go a bit more in-depth into this in the Resources!

Conclusion

Multitenancy is a complex topic, and one that is very much influenced by the context it is used in. It can allow you to use one single instance of your Directus deployment for multiple users, thus potentially reducing infrastructure costs and lowering the barrier of entry for your product. However, it does come at the cost of added complexity and increased security concerns. We hope that this article has helped you make a more informed decision as to whether multitenancy makes sense for your project!

References