Multi-Tenant Architecture - Planning and Design

Recap
Important design decisions
Should we use a separate database for each tenant?
How do we determine the tenant for each request?
How do we handle tenant-specific configurations?
User management - authentication, authorization, and role-based access control
Conclusion and footnotes

In my previous article, I discussed the basics of multi-tenant architectures. In this article, I'll explore some of the key considerations that developers should keep in mind when designing and implementing a multi-tenant architecture, their pros and cons, and whether a multi-tenant architecture is the right choice for your application. I'll also explore ways in which you can create a multi-tenant application with Go and PostgreSQL.

I'll try to keep the article as simple as possible, so that beginners can follow it easily. If you haven't already, you can read about the basics of multi-tenant architecture before proceeding.

Should we use a separate database or separate schema for each tenant?
How do we determine the tenant for each request?
How do we handle tenant-specific configurations?
User management - authentication, authorization, and role-based access control.

Let's discuss each of these in detail.

Pros

Data isolation: Each tenant's data is stored in a separate database, which provides better data isolation and security.
Performance: Since each tenant has its own database, a heavy workload from one tenant is less likely to degrade performance for the others.

Cons

Maintenance: Managing multiple databases can be complex and time-consuming.
Cost: Running multiple databases can be very expensive, especially if you are using a managed database service.
Scalability: Scaling a multi-tenant application with separate databases can be challenging.

Key Takeaways

It is not just the database that needs to be managed, but also the database connections, migrations, backups, and monitoring.
In most cases, you need some kind of incremental backup or a change data capture (CDC) mechanism that replicates to a secondary database for disaster recovery.
There are certain use cases where using a separate database for each tenant makes sense, such as when you need strict data isolation or when you have very large tenants with a lot of data. However, for most applications, using a single database with a shared schema is a better choice.

Conclusion

Let's go with a single database server and segregate the data into multiple databases within the same server. This way, we get the benefits of data isolation and security without the complexity of managing multiple database servers.
This approach is cost-effective and scalable, as we can easily add more tenants by creating a new database within the same server.
If an individual tenant's load increases, we can give that tenant a dedicated database server and move its data to the new server. Since all of its data is already isolated, this migration can be done without affecting other tenants.
Since we use a single database server, we can use a separate schema for each tenant.
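As a rough illustration of the schema-per-tenant idea, here is a minimal Go sketch that maps a tenant ID to a PostgreSQL schema name and builds the statements to create and select that schema. The `tenant_` prefix, the ID format, and the function names are assumptions made for this example, not something the design above prescribes. The ID is validated strictly because it ends up inside an SQL identifier.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// tenantIDPattern is an assumed convention for this sketch: lowercase
// letters, digits and underscores, starting with a letter, max 32 chars.
var tenantIDPattern = regexp.MustCompile(`^[a-z][a-z0-9_]{0,31}$`)

// ErrInvalidTenantID is returned when an ID does not match the convention.
var ErrInvalidTenantID = errors.New("invalid tenant ID")

// SchemaName maps a tenant ID to a schema name such as "tenant_acme".
// Rejecting anything outside the pattern keeps the ID safe to embed in SQL.
func SchemaName(tenantID string) (string, error) {
	if !tenantIDPattern.MatchString(tenantID) {
		return "", ErrInvalidTenantID
	}
	return "tenant_" + tenantID, nil
}

// ProvisionSQL returns the statement that creates the tenant's schema.
func ProvisionSQL(tenantID string) (string, error) {
	schema, err := SchemaName(tenantID)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("CREATE SCHEMA IF NOT EXISTS \"%s\"", schema), nil
}

// SearchPathSQL returns the statement that points a session at the schema.
func SearchPathSQL(tenantID string) (string, error) {
	schema, err := SchemaName(tenantID)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("SET search_path TO \"%s\"", schema), nil
}

func main() {
	create, _ := ProvisionSQL("acme")
	use, _ := SearchPathSQL("acme")
	fmt.Println(create)
	fmt.Println(use)
}
```

In a real application these statements would be executed through a PostgreSQL driver. The sketch stops at building them so that it stays free of external dependencies.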

There are several ways to determine the tenant for each request. Some of the common approaches are:

Each tenant is given a unique subdomain, and the tenant is identified based on the subdomain in the request URL. For example, tenant1.example.com, tenant2.example.com (considering the domain is example.com).

Pros

Easy to implement and manage.
Provides a clean separation between tenants.
Tenant configuration can even be tweaked at the DNS level.

Cons

Requires DNS configuration and management for each tenant.
Security concerns: the tenant is identified from the subdomain in the request URL, which any client can change, so it cannot be trusted without an authorization check.
Not suitable for applications where tenants need to share the same domain.
Handling custom domains can be complex in some cases.
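To make the subdomain approach concrete, here is a small Go sketch that extracts the tenant from a request's Host value. The function name `TenantFromHost` and the rule that only a single-label subdomain counts as a tenant are assumptions for this example.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// TenantFromHost extracts the tenant from a host such as
// "tenant1.example.com" or "tenant1.example.com:8080".
// baseDomain is the shared domain, e.g. "example.com".
func TenantFromHost(host, baseDomain string) (string, bool) {
	// Drop the port if one is present; otherwise keep the host as is.
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	host = strings.ToLower(host)

	suffix := "." + baseDomain
	if !strings.HasSuffix(host, suffix) {
		return "", false
	}
	sub := strings.TrimSuffix(host, suffix)

	// Assumption: a tenant is exactly one DNS label, so "a.b.example.com"
	// is rejected rather than treated as tenant "a.b".
	if sub == "" || strings.Contains(sub, ".") {
		return "", false
	}
	return sub, true
}

func main() {
	tenant, ok := TenantFromHost("tenant1.example.com:8080", "example.com")
	fmt.Println(tenant, ok) // prints: tenant1 true
}
```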

Each tenant is identified based on a path in the request URL. For example, example.com/tenant1, example.com/tenant2.

Pros

Simple to implement and manage.
No DNS configuration required.
Suitable for applications where tenants need to share the same domain.

Cons

Security concerns: the tenant is identified from the path in the request URL, which any client can change, so it cannot be trusted without an authorization check.
Complex to handle custom domains, tenant-specific routes, and configurations.
SEO can be affected, as the tenant is identified based on the path in the request URL.

Each tenant is identified based on a header in the request. For example, X-Tenant-ID: tenant1. This will be sent by the client application in the HTTP request headers.

Pros

No DNS configuration required.
Relatively more secure, as the tenant is identified based on a header in the request.
Suitable for applications where tenants need to share the same domain.
Custom headers can be used for other purposes as well.

Cons

Relatively more complex to implement and manage.
Request headers can be spoofed by the client application.

Key Takeaways

The choice of tenant identification mechanism depends on the specific requirements of your application.
Subdomain-based tenant identification is a good choice if you need a clean separation between tenants, if you have a small number of tenants, or if you truly want to provide a white-label solution.
Path-based tenant identification is a good choice if you need to share the same domain between tenants and if you have a large number of tenants.
Header-based tenant identification is a good choice if you need to share the same domain between tenants and if you want to keep the tenant identification out of the request URL. Because this approach is independent of the request URL, it can also be combined with other approaches (like adding a subdomain or a custom domain).

Conclusion

We will use a header-based tenant identification mechanism, as it provides a good balance between security and flexibility. We can easily switch to a different tenant identification mechanism in the future, if needed.
We will use a custom header X-Tenant-ID to identify the tenant for each request.

There are several ways to handle tenant-specific configurations. Some of the common approaches are:

Configuration files: Store tenant-specific configurations in separate configuration files.
Database: Store tenant-specific configurations in a database table.
Environment variables: Store tenant-specific configurations as environment variables.
Key-value store: Store tenant-specific configurations in a key-value store like Redis.

Key Takeaways

Database is the most common way to store tenant-specific configurations, as it provides flexibility and scalability.
File-based configurations are easy to manage, but can be difficult to scale, because we need to have static files containing the same configuration for all the tenants.
Environment variables are not suitable for this use case because keys are static and can't be changed at runtime.
A key-value store is also a good choice if you need to store a small amount of data and need fast access to it.

Conclusion

Tenant-specific configurations would be stored in a database table. This way, we can easily add, update, and delete configurations for each tenant.
The main database would store the tenant configurations, and tenant databases would store all the tenant data.
This would contain configurations like tenant name, logo, theme, email templates, etc., including any custom application-level configurations for all tenants, and alpha/beta user and tenant groups for early-access features (if any).
This main database would also store the tenant database connection details, so that we can easily connect to the tenant database based on the tenant ID.
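The shape of such a tenant record, and the lookup from tenant ID to connection details, could look roughly like the Go sketch below. The field names, the example DSN, and the in-memory `Registry` are all assumptions for illustration. In the design described above, the same data would live in a table in the main database, with the registry acting at most as a cache in front of it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrTenantNotFound is returned for tenant IDs the registry does not know.
var ErrTenantNotFound = errors.New("tenant not found")

// TenantConfig mirrors one row of a hypothetical tenants table.
type TenantConfig struct {
	Name        string
	LogoURL     string
	Theme       string
	DatabaseDSN string // connection details for the tenant database
	EarlyAccess bool   // member of an alpha/beta group
}

// Registry is an in-memory stand-in for the main database table.
type Registry struct {
	mu      sync.RWMutex
	tenants map[string]TenantConfig
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{tenants: make(map[string]TenantConfig)}
}

// Put adds or replaces the configuration for a tenant.
func (r *Registry) Put(tenantID string, cfg TenantConfig) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.tenants[tenantID] = cfg
}

// Get looks up the configuration for a tenant.
func (r *Registry) Get(tenantID string) (TenantConfig, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	cfg, ok := r.tenants[tenantID]
	if !ok {
		return TenantConfig{}, ErrTenantNotFound
	}
	return cfg, nil
}

func main() {
	reg := NewRegistry()
	reg.Put("tenant1", TenantConfig{
		Name:        "Tenant One",
		Theme:       "dark",
		DatabaseDSN: "postgres://app@db.internal:5432/tenant1", // made-up DSN
	})

	cfg, err := reg.Get("tenant1")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(cfg.Name, cfg.Theme, cfg.DatabaseDSN)
}
```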

User management is a critical aspect of any multi-tenant application. A good user management system should provide the following features:

Authentication: Verify the identity of users.
Authorization: Control access to resources based on user roles and permissions.
Role-based access control: Assign roles to users and define permissions for each role.
Tenant-based access control: Control access to tenant-specific resources based on the tenant the user belongs to.
Workspaces: Allow users to switch between different tenants or workspaces. This can also mean different teams, departments, or sub-organizations within the same tenant.

Key Takeaways

User management is a complex topic, and it is important to get it right from the beginning.
A good user management system should be flexible, scalable, and secure.

Conclusion

We will use a JWT-based authentication system, as it provides a good balance between security and performance. Also, it keeps the system stateless, which is important for scalability. We can easily switch to a different authentication mechanism in the future, if needed.
We will use a role-based access control system, where each user is assigned a role, and each role is assigned a set of permissions. This way, we can easily control access to resources based on user roles. The roles, permissions and other ACLs would be stored in the tenant database.
Workspaces would be implemented as a separate entity, where each user can belong to multiple workspaces, and a workspace is associated with a single tenant. This way, we can easily switch between different workspaces.
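The relationship between users, roles, permissions, and workspaces can be sketched as a single access check in Go. The type names, the `Can` function, and the permission strings such as "invoice:write" are invented for this example. In a JWT-based setup, the user's tenant and role would typically be read from the token's claims, which is not shown here.

```go
package main

import "fmt"

// Permission names an action, e.g. "invoice:write".
type Permission string

// Role groups a set of permissions under a name.
type Role struct {
	Name        string
	Permissions map[Permission]bool
}

// Workspace belongs to exactly one tenant.
type Workspace struct {
	ID       string
	TenantID string
}

// User has one role and can be a member of several workspaces.
type User struct {
	ID         string
	TenantID   string
	Role       Role
	Workspaces map[string]bool // workspace IDs the user belongs to
}

// Can reports whether the user may perform p inside the workspace.
// The checks run from coarse to fine: tenant, then workspace membership,
// then role permission.
func Can(u User, ws Workspace, p Permission) bool {
	if u.TenantID != ws.TenantID {
		return false
	}
	if !u.Workspaces[ws.ID] {
		return false
	}
	return u.Role.Permissions[p]
}

func main() {
	editor := Role{
		Name:        "editor",
		Permissions: map[Permission]bool{"invoice:read": true, "invoice:write": true},
	}
	alice := User{
		ID:         "u1",
		TenantID:   "tenant1",
		Role:       editor,
		Workspaces: map[string]bool{"ws-sales": true},
	}
	sales := Workspace{ID: "ws-sales", TenantID: "tenant1"}
	foreign := Workspace{ID: "ws-sales", TenantID: "tenant2"}

	fmt.Println(Can(alice, sales, "invoice:write"))   // prints: true
	fmt.Println(Can(alice, foreign, "invoice:write")) // prints: false
	fmt.Println(Can(alice, sales, "invoice:delete"))  // prints: false
}
```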

The above discussion provides a good starting point for designing and implementing a multi-tenant architecture. However, every application is different, and it is important to consider the specific requirements of your application before making any design decisions. If more general use cases come to mind, I'll keep updating this article.