Multi Tenant Part 1
The Setting
You have multiple tenants and you need a way to deploy Azure infrastructure to these tenants. First off… let me just start by saying please… DO NOT DO IT! If you want to inflict pain on your IT organization and end users, this is one way to do just that. If you have no choice, I feel for you.
The Goal
Must be able to support the following scenarios:
- Use code to deploy the platform infrastructure
- Adding new tenants must be done with the least amount of effort and manual inputs
- Try not to make things insanely complex
- Need to be able to deploy platform components as well as landing zones
Also, trying not to rant too much in this post, and not writing a book’s worth of words, is a goal.
The Problem
As I am a Microsoft fanboy, Azure is my choice, and for now Azure DevOps (yes yes, I know, GitHub etc…). Following the Cloud Adoption Framework to design an Enterprise-scale setup in a multi-tenant setting is difficult, and it is hard to describe in a few words why it is so fu..YoloITing difficult.
There is not that much about the topic on The Google that I have been able to find, so I have decided to collect some thoughts and links to help make it easier.
One way to solve it
From my experience so far, the blogs/documentation listed further down have been my inspiration.
I will try to do this in 2 parts. Part 1 is about the thinking behind the second part, which will be an example of how to do this. The angle for this post is platform deployment, not deploying a set of applications to multiple tenants. That has its own set of considerations. See: Azure resource organization in multitenant solutions
Below is a bunch of links that I have found very useful when designing a multi-tenant deployment.
From the Microsoft Docs
I highly recommend reading through the docs on the topic: Architect multitenant solutions on Azure. The docs are not mainly directed at platform deployment, but they lay out a thought process. There are many other parts of this that I will not get into, like cost allocation, M365 (MS now has examples of this using a DSC-like concept) or the logistics of handling customer tenants from an agreement perspective.
My key takeaways:
- Decide on a strategy for the tenants: Tenant isolation. Do you want to treat each tenant individually, or use more of a hub-spoke model where you can host shared services in a common tenant? This will have an impact on your pipelines later. The best approach will depend on how your organization works. Treating each tenant individually is easier but will have a higher cost, as you can’t share common services. Be very aware of DNS, since things become complicated with a common on-prem DNS service when using private DNS.
- Do not forget your governance. One thing is the general governance (as per CAF), but maybe different tenants have different requirements. In addition, consider how access to the various tenants, and to the data in them, is managed. See: Architectural approaches for governance and compliance in multitenant solutions
- Lifecycle management is important: Tenant lifecycle considerations in a multitenant solution. For the platform, having a testing environment is a must. Unless you have very large volumes of tenants, self-service automated onboarding is probably not needed, but having it done using code is. Plan for test tenants, at the very least for platform development. Have a plan for offboarding tenants.
- Considerations for updating a multitenant solution: You need a strategy and a plan for how to update the tenants. This will affect your pipelines. Initially I would go with the easiest approach and make it more complex as time goes on. Deploying an Azure platform to multiple tenants is hard, even more so if you need different tenants to be on different versions of your code. Deployment rings are one way to help with a slower rollout of updates and new features, using canary rings/tenants and early adopters.
With multiple pipelines and many tenants this is still fairly complex to start with. Creating a process like the one in the illustration below is quite demanding if your platform is split into multiple pipelines and repositories. If some of your tenants also have different kinds of subscriptions, you might need to think about deployment rings for these as well, making this super hard. There is just no practical, scalable way to do this without using a YoloIT ton of code.
- Feature Flags: In a platform context you can use this to decide if a tenant should have selected infrastructure. To vWAN or not to vWAN, as an example.
- Management of multiple tenants has its challenges.
What is Azure Lighthouse: Lighthouse can be used as a tool to help. However, it is important to understand what you get and what you do not get. Knowing this lets you adapt your Enterprise-scale deployment to maximize what you can get out of Lighthouse.
- Decide on how to manage the tenants. Will you have a separate tenant only used for cross-tenant management, or use one (or more) of your tenants as hub tenants? Using Graph queries for lookups is a good way to quickly search resources across tenants. See Centralized administration of multiple tenants, and decide on your strategy for how to handle roles. Using Lighthouse you can do cross-tenant management using PIM, as an example (again, know what you do and don’t get through this).
I like the idea of management from a management tenant, but I am very much not a believer in just deploying stuff from a hub to everywhere. It sounds cool but yeah, a little Yolo. Use pipelines, templates (or frickin Terraform) and deploy things in a declarative, idempotent way.
- Shared Components: Work on a way to share the common infrastructure (if you are using this).
- Architectural approaches for the deployment and configuration of multitenant solutions: Consider everything that you need to do when onboarding a tenant, and document this list and workflow, even if it’s performed manually. Consider whether the onboarding process is likely to be disruptive to other tenants, especially to those who share the same infrastructure. See: Onboarding and provisioning steps
- Resource management responsibility: Treat tenants as configuration of the resources that you deploy, and use your deployment pipelines to deploy and configure those resources.
- Antipatterns to avoid: Manual deployment processes add risk and slow your ability to deploy. Consider using automated deployments using pipelines, the programmatic creation of resources from your solution’s code, or a combination of both. Avoid deploying features or a configuration that only applies to a single tenant. This approach adds complexity to your deployments and testing processes. Instead, use the same resource types and codebase for each tenant, and use strategies like feature flags to selectively enable features for tenants that require them.
- Tenant list as a configuration: When you treat your tenant list as a configuration, you deploy all your resources from your deployment pipeline. When new tenants are onboarded, you reconfigure the pipeline or its parameters.
This approach tends to work well for small numbers of tenants, and for architectures where all resources are shared. It’s a simple approach because all of your Azure resources can be deployed and configured by using a single process. To onboard a tenant, you update the tenant list. This typically happens manually by configuring the pipeline itself, or by modifying a parameters file that’s included in the pipeline’s configuration.
- If you have a large number of tenants, turning to Tenant list as data is going to be necessary. When you treat your tenant list as data, you still deploy your shared components by using a pipeline. However, for resources and configuration settings that need to be deployed for each tenant, you imperatively deploy or configure your resources.
By doing this, you can provision resources for new tenants without redeploying your entire solution. The time involved in provisioning new resources for each tenant is likely to be shorter, because only those resources need to be deployed. However, this approach is often much more time-consuming to build, and the effort you spend needs to be justified by the number of tenants or the provisioning timeframes you need to meet.
- When doing the actual deployment, I personally like this approach: Option 1 - Use deployment pipelines for everything, where a pipeline deploys a Bicep file that includes all of the Azure resources needed (again, we are talking Enterprise Scale Platform stuff here) for each tenant.
A parameter file defines the list of tenants, and the Bicep file uses a resource loop to deploy a database for each of the listed tenants, as shown in the following diagram. (More on this later.)
- There are several services that have dependencies or requirements that make multi-tenant deployment more complex. Read more about the different recommendations in Architectural approaches for a multitenant solution. I would highly recommend learning as much as possible about the networking requirements and planning. Key words: overlapping VNet ranges, Private Link (several services will have complex deployments, for example Synapse hub/Log Analytics), DNS.
- Review the Checklist for architecting and building multitenant solutions on Azure.
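To make a few of the takeaways above a bit more concrete (tenant list as configuration, feature flags and deployment rings), here is a minimal Python sketch. It is not how the Microsoft docs or any particular pipeline implement this. The tenant names, the `ring` and `features` fields and the `vwan` flag are all made up for illustration. The idea is simply that a pipeline could read one tenant list and derive both the rollout order and the per-tenant parameters from it.

```python
import json

# Hypothetical tenant list, as it might live in a parameters file in the repo.
# The field names ("ring", "features", "vwan") are assumptions for this sketch.
TENANT_CONFIG = """
[
  {"name": "contoso-test", "ring": 0, "features": {"vwan": true}},
  {"name": "contoso",      "ring": 2, "features": {"vwan": true}},
  {"name": "fabrikam",     "ring": 1, "features": {"vwan": false}},
  {"name": "tailspin",     "ring": 2, "features": {}}
]
"""

# Defaults applied when a tenant does not set a flag explicitly.
DEFAULT_FEATURES = {"vwan": False}


def load_tenants(raw: str) -> list[dict]:
    """Parse the tenant list and fill in default feature flags."""
    tenants = json.loads(raw)
    for tenant in tenants:
        tenant["features"] = {**DEFAULT_FEATURES, **tenant.get("features", {})}
    return tenants


def group_by_ring(tenants: list[dict]) -> dict[int, list[str]]:
    """Group tenant names by deployment ring, lowest ring first."""
    rings: dict[int, list[str]] = {}
    for tenant in sorted(tenants, key=lambda t: (t["ring"], t["name"])):
        rings.setdefault(tenant["ring"], []).append(tenant["name"])
    return rings


def deployment_parameters(tenant: dict) -> dict:
    """Build the parameters a template deployment could receive for one tenant."""
    return {
        "tenantName": tenant["name"],
        "deployVwan": tenant["features"]["vwan"],
    }


tenants = load_tenants(TENANT_CONFIG)
rollout = group_by_ring(tenants)
for ring, names in rollout.items():
    print(f"ring {ring}: {', '.join(names)}")
```

Onboarding a tenant in this sketch means adding one entry to the list. Moving a tenant to an earlier or later ring, or switching vWAN on, is a one-line change that can go through normal code review.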
What to Build
Know your CAF (at all levels). Document the design and have management aligned with the approach (it should be the other way round). Make it easily available/understandable and preach it! Governance is a must. No governance leads to YoloITing… There should never be any discussion about why someone does not want to protect their internet-exposed services. Enterprise scale gives you the platform (note that there are multiple design options available for different sizes of enterprises). You do not need to have the biggest, baddest setup if your tenants will have 3 subscriptions. I have found that having separate subscriptions for network, identity (could also hold Exchange, but separation of duties etc.), management and self-hosted agents is a good place to start. I strongly recommend that the first version of your platform be standardized, no exceptions. Agree on what should be included in v1 of this. You might not need all services from day 1. This gives you a baseline.
- Make sure you know the plan for your operating model.
This will heavily impact your platform design and how you build/deploy your platform. See: Enterprise operations. This, along with operations/monitoring and the first parts of CAF (vision/management training/decision making), is I think often overlooked and/or not prioritized, but failing at this will make the overall experience bad.
Yes, we want to use Infrastructure as code, but what language?
There are many languages out there that can be used for deployment to Azure. Which is best? Well, it depends… The capabilities of a language, and how epic it is, do not help much unless it is used correctly and by people who know what they are doing. If the infrastructure teams that will be deploying/managing this do not know anything about Terraform but know how to use PowerShell and templates (ARM/Bicep), then choosing TF is probably a bad decision. Make sure you pick a language that has loads of examples and good documentation. Know the strengths and weaknesses before making a decision. Once you know the weaknesses, you can plan for and mitigate them when you hit one. Make sure the language you choose is one where it is possible to get consultants to help. It is cool to use something special, but if only 1-2 people know how to use it and getting good hired help is hard, then you will lose. Typically you want to opt for the declarative languages over the imperative ones, meaning primarily use templates and not PS/CLI. Use PS/CLI as additions to the templates. Also remember that the code needs to be able to run again (you know, idempotent etc). So for this, let’s assume that you agreed on using native tooling. Infrastructure as Code (IaC): Comparing the Tools
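Since "declarative" and "idempotent" get thrown around a lot, here is a small Python sketch of what they mean in practice. It does not call Azure at all: the in-memory `current` dictionary stands in for what is already deployed, and `apply_desired_state` is a hypothetical helper, not a real API. The resource names are made up. The point is only that a declarative tool compares the desired state with the actual state, so running the same code a second time changes nothing.

```python
def apply_desired_state(desired: dict[str, dict], current: dict[str, dict]) -> list[str]:
    """Make `current` contain everything in `desired`; return the changes needed."""
    changes = []
    for name, settings in desired.items():
        if name not in current:
            changes.append(f"create {name}")
        elif current[name] != settings:
            changes.append(f"update {name}")
        # Store a copy so later edits to `desired` do not leak into `current`.
        current[name] = dict(settings)
    return changes


# Desired state, as a template would describe it (hypothetical resources).
desired = {
    "vnet-hub": {"addressSpace": "10.0.0.0/16"},
    "law-platform": {"retentionDays": 90},
}

# Pretend this is what is deployed right now.
current = {
    "vnet-hub": {"addressSpace": "10.0.0.0/24"},
}

first_run = apply_desired_state(desired, current)
second_run = apply_desired_state(desired, current)
print(first_run)
print(second_run)
```

An imperative script that just says "create the VNet" would have to handle the "it already exists" case by hand on every rerun, which is where PS/CLI-only deployments tend to get messy.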
Configuration files
Multi-tenant deployment using configuration files
https://www.ibm.com/docs/en/voice-gateway?topic=configuration-configuring-tenants-in-multi-tenant-json
Multi-stage deployment with DevOps
https://www.linkedin.com/pulse/cloud-platform-automation-scale-harald-solstad-fianbakken
https://techcommunity.microsoft.com/t5/microsoft-sentinel-blog/combining-azure-lighthouse-with-microsoft-sentinel-s-devops/ba-p/1210966
https://github.com/javiersoriano/sentinelascode/tree/master/Pipelines/Samples
https://medium.com/@mananu/deploying-a-multi-tenant-application-1-6c1611183a44
Cross tenant deploy from DevOps
Using a service principal from another tenant in your DevOps: https://andrewmatveychuk.com/how-to-deploy-to-another-tenant-with-azure-devops/
Reusable pipeline components
https://mthai.medium.com/azure-devops-yaml-templates-what-ive-learned-6d46e8d1a404
https://blog.devgenius.io/azure-devops-yaml-template-reuse-74a58c131e74?gi=814744b5309e
To CSV or JSON, that is the question
https://www.educba.com/json-vs-csv/
https://coresignal.com/blog/json-vs-csv/
https://blog.datafiniti.co/4-reasons-you-should-use-json-instead-of-csv-2cac362f1943
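For a tenant list specifically, the trade-off can be shown in a few lines of Python. The tenant names and columns below are invented, so treat this as a sketch of the difference and nothing more. CSV is flat and every value comes back as a string, so nested settings need their own column-naming scheme. JSON keeps types and nesting.

```python
import csv
import io
import json

# The same two hypothetical tenants, first as CSV, then as JSON.
CSV_TEXT = """name,ring,feature_vwan
contoso,2,true
fabrikam,1,false
"""

JSON_TEXT = """
[
  {"name": "contoso",  "ring": 2, "features": {"vwan": true}},
  {"name": "fabrikam", "ring": 1, "features": {"vwan": false}}
]
"""

csv_tenants = list(csv.DictReader(io.StringIO(CSV_TEXT)))
json_tenants = json.loads(JSON_TEXT)

# CSV: everything is a string, so types must be converted by hand.
csv_first = csv_tenants[0]
csv_ring = int(csv_first["ring"])
csv_vwan = csv_first["feature_vwan"].lower() == "true"

# JSON: numbers, booleans and nesting are preserved by the parser.
json_first = json_tenants[0]
json_ring = json_first["ring"]
json_vwan = json_first["features"]["vwan"]

print(type(csv_first["ring"]).__name__, type(json_ring).__name__)
```

CSV is still easy to edit in a spreadsheet, so for a short, flat tenant list it can be enough. Once tenants start carrying nested settings, JSON tends to be less painful.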
Decision documentation
MonoRepo vs multi-repo and code organization in a repository
https://www.hashicorp.com/blog/terraform-mono-repo-vs-multi-repo-the-great-debate (some thoughts from HashiCorp)
https://learn.microsoft.com/en-us/azure/devops/organizations/projects/about-projects?view=azure-devops#use-a-single-project
https://kinsta.com/blog/monorepo-vs-multi-repo/
PowerShell in pipelines
https://www.techtarget.com/searchitoperations/tip/How-to-use-PowerShell-in-CI-CD-pipelines?amp=1
https://www.codewrecks.com/post/general/powershell/pipeline-and-powershell-return-code/
File organization
Find Files Faster: How to Organize Files and Folders - https://zapier.com/blog/organize-files-folders/
Creating a platform will consist of many different deployments. Some are linked and depend on others, and some are not. In many cases there will be a necessary order to the deployments. Keeping track of what to make, and in what order, is hard once this starts to scale. I like using a number first because it makes it easy to mentally see the order. Starting with 00 is just a “fun” programming approach (and in many cases… Yolo… I need to add something before 01). You could also use categories like IAM/PIM etc., however if you have more than 1 task in the same category you are back to what f*** order is this supposed to be in. Use whatever you want, I just like organizing my files/folders this way. In theory you could just use files and no folders, however I’m leaning towards keeping each “main.bicep” contained in a folder.
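The numbering idea is easy to lean on in a pipeline as well. Here is a minimal Python sketch, assuming folder names like `00-...` that are made up for the example, and assuming each numbered folder holds its own `main.bicep`. Sorting the folders by their numeric prefix gives you the deployment order.

```python
import re

# Hypothetical folder names in a platform repository.
FOLDERS = [
    "10-connectivity",
    "00-management-groups",
    "02-policies",
    "01-subscriptions",
    "docs",
    "20-identity",
]

# A folder counts as a deployment step if it starts with digits and a dash.
PREFIX = re.compile(r"^(\d+)-")


def deployment_order(folders: list[str]) -> list[str]:
    """Return the numbered folders sorted by numeric prefix; others are skipped."""
    numbered = []
    for folder in folders:
        match = PREFIX.match(folder)
        if match:
            numbered.append((int(match.group(1)), folder))
    return [folder for _, folder in sorted(numbered)]


order = deployment_order(FOLDERS)
for folder in order:
    print(f"{folder}/main.bicep")
```

Leaving gaps in the numbering (00, 01, 02, then 10, 20) is one way to leave room for that "I need to add something in between" moment without renaming everything.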
Skill up and educate users
Prepare for operations
Training materials about your platform
General brain dump on the process
Structure your tasks
Unless you have a pre-defined process that all infrastructure teams are familiar with, I would start off as simple as possible.
Use the Basic Process
Start super simple, and when this is not enough for you, make things more advanced. Organize your tasks in sprints (say 2 weeks at a time). I also recommend working on keeping your tasks up to date. As you are working, loads of new bugs and tasks will show up.
To make sure you don’t forget or miss a critical task, I suggest adding the new tasks to a later sprint. When doing sprint planning, decide if each task can wait or not. If so, push it to the next sprint.
This makes it possible for your team to very quickly add a task and forget about it until the next sprint planning, where it can be viewed against everything else. In my opinion this approach will help you see how far in advance you are Yoloed…
People
Make informed decisions and document them
No clickOps
Automate everything. Every time you clickOps, a BOFH kills a production service! Being manually involved will not scale, and on top of that you are now… opening up the possibility of YoloITing and forgetting something or making a mistake.
Keeping things simple
Keep track of and drive tasks through
Inter-company politics
The Future
There is much to learn about how this all fits together. Once you learn something, it may change a decision.
The End
Hope this helps!
Any questions, comments etc.? Send them to:
hold.my.tenant@yoloit.no
Live long and YoloIT \//_