How We Migrated to Helm 3
In my previous articles, I mentioned that we use our own manifest format to describe any microservice in Kubernetes. We also use Ansible as the engine for this process (service definition, templating, hooks, and any other actions we need). It helped with the migration from Mesos to Kubernetes and makes our deployments standardized and easier for developers.
The simplified scheme looks like this:
Let’s discuss the pros and cons of this solution.
On the one hand, with Ansible we can implement any deployment logic in any order. On the other hand, we have to support and fix it ourselves.
At some point we started to think that our scheme could be reimagined, and that we could use Helm (as the industry-standard tool) instead of our own engine.
The possible scheme looks like this:
As you can see, values.yml is an analog of our AppYAML format. All the templates we have in the Ansible role could be packed and distributed via Helm charts, and any environment-specific variables could be moved to separate YAML files and passed through the CLI.
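To illustrate the idea, a minimal values.yml in this scheme might look something like this — the field names here are made up for the example, not our actual AppYAML format:

```yaml
# Hypothetical values.yml: an analog of our AppYAML service description.
app_name: billing-api
replicas: 3
image:
  repository: registry.example.com/billing-api
  tag: "1.4.2"
resources:
  requests:
    cpu: 100m
    memory: 256Mi
ingress:
  enabled: true
  host: billing.example.com
```

Environment-specific overrides would then live in separate files (say, values-staging.yml) and be layered on top with an extra `-f` flag on the Helm CLI.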
So we started to think about what we actually need, because the migration would be significant and painful, and we wanted to achieve substantial benefits.
As I see it, Helm consists of three major parts:
- charts
- templating
- resource control via CLI
Let’s discuss them.
Ok, maybe we need charts?
No. We will have one chart for all of our services, for the sake of standardization. I understand that charts can be versioned, but we want to deploy services with an up-to-date configuration. So we decided that charts were not a motivating factor for us to migrate.
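A single shared chart is essentially just a container for the standard templates every service gets, roughly like this (the layout is illustrative):

```
service-chart/
  Chart.yaml
  templates/
    deployment.yml
    service.yml
    ingress.yml
    cronjob.yml
    secrets.yml
```

Each service then differs only in the values it feeds into these templates, not in the chart itself.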
Ok, maybe we need templating?
No again. Go templates or Jinja? I think it is a matter of taste. But I know for sure that we would have to do a lot of work to rewrite the templates from one syntax to the other. So we decided we would spend a lot of time without benefiting from it.
Ok, maybe we need resource control?
That's it! We had a problem whenever a developer eliminated some templated resource. For example, a CronJob becomes obsolete. Or an Ingress resource changes. Or a whole service reaches its end of life and should be removed from all environments.
Our engine doesn’t know anything about which resources were deployed for a particular service. We do not use Kubernetes as a database — funny, yeah? But Helm knows about that and can help us control the lifecycle of deployed resources.
So, what have we done?
We started to use Helm as a part of our engine that is described in the scheme below:
We use only the part of Helm that installs a bundle of zipped resources prepared by our engine. So, we delegate resource control to Helm.
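In other words, the engine still renders all the manifests itself and then hands the result to Helm only for installation and release tracking. Very roughly — the release and package names below are made up:

```
# The engine renders manifests and packs them into a chart archive;
# Helm installs the package and records the release.
helm upgrade --install billing-api ./billing-api-rendered.tgz \
  --namespace production

# Helm now knows which resources belong to the release, so a template
# removed on the next deploy is deleted from the cluster automatically.
helm get manifest billing-api --namespace production
```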
Some problems of migration
Resource adoption
When you have more than 100 services in production and want zero downtime, you are not going to uninstall each service and deploy it again — and Helm fails if any templated resource already exists.
So we wrote a temporary step into the Ansible role to adopt the existing resources.
Adoption is simple: just add some labels and annotations.
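Concretely, recent Helm 3 versions (3.2+) will adopt an existing resource instead of failing if it carries Helm's ownership metadata, so the adoption step boils down to patching every resource of a service with something like this (release name and namespace are per-service, of course):

```yaml
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  annotations:
    meta.helm.sh/release-name: billing-api
    meta.helm.sh/release-namespace: production
```

In an Ansible role this is just a `kubectl label` / `kubectl annotate` pair applied over each deployed resource before the first Helm install.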
3-way merge wtf
Before Helm, any resource was entirely replaced by the engine. But after the migration we faced cases where a developer changed something in a test environment, and then the service would not launch because the config became broken after a 3-way merge.
We haven’t eliminated this problem entirely, but we agreed that changing resources manually in any environment is a terrible (and buggy) idea.
Secrets Frankenstein
We have migrations. Before Helm, they were just tasks in the Ansible role. With Helm, we redesigned them as pre-hooks. Typically, migrations should use the same secrets that the services use. And yes, the first thing we did was make secrets a regular resource, which was a mistake: we got into trouble with migrations that tried to use obsolete secrets.
We use HashiCorp Vault to store and generate secrets, but we don’t use Vault agents at the moment. I think we will get to that shortly. Anyway, the problem was here and now.
We didn’t find a better solution than making secrets pre-hooks:

```yaml
kind: Secret
metadata:
  name: {{ app_name }}-secrets
  annotations:
    helm.sh/hook-weight: "0"
    helm.sh/hook: pre-install,pre-upgrade
    helm.sh/hook-delete-policy: before-hook-creation
```
The trick is that we give this hook the highest priority (the lowest hook weight, so it runs before the other hooks), and because of the before-hook-creation policy the secret is deleted for a moment on each install. That’s a little creepy, but it works.
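For completeness, a migration pre-hook in this scheme might look roughly like the sketch below. The image, command, and names are illustrative; the only real constraint is that its hook weight must be higher than the secret's (Helm runs lower weights first), so the fresh secret already exists when the migration starts:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ app_name }}-migration
  annotations:
    helm.sh/hook: pre-install,pre-upgrade
    helm.sh/hook-weight: "1"
    helm.sh/hook-delete-policy: before-hook-creation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migration
          image: {{ app_image }}
          command: ["/app/migrate"]
          envFrom:
            - secretRef:
                name: {{ app_name }}-secrets
```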
In the end, we did not really migrate to Helm — we started using it to make our deployment process better.