k8s – coding like a boss

I was recently involved in migrating data from one database to another for an application deployed to Kubernetes. The idea was to use an in-process migration routine that kicks in during the startup process of the application. Resilience and scaling were to be handled entirely by Kubernetes. Any failure would cause the application to fail fast and let k8s restart it.

This turned out to be quite simple to implement!

We use a rolling update strategy in order to let the newly deployed service migrate the data side-by-side with the old one still running the actual application process. This is defined in the application’s deployment manifest:

strategy:
  type: RollingUpdate
  rollingUpdate:
     maxUnavailable: 50%
     maxSurge: 50%

With a replica count of 8, we will now end up with a minimum of 4 pods running the old version and up to 8 pods running the migration.

However, for the rolling update strategy to work, we also need a readiness probe for the service to tell the deployment when it is okay to swap out pods. Since the application is a message streaming service hosted in a generic .NET Core host, we could simply use an ExecAction probe that executes cat against a file which existence we can control during the life cycle of the application. Simple!

The application’s life cycle went from:

private static async Task Main(string[] args)
{
    using var host = new HostBuilder().Build();
    await host.RunAsync();
}

…to something like this:

internal class Program
{
    private static async Task Main(string[] args)
    {
        using var host = new HostBuilder().Build();
        await host.StartAsync();
        File.Create("/Ready");
        await host.WaitForShutdownAsync();
        File.Delete("/Ready");
        await host.StopAsync();
    }
}

During the startup phase, the migration takes place. At this time, no file called “Ready” exists in the application’s execution directory. When start finishes (the migration is done and the application is ready to serve), the Ready file is created. When sigterm is received, the Ready file is deleted and we start the shutdown process. At this time, the application is no longer ready to serve.

What about the probe configuration? Easy!

readinessProbe:
  exec:
    command: 
    - cat
    - /Ready
  periodSeconds: 2

Every other second, the container will be checked for a file called Ready. If it exists, the service is considered ready for service and the deployment will continue and deploy the next pod according to the update strategy.

Need more scaling? Just add more replicas!

Tag: k8s

Scaling in process migrations with Kubernetes