Microservices and Kubernetes at Yahoo Small Business (Part Two)

Sarin Panambayil


Part Two – Identifying the Tools for Microservice Implementation


As we mentioned in the previous article, the microservices architecture is a software architectural pattern in which complex business applications are built from individual services with clear boundaries. It helps us reduce time to market, run fast experiments and deliver features continuously as they become ready. The main idea is to create small services that can be developed and deployed independently of one another.


When the YSB engineering team decided to begin implementing a microservices architecture, we had to make many technical decisions about the tools and infrastructure needed to support this move. In this installment of our article series on Microservices and Kubernetes, we will discuss a couple of the major technology choices we had to make at the start of the project.

When we start decomposing heavy applications into smaller independent microservices, we end up with many more components to manage: more services, more executables, more configuration files and more connectivity points. To handle this level of complexity, it was evident that our existing legacy infrastructure had to be updated.

Linux containers are a great abstraction for creating, deploying, updating and maintaining microservices. They offer uniform packaging, good separation of concerns and speedy startup times, in contrast to heavier, less agile technologies like virtual machines or bare-metal servers.

Another major advantage of using Linux containers is their support for immutable software delivery. Immutable delivery reduces the number of moving pieces by baking them into pre-created images as part of the software build process. During the build, we can create fully self-contained images consisting of the operating system, the programming language runtime, any supporting libraries, and all the configuration needed for the application to run. We can deploy this image in an initial QA environment, run the test suites, and pass it to the next stage of the delivery pipeline toward production. We don't need to worry about whether the environment or application configuration is consistent across environments, since everything is pre-baked. If we need to change the application, we build a new immutable image and follow the same process to deliver it to production. This build-once, run-anywhere model provides the flexibility we are trying to achieve with microservices.
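
As a minimal sketch of what such a self-contained image looks like (the base image, file names and port here are illustrative, not our actual stack), a Dockerfile bakes everything in at build time:

```dockerfile
# Pre-built base image from the registry: OS layers plus the language runtime
FROM node:18-alpine

WORKDIR /app

# Bake the supporting libraries into the image at build time
COPY package*.json ./
RUN npm ci --omit=dev

# Bake in the application code and its configuration
COPY . .

# The result is fully self-contained: the same artifact runs in QA and production
EXPOSE 8080
CMD ["node", "server.js"]
```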

The most common and widely used Linux container technology in the industry is Docker, which uses Linux namespaces and cgroups to create a secure and isolated environment for running applications. Docker also provides a centralized repository of base images called a Docker registry, where people can define, build and share their pre-built images so that others don't have to build them from scratch. Pre-built images already exist for all major Linux distributions, programming languages and many other services, so it is easy to choose one and get a microservice running inside it in a short period of time. At YSB, some of our existing products had already been migrated to Docker and our engineering team was experienced at running it, which made our container technology choice for microservices obvious: continue with Docker.
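
For illustration, the day-to-day workflow looks roughly like this (the registry host, image name and tag are hypothetical):

```sh
# Pull a pre-built base image from a Docker registry
docker pull node:18-alpine

# Build our microservice image on top of it, then run it locally
docker build -t registry.example.com/orders:1.0.0 .
docker run -d -p 8080:8080 registry.example.com/orders:1.0.0

# Push the image so other environments can pull the exact same artifact
docker push registry.example.com/orders:1.0.0
```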

[Diagram: Docker Workflow]

As we continued to pack more and more applications into isolated containers using Docker, it became clear that we needed a tool to help us run many containers simultaneously across multiple environments. Upon further investigation, we realized that the industry had already recognized this need and was converging on a class of tools generally termed container orchestrators, or container schedulers.

A container orchestrator is a tool that abstracts away the underlying server, storage and network hardware and manages the execution of containerized applications across a shared pool of those resources. An orchestrator is a very fitting deployment pattern for microservices because it simplifies deploying, health checking and scaling any number of independent services. It also helps utilize the underlying infrastructure resources more efficiently.

When Yahoo Small Business started the microservices journey, there were two main contenders for container orchestration: Docker Swarm, provided by Docker themselves, and Kubernetes (shortened to K8s as a numeronym), a project started by Google engineers. When our team compared the two technologies, it was evident that Kubernetes was well ahead in industry adoption and engineering mindshare.

K8s was designed based on what Google engineers had learned from more than a decade of running their internal container orchestration engine, aptly named Borg, which has served some of the world's most trafficked websites very well for a long time. The K8s project is also completely open source, and other big companies like Microsoft and IBM have been contributing to it for some time. These things convinced us that Kubernetes is here to stay and is poised to be the industry standard for container orchestration, which prompted us to adopt it for our microservices project.

Kubernetes, in its simplest definition, is a platform for managing containers across multiple hosts. As a container orchestrator, it offers many other management features for container-oriented applications, such as auto-scaling, rolling deployments, high availability, and compute and storage resource management. Like container technology itself, it is designed to run anywhere: on bare-metal machines in the data center, in public clouds like AWS or GCP, or even in a hybrid cloud.

In Kubernetes, we typically declare instructions to the scheduler in YAML files. These files define what are called K8s objects and represent the desired state of the application deployments in the K8s cluster. When we apply these configurations to the K8s master servers, the scheduler running on them works continually to maintain that desired state.
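
As a sketch of what such a declaration looks like (the service name and ports are hypothetical), here is a minimal K8s object: a Service that routes traffic to a set of containers, submitted to the cluster with `kubectl apply -f service.yaml`:

```yaml
# A minimal K8s object: a Service exposing a microservice inside the cluster
apiVersion: v1
kind: Service
metadata:
  name: orders-service        # hypothetical microservice name
spec:
  selector:
    app: orders               # route traffic to containers labeled app=orders
  ports:
    - port: 80                # port the Service listens on
      targetPort: 8080        # port the container actually serves
```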

By creating a Deployment object in the K8s cluster, we can roll out, roll forward or roll back selected containers with only a few commands. A Deployment ensures that a certain number of groups of containers (known as pods) are up and running, maintaining the desired state for a microservice. It also supports liveness and readiness probes that help define and monitor the application's health. For better resource management, we can define resource requests and limits for each pod, and the K8s master will select a worker node that fulfills those resource criteria to run it. K8s also provides an optional horizontal pod autoscaling feature that can scale a microservice horizontally based on resource usage patterns. It is designed for high availability as well: we can create multiple master nodes to prevent any single point of failure, and there is even an option to manage multiple clusters together using Kubernetes Cluster Federation.
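
Sketching this in YAML, a Deployment for a hypothetical microservice with three replicas, health probes and resource limits might look like the following (the image name, paths and thresholds are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
spec:
  replicas: 3                  # desired pod count; K8s keeps this many running
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: registry.example.com/orders:1.0.0   # illustrative image
          ports:
            - containerPort: 8080
          livenessProbe:       # restart the container if this check fails
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
          readinessProbe:      # only send traffic once this check passes
            httpGet:
              path: /ready
              port: 8080
          resources:           # the scheduler picks a node with this capacity
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```

A HorizontalPodAutoscaler object can then target this Deployment and adjust the replica count automatically based on observed resource usage.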

[Diagram: Kubernetes with Docker]

In part three, we will talk more about the options we considered for creating an actual Kubernetes cluster and how we reached the conclusion to use a well-known solution.

