Hello Microservice Deployment Part 2: Kubernetes and Google Cloud
Introduction
Hi and welcome back! This is the second part of a three-part series. In Part 1, we got acquainted with Docker by building an image for a simple web app and then running that image. We have thus far been working with this repo.
In this part, we'll be taking the Docker image from Part 1 and getting it to run on a Kubernetes cluster that we set up on Google Cloud.
In Part 3, we'll use Drone.io to set up a simple CI/CD pipeline. It will test our application and roll out any changes we make to the master branch of our repo.
So what is Kubernetes (K8s) for?
So what is this container orchestration thing about? Let's revisit Webflix for this one. Let's say Webflix has gone down the microservices path (for their organization, it is a pretty good course of action) and they have isolated a few aspects of their larger application into individual services. Each of those services is then built as an image and needs to be deployed in the form of live containers.
The diagram here shows just a small portion of the Webflix application. Let's assume it's peak watching time. The little numbers next to the various services are examples of the number of live containers in play. In theory, you would be able to make this happen through use of the `docker run` command.
Now, let's say one of the recommend_service containers hits an unrecoverable error and falls over. If we were managing this through `docker run`, we would need to notice this and restart the offending container. Or, suddenly, we are no longer in peak time and the number of containers needs to be scaled down.
What if you want to deploy a new version of the recommend_service image? Then there are further concerns, like container startup time and liveness probes. And what if you have a bunch of machines that you want to run your containers on? You would need to keep track of what resources are available on each machine, and which replicas are running, so you can scale up and scale down when you need to.
That is where container orchestration comes in.
Please note, the author of this post is terribly biased and thinks K8s is the shizzle, so that is what we will cover here. There are alternative platforms that you might want to look into, such as Docker Swarm and Mesos/Marathon.
Okay, back to the K8s goodness.
K8s was born, raised, and battle-hardened at Google. Google used it internally to run huge production workloads, and then open sourced it because they are lovely people. So K8s is a freely available, highly scalable, extensible, and generally solid platform. Another thing that is really useful is that Google Cloud has first-class K8s support, so if you have a K8s cluster on Google Cloud, there are all sorts of tools available that make managing your application easier.
Kubernetes lets you run your workload on a cluster of machines. It installs a daemon (a long-running background process) on each of these machines, and the daemons manage the Kubernetes objects.
K8s' job is to make reality match the object configuration, e.g., if you state that there should be three instances of the container recommend_service:1.0.1, then K8s will create those containers. If one of the containers dies (even if you manually kill it yourself), then K8s will make the configuration true again by recreating the killed container somewhere on the cluster.
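You can see this self-healing for yourself once your cluster is up and running later in this tutorial. A quick sketch (the pod name here is made up):
## list the running pods and pick one
kubectl get pods
## kill it manually (made-up pod name)
kubectl delete pod recommend-service-99f796786-r92fv
## list the pods again: a replacement is already being created
kubectl get pods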
There are a lot of different objects defined within K8s; I'll introduce you to just a few. This will be a very brief introduction to K8s objects and their capabilities.
A pod is a group of one or more containers
Think about peas in a pod or a pod of whales. Containers in a pod are managed as a group — they are deployed, replicated, started, and stopped together. We will run our simple service as a single container in a pod. For more information on pods, take a look at the K8s documentation. It's really quite good.
If you deploy a pod to a K8s cluster, then all containers in that specific pod will run on the same node in the cluster. Also, all of the containers in a single pod can address each other as though they are on the same computer (they share an IP address and port space). This means that they can address each other via `localhost`. And, of course, their ports can't overlap.
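We won't need to write pod manifests by hand in this tutorial (the kubectl commands later on generate the objects for us), but to give you a feel for the configuration, here is a minimal sketch of a pod manifest applied straight from the shell. All names and versions here are illustrative:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: recommend-service
spec:
  containers:
  - name: recommend-service
    image: recommend_service:1.0.1
    ports:
    - containerPort: 8080
EOF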
A deployment is a group of one or more pods
Now, say Webflix has a pod that runs their recommendation service and they hit a spike in traffic. They would need to create a few new pods to handle the traffic. A common approach is to use a deployment for this.
A deployment specifies a pod as well as rules around how that pod is replicated. It could say that we want three replicas of a pod containing recommend_service:1.2.0. It also does clever things like help with rolling out upgrades — if recommend_service:1.3.0 becomes available, you can instruct K8s to take down the recommend_service pods and bring up new pods in a sensible way.
Deployments add a lot of power to pods. To learn more about them, I would like to refer you to the official documentation.
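As with pods, we will create our deployment with a single kubectl command later on, but for reference, a deployment manifest is essentially a pod template plus replication rules. A minimal sketch, again with illustrative names and versions:
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recommend-service
spec:
  replicas: 3                 # how many copies of the pod we want
  selector:
    matchLabels:
      app: recommend-service
  template:                   # the pod that gets replicated
    metadata:
      labels:
        app: recommend-service
    spec:
      containers:
      - name: recommend-service
        image: recommend_service:1.2.0
        ports:
        - containerPort: 8080
EOF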
A service defines networking rules
Containers in a pod can talk to each other as though they are on the same host. But you'll often need one pod to talk to other pods, and you'll also need to expose some of your pods to the big bad Internet, because users outside your cluster will likely need to interact with your applications. This is achieved through services. Services define the network interfaces exposed by pods. Again, I would like to refer you to the official documentation if you need more info.
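To make that concrete, here is a sketch of a service manifest that would expose the deployment above: it selects pods by label and ties an external port to the containers' port. Illustrative names once more:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: recommend-service
spec:
  type: LoadBalancer      # ask the cloud provider for an external load balancer
  selector:
    app: recommend-service
  ports:
  - port: 80              # the port the outside world talks to
    targetPort: 8080      # the port the containers listen on
EOF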
Practical: Deploying our application to Google Cloud
Now we know why a tool like K8s is useful... let's start using it. I've chosen to proceed by making use of Google's Kubernetes engine offering. You can use a different service provider if you want to, run your own cluster, or simulate a cluster using minikube. For the most part, the tutorial will be exactly the same, except for the initial cluster setup and the final cluster cleanup.
Set up our cluster
Visit the Kubernetes engine page. If you are new to Google Cloud, you'll need to sign in with your Google account.
Create or select a project — their user interface is fairly intuitive. You'll need to wait a little while for the K8s API and related services to be activated.
Now visit the Google Cloud Console. At the top of the page, you will see a little terminal button. Click on it to activate the Google Cloud shell.
If you want to use a terminal on your own computer, you can — it just requires a bit more setup. You would need to install the Google Cloud SDK and the kubectl component, and perform some extra configuration that's outside the scope of this text. If you want to go down that route, take a look at this. You'll be interacting with your cluster through use of the gcloud command line utility.
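If you do go the local route, the setup is roughly as follows (a sketch; follow the linked instructions for the details of installing the SDK itself):
## authenticate and pick a default project
gcloud init
## install the kubectl component so you can talk to your cluster
gcloud components install kubectl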
Okay, so back to the shell. Run the following command to create your cluster:
gcloud container clusters create hello-codementor --num-nodes=3 --zone=europe-west1-d
This will create a three-node cluster called hello-codementor. It might take a few minutes. Each node is a compute instance (VM) managed by Google. I've chosen to put the nodes in Western Europe. If you want to see a full list of available zones, you can use the command `gcloud compute zones list`. Alternatively, you can refer to this document for a list of available zones.
Once your cluster has been created successfully, you should see something like this:
WARNING: Currently, node auto repairs are disabled by default. In the future, this will change and they will be enabled by default. Use `--[no-]enable-autorepair` flag to suppress this warning.
WARNING: Starting in Kubernetes v1.10, new clusters will no longer get compute-rw and storage-ro scopes added to what is specified in --scopes (though the latter will remain included in the default --scopes). To use these scopes, add them
explicitly to --scopes. To use the new behavior, set container/new_scopes_behavior property (gcloud config set container/new_scopes_behavior true).
Creating cluster hello-codementor...done.
Created [https://container.googleapis.com/v1/projects/codementor-tutorial/zones/europe-west1-d/clusters/hello-codementor].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/europe-west1-d/hello-codementor?project=codementor-tutorial
kubeconfig entry generated for hello-codementor.
NAME              LOCATION        MASTER_VERSION  MASTER_IP      MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
hello-codementor  europe-west1-d  1.8.10-gke.0    35.205.54.147  n1-standard-1  1.8.10-gke.0  3          RUNNING
To see the individual nodes:
gcloud compute instances list
This will output something like:
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
gke-hello-codementor-default-pool-a471ab46-c0q8 europe-west1-d n1-standard-1 10.132.0.3 35.205.110.255 RUNNING
gke-hello-codementor-default-pool-a471ab46-jv7n europe-west1-d n1-standard-1 10.132.0.4 35.189.247.55 RUNNING
gke-hello-codementor-default-pool-a471ab46-mt54 europe-west1-d n1-standard-1 10.132.0.2 35.190.195.64 RUNNING
Achievement unlocked: K8s cluster!
Upload your Docker image
In part 1, we covered creating a Docker image. Now we need to make our image available to our cluster. This means we need to put our image into a container registry that Google can access. Google Cloud has a container registry built in.
In order to make use of Google's registry, you will need to tag your images appropriately. In the commands below, you will notice that our image tags include `eu.gcr.io`. This is because we are using a zone in Europe. If you chose to put your cluster in a different zone, refer to this document and update your build and push commands appropriately.
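For reference, images destined for Google's registry are tagged according to this pattern (the project name below is made up):
## [HOSTNAME]/[PROJECT-ID]/[IMAGE]:[TAG]
## eg: eu.gcr.io/my-project/codementor-tutorial:v1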
On the Google Cloud shell:
## clone your repo since it's not yet available to this shell. Of course if you
## are using a local install of gcloud then you already have this code
git clone https://gitlab.com/sheena.oconnell/tutorial-codementor-deploying-microservices.git
cd tutorial-codementor-deploying-microservices
## get your project id, we'll need it later
export PROJECT_ID="$(gcloud config get-value project -q)"
## configure docker so it can push properly
gcloud auth configure-docker
## just say yes to whatever it asks
## next up, build the image
docker build -t eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v1 .
## notice the funny name. eu.gcr.io refers to google's container registry in
## Europe. If your cluster is not in Europe (eg you chose a US zone when creating
## your cluster) then you must use a different url. Refer to this document for
## details: https://cloud.google.com/container-registry/docs/pushing-and-pulling
## once you are finished building the image, push it
docker push eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v1
Now we have our application built as an image, and that image is available to our cluster! The next step is to create a deployment and then expose that deployment to the outside world as a service.
Get your application to run as a deployment
Inside the Google Cloud shell, do the following:
kubectl run codementor-tutorial --image=eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v1 --port 8080
The output should be something like:
deployment "codementor-tutorial" created
As in: You just created a K8s deployment!
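A note if you are running a newer version of kubectl: `kubectl run` now creates a bare pod rather than a deployment. If that's what you see, create the deployment explicitly instead:
## equivalent deployment creation on newer kubectl versions
kubectl create deployment codementor-tutorial --image=eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v1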
Now, let's examine what we have done. First, take a look at the deployment we just created:
kubectl get deployments
This will give you a brief summary of the current deployments. The output will look like:
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
codementor-tutorial 1 1 1 1 57s
Now a deployment manages pods. So let's look at those:
kubectl get pods
This outputs:
NAME READY STATUS RESTARTS AGE
codementor-tutorial-99f796786-r92fv 1/1 Running 0 1m
Alright! Now we need to expose our application to the Internet by using a K8s service. This is similarly easy:
kubectl expose deployment codementor-tutorial --type=LoadBalancer --port 80 --target-port 8080
This outputs:
service "codementor-tutorial" exposed
Now, this is a bit more complex. What it does is tell Google that we want a LoadBalancer. Load balancers decide which pods should receive incoming traffic. So if I have a bunch of replicas of my pod running, and a bunch of clients trying to access my application, then Google will use its infrastructure to spread the traffic over my pods. This is a very simplified explanation: there are a few more kinds of services you might want to know about if you are actually building a microservices project.
Remember this line from our Dockerfile?
CMD ["gunicorn", "-b", "0.0.0.0:8080", "main:__hug_wsgi__"]
Our container expects to communicate with the outside world via port 8080. That's the `--target-port`. The port we want the outside world to use is `80`. We just tell the service that we want to tie those two ports together.
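If you'd like to sanity-check the wiring, `kubectl describe` shows the port mapping (among plenty of other details):
kubectl describe service codementor-tutorial
## look for the Port (80) and TargetPort (8080) fields in the output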
Let's take a look at how our service is doing:
kubectl get service -w
Notice the `-w` here. Services take a bit of time to get going because LoadBalancers need some time to wake up. The `-w` stands for watch: whenever the service is updated with new information, that information will be printed to the terminal.
The output is like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.47.240.1 <none> 443/TCP 44m
codementor-tutorial LoadBalancer 10.47.241.244 <pending> 80:30262/TCP 40s
## time passes before the next line is printed out
codementor-tutorial LoadBalancer 10.47.241.244 35.205.44.169 80:30262/TCP 55s
Press `Ctrl+C` to stop watching the services.
Now let's access the application from your local computer.
## copy the external IP address from the service output above
export EXTERNAL_IP="35.205.44.169"
curl ${EXTERNAL_IP}/index
This outputs:
{"codementor": "so delicious"}
Awesome! We have our image running as a container in a cluster hosted on Google Kubernetes Engine! I don't know about you, but I find this pretty exciting. But we have only just started to scratch the surface of what K8s can do.
Scale Up
Let's say our application has proven to be seriously delicious, so a lot of traffic is coming our way. Cool! Time to scale up. We need to update our deployment object so that it expects three pods.
## first tell k8s to scale it up
kubectl scale deployment codementor-tutorial --replicas=3
## now take a look at your handiwork
kubectl get deployment codementor-tutorial
The output is:
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
codementor-tutorial 3 3 3 3 23m
And take a look at the pods:
kubectl get pods
The output is:
NAME READY STATUS RESTARTS AGE
codementor-tutorial-99f796786-2x964 1/1 Running 0 54s
codementor-tutorial-99f796786-r92fv 1/1 Running 0 23m
codementor-tutorial-99f796786-tlnck 1/1 Running 0 54s
Brilliant! So now when someone accesses our K8s service's external IP address, the traffic will be routed by the LoadBalancer to one of the pods. We can handle three times the amount of traffic we could before.
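As a side note, you don't have to pick the replica count by hand. K8s can also scale a deployment within bounds based on CPU usage. A sketch, with arbitrary thresholds:
## keep between 3 and 10 replicas, targeting 80% CPU utilization
kubectl autoscale deployment codementor-tutorial --min=3 --max=10 --cpu-percent=80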
Deploy a new version of your application
Now let's make a new version of our application. Go back to your Google Cloud shell and check out the version2 branch of our application:
git checkout -b version2 origin/version_2
Now let's build and push a new version of our image:
docker build -t eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v2 .
docker push eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v2
Now we tell the deployment to run our new image:
kubectl set image deployment/codementor-tutorial codementor-tutorial=eu.gcr.io/${PROJECT_ID}/codementor-tutorial:v2
And watch our pods:
kubectl get pods -w
And a whole lot of stuff happens over time:
NAME READY STATUS RESTARTS AGE
codementor-tutorial-66c6545dd9-2lnlt 0/1 ContainerCreating 0 42s
codementor-tutorial-66c6545dd9-s44xm 0/1 ContainerCreating 0 42s
codementor-tutorial-99f796786-2x964 1/1 Running 0 24m
codementor-tutorial-99f796786-r92fv 1/1 Running 0 47m
codementor-tutorial-66c6545dd9-2lnlt 1/1 Running 0 46s
codementor-tutorial-99f796786-2x964 1/1 Terminating 0 24m
codementor-tutorial-66c6545dd9-h7vfv 0/1 Pending 0 1s
codementor-tutorial-66c6545dd9-h7vfv 0/1 Pending 0 1s
codementor-tutorial-66c6545dd9-h7vfv 0/1 ContainerCreating 0 1s
codementor-tutorial-99f796786-2x964 0/1 Terminating 0 24m
codementor-tutorial-99f796786-2x964 0/1 Terminating 0 24m
codementor-tutorial-66c6545dd9-s44xm 1/1 Running 0 48s
codementor-tutorial-99f796786-r92fv 1/1 Terminating 0 47m
codementor-tutorial-99f796786-r92fv 0/1 Terminating 0 47m
codementor-tutorial-66c6545dd9-h7vfv 1/1 Running 0 4s
codementor-tutorial-99f796786-2x964 0/1 Terminating 0 24m
codementor-tutorial-99f796786-2x964 0/1 Terminating 0 24m
codementor-tutorial-99f796786-r92fv 0/1 Terminating 0 47m
codementor-tutorial-99f796786-r92fv 0/1 Terminating 0 47m
Eventually, new things stop happening. Now press `Ctrl+C` and run get pods again:
kubectl get pods
This outputs:
NAME READY STATUS RESTARTS AGE
codementor-tutorial-66c6545dd9-2lnlt 1/1 Running 0 2m
codementor-tutorial-66c6545dd9-h7vfv 1/1 Running 0 1m
codementor-tutorial-66c6545dd9-s44xm 1/1 Running 0 2m
So what just happened? We started off with three pods running codementor-tutorial:v1. Now we have three pods running codementor-tutorial:v2. K8s doesn't just kill all the running pods and create all of the new ones at the same time. It is possible to make that happen, but it would mean that the application would be down for a little while.
Instead, K8s will start bringing new pods online while terminating old ones and rebalancing traffic. That means that anyone accessing your application will be able to keep accessing it while the update is being rolled out. This is, of course, only part of the story. K8s is capable of performing all sorts of health and readiness checks along the way, and can be configured to roll things out in different ways.
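Two subcommands are worth knowing while a rollout like this is in flight: one follows its progress, and one backs out of a bad release:
## follow the rollout until it completes
kubectl rollout status deployment/codementor-tutorial
## if the new version misbehaves, roll back to the previous one
kubectl rollout undo deployment/codementor-tutorial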
Let's go back to our local terminal and see what our application is doing now:
curl ${EXTERNAL_IP}/index
This outputs:
{"codementor": "so delicious", "why": "because we have the experts"}
IMPORTANT
Clusters cost money! You'll want to shut yours down when you are done with it. We'll talk about cleanup at the end of the next part of this series. If you aren't keen to dive into CI/CD just yet, it's totally fine to just skip to the section on cleanup.
Conclusion
Well done for getting this far! Seriously, you've learned a lot. In this part of the series, we set up a K8s cluster on Google Cloud, and we covered some basic K8s object configuration skills, including scaling and upgrading pods on the fly. That's a lot of ground we covered! Give yourself a pat on the back.
The final post in this series will cover setting up a basic CI/CD pipeline for our application, as well as the all-important cleanup.