
Understanding GKE VPC-Native Cluster Pod Networking

Published Oct 02, 2019

Let's use some Linux and kubectl tools to trace the basic GKE pod communication flow.

Before you create a GKE cluster, you need to create a subnet under a given VPC in the Google Cloud console.

[Screenshot: creating the subnet in the Google Cloud console]

I have given the subnet a primary range of 10.120.20.0/24, and this primary range is assigned to any worker nodes we add to the cluster.
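
If you prefer the command line, the same subnet can be created with gcloud; the VPC, region, and subnet names below are placeholders for illustration, so replace them with your own.

# create the subnet with the primary range used by the worker nodes
gcloud compute networks subnets create my-gke-subnet \
    --network=my-vpc \
    --region=us-central1 \
    --range=10.120.20.0/24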

Once the cluster is created, you will notice that two secondary IP ranges have been added to the subnet.

[Screenshot: the subnet with its two secondary IP ranges]

These secondary ranges are used for Pods and Services.

In our cluster, the 10.12.0.0/14 secondary range is used for Pods and the 10.204.0.0/20 secondary range is used for Service endpoints.
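
You can confirm these ranges from the command line as well; a quick check might look like the following (the subnet, cluster, and zone names are placeholders).

# show the subnet's primary and secondary ranges
gcloud compute networks subnets describe my-gke-subnet \
    --region=us-central1 \
    --format="yaml(ipCidrRange,secondaryIpRanges)"

# show which ranges the cluster uses for pods and services
gcloud container clusters describe my-cluster \
    --zone=us-central1-a \
    --format="value(clusterIpv4Cidr,servicesIpv4Cidr)"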

Assuming we have a GKE cluster with the above details, let's understand pod communication.

I also have an instance with cluster-admin privileges on which I have installed kubectl; let's gather some details from that instance.

  1. Find the nodes

kubectl get nodes -o wide | awk '{print $1 "\t" $2 "\t" $6}'
[Screenshot: node names, status, and internal IPs]

So we have three nodes:

Node1 = gke-cluster-standard-clus-default-pool-7cf8a05e-ksb0
Node2 = gke-cluster-standard-clus-default-pool-17b978bb-wpm6
Node3 = gke-cluster-standard-clus-default-pool-0e999ebf-rs5w
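
GKE also hands each node a /24 slice of the pod secondary range (which is why the cbr0 bridge we will see later owns 10.12.2.1/24). One way to see which slice each node received, assuming your kubectl context points at this cluster, is:

# print each node's name and its pod CIDR slice
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'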

  2. Let's create a deployment with 6 replicas. Create the following file, deployment.yaml, and apply it to the cluster (a sample manifest is sketched after this list).
    [Screenshot: deployment.yaml]

  3. kubectl get po -o wide | grep nginx-deployment | awk '{print $1 "\t" $2 "\t" $3 "\t" $6 "\t" $7}'
    [Screenshot: pod names, status, IPs, and the nodes they run on]
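
The deployment manifest itself is only shown in the screenshot referenced in step 2; a minimal sketch that matches the pod names above (an nginx Deployment named nginx-deployment with 6 replicas) would look something like this:

# write a minimal manifest to deployment.yaml
cat <<'EOF' > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
EOF

# create the deployment
kubectl apply -f deployment.yaml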

So each worker node is running two of the pods:

Node1=nginx-deployment-76bf4969df-vl6jk (Pod1), nginx-deployment-76bf4969df-vzs4r (Pod2)
Node2=nginx-deployment-76bf4969df-89qkh (Pod3), nginx-deployment-76bf4969df-fk2m6 (Pod4)
Node3=nginx-deployment-76bf4969df-2kpxz (Pod5), nginx-deployment-76bf4969df-95bxn (Pod6)

Let's focus on the pods running on Node1. From the output above we know Pod1 and Pod2 have the IP addresses 10.12.2.57 and 10.12.2.56 respectively.

Let's check these pods' network interfaces in detail with the following commands:

kubectl exec -it nginx-deployment-76bf4969df-vl6jk -- ip a
[Screenshot: ip a output inside Pod1, showing eth0@if60]

kubectl exec -it nginx-deployment-76bf4969df-vzs4r -- ip a
[Screenshot: ip a output inside Pod2, showing eth0@if59]

eth0@if60 indicates that Pod1's eth0 interface is paired with the host (Node1) network interface whose index is 60.
Similarly, eth0@if59 indicates that Pod2's eth0 interface is paired with the host interface at index 59.
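
If the @ifNN suffix is not visible in your environment, the peer index can also be read directly from inside the pod; a quick check (using the same pod as above) would be:

# print the ifindex of the host-side veth peer for the pod's eth0
kubectl exec -it nginx-deployment-76bf4969df-vl6jk -- cat /sys/class/net/eth0/iflink

The number printed should match the index of the corresponding veth entry in the node's ip a output below.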

Fine! Now let's go to the terminal of Node1 and execute the following command to find what these interface indexes 59 and 60 refer to:

ip a
[Screenshots: ip a output on Node1 showing the veth interfaces at indexes 59 and 60]

Now it is clear that Pod1's and Pod2's Ethernet interfaces are paired with virtual Ethernet (veth) interfaces on the host node (Node1).

The command below also shows that these veth interfaces, veth6758a8b4 and veth3d227386, are attached to the bridge cbr0, which has the IP address 10.12.2.1/24.

brctl show
[Screenshot: brctl show output listing veth6758a8b4 and veth3d227386 under cbr0]
Note: install bridge-utils if brctl is not already available on the node.
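
If you would rather not install anything on the node, iproute2 (the ip tool we already used above) can show the same relationship:

# list the interfaces attached to the cbr0 bridge
ip link show master cbr0

# confirm the bridge itself owns 10.12.2.1/24
ip addr show dev cbr0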

Based on the above findings, the pod communication diagram for a worker node looks like this:

[Diagram: Pod1 and Pod2 connected through veth pairs to the cbr0 bridge on Node1]

In the same way, the diagram for Node2 looks like this:

[Diagram: pod networking on Node2]

GKE worker nodes are Compute Engine instances in the VPC, so they automatically communicate with each other over the primary subnet of the VPC.

But if Pod1 on Node1 needs to communicate with Pod3 on Node2, there must be routing for the pod ranges in the cluster.

While creating a GKE cluster, there is a recommended option to create the cluster with **VPC-native (alias IP)** networking.

[Screenshot: the VPC-native (alias IP) option in the GKE cluster creation form]

Google describes the benefit of alias IPs in a VPC-native cluster as follows:

"Pod IP addresses are natively routable within the GCP network (including via VPC Network Peering), and no longer use up route quota."

Please refer to this link

Pod ranges use something called alias IP addresses, which are natively routable: the logic to route them is built into the networking layer, so there is no need to configure any additional routes. In this mode a cluster can scale up to 1,000 nodes with VPC-native networking.

Alias IP addresses are VPC-aware, and routing for them is handled by the VPC itself, so there is no need to add manual routes to reach a pod IP. (Refer to this link.)
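
You can see those alias ranges directly on a node's Compute Engine instance; each node carries its pod /24 as an alias IP range on its network interface. For example (the zone below is a placeholder, so use the zone your node pool actually runs in):

# show the alias IP range (the node's pod /24) attached to Node1's NIC
gcloud compute instances describe gke-cluster-standard-clus-default-pool-7cf8a05e-ksb0 \
    --zone=us-central1-a \
    --format="value(networkInterfaces[0].aliasIpRanges)"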

Let's try to ping from Pod1 to Pod3:

kubectl exec -it nginx-deployment-76bf4969df-vl6jk -- ping 10.12.0.32

[Screenshot: successful ping from Pod1 to Pod3]

[Screenshot: ping test output]
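
To convince yourself that the packets travel with the pod IPs unchanged (there is no NAT between pods), you could capture the traffic on Node2 while the ping runs; the ICMP packets should show 10.12.2.57 talking to 10.12.0.32 directly. For example, from Node2's terminal:

# watch the ping arrive on Node2 with the original pod source IP
sudo tcpdump -ni eth0 icmp and host 10.12.0.32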

The following diagram illustrates our cluster:
[Diagram: pod-to-pod communication across Node1, Node2, and Node3 over the VPC]

In my next post I will explain the complete overlay networking of a Kubernetes cluster created from scratch.
