Most organizations are moving toward the DevOps model, and as part of this move, they are looking for a simplified deployment solution for their containerized applications. This is where Kubernetes, one of the most popular container orchestration solutions, comes into the picture. Kubernetes solves numerous container deployment problems, but it also brings complexity into the architecture, which is why a managed Kubernetes solution like Amazon Elastic Kubernetes Service (EKS) can help.

Introduction to AWS EKS

Elastic Kubernetes Service (EKS) is the managed Kubernetes solution offered by AWS. AWS takes care of all the heavy lifting, like provisioning the cluster, patching, and performing upgrades. Meanwhile, EKS runs upstream Kubernetes, so it's compatible with existing tooling and plugins. Because EKS uses open-source Kubernetes, you can migrate your on-premises applications to AWS without any code changes. And by using kubectl, you can connect to your EKS cluster just as you would with a self-hosted cluster.
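As a quick illustration of that compatibility, once a cluster exists (we create one later in this article), pointing kubectl at it takes a single AWS CLI call. Here is a minimal sketch, assuming the cluster name and region used throughout this article:

# Write/update the kubeconfig entry for the EKS cluster (cluster name and region assumed)
$ aws eks update-kubeconfig --name my-newtest-cluster --region us-east-2

# Standard kubectl now talks to the managed control plane
$ kubectl get nodes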

EKS Architecture

EKS is a managed control plane (Kubernetes master nodes, API servers, and the etcd layer) that runs three master and three etcd nodes across multiple Availability Zones to ensure high availability. AWS also ensures that all the nodes in the control plane are healthy, automatically detecting and replacing any unhealthy instance. If your workload increases, AWS automatically scales the master nodes and takes care of backups and etcd snapshots. You are only responsible for provisioning and managing the worker nodes.

Introduction to CloudWatch Container Insights 

CloudWatch Container Insights provides you with a single pane of glass for viewing the performance of Elastic Container Service (ECS), Elastic Kubernetes Service (EKS), and self-managed Kubernetes clusters running on EC2. This tool collects, summarizes, and aggregates logs and metrics from your microservices and containerized applications.

CloudWatch collects metrics like memory, CPU, disk space, and network statistics, while Container Insights offers container-specific diagnostics, such as container restart failures. By combining the two, you can even set CloudWatch alarms on the metrics collected by Container Insights.
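For example, here is a minimal sketch of such an alarm created from the AWS CLI. It assumes the node_cpu_utilization metric that Container Insights publishes under the ContainerInsights namespace and the cluster name used later in this article; the threshold and periods are illustrative:

# Alarm when average node CPU across the cluster stays above 80% (illustrative threshold)
$ aws cloudwatch put-metric-alarm \
    --alarm-name eks-node-cpu-high \
    --namespace ContainerInsights \
    --metric-name node_cpu_utilization \
    --dimensions Name=ClusterName,Value=my-newtest-cluster \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 80 --comparison-operator GreaterThanThreshold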

Launching the EKS Cluster 

Launching the EKS cluster requires following a series of steps. But first, there are some prerequisites you need to meet.

1: Installing Eksctl

Eksctl is a command-line tool that simplifies creating and managing Kubernetes clusters on EKS. It's written in Go and uses CloudFormation behind the scenes.

To install eksctl on Linux, use the curl command below, which downloads and extracts the latest version of eksctl:

$ curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp

Now, move the extracted binary to /usr/local/bin (a directory on your PATH):

$ sudo mv /tmp/eksctl /usr/local/bin

Finally, test the installation, and verify the eksctl version:

$ eksctl version

0.29.2

2: Installing AWS CLI

The AWS CLI is a command-line tool for working with AWS services, including Amazon EKS. First, verify whether you already have the AWS CLI installed:

$ aws --version

aws-cli/2.0.24 Python/3.7.3 Linux/5.3.0-1035-aws botocore/2.0.0dev28

If your version is lower than 1.18.157, install or update the AWS CLI with the following commands:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"

unzip awscliv2.zip

sudo ./aws/install

 

Once you have both eksctl and AWS CLI installed, configure your AWS credentials, which are needed for both AWS CLI and eksctl. To do this, use the following command:

$ aws configure

AWS Access Key ID [None]: <ABDEFGHIJEXAMPLE>

AWS Secret Access Key [None]: <aBCdswfwffesssEXAMPLEKEY>

Default region name [None]: <region-code>

Default output format [None]: <json>

For more information on how to create IAM keys, check out Amazon’s own documentation.
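Once configured, you can quickly confirm that the credentials work by asking STS who you are:

# Returns the AWS account and IAM identity the CLI is currently using
$ aws sts get-caller-identity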

3: Installing Kubectl

Kubectl is the Kubernetes command-line utility used to communicate with the Kubernetes API server. First, download the EKS-vended kubectl binary for Linux:

$ curl -o kubectl https://amazon-eks.s3.us-west-2.amazonaws.com/1.18.8/2020-09-18/bin/linux/amd64/kubectl

Next, give execute permissions to the binary:

$ chmod +x ./kubectl

Move kubectl to your /usr/local/bin (PATH definition):

$ sudo mv ./kubectl /usr/local/bin

Finally, test the installation, and verify the kubectl version:

$ kubectl version --short --client

Client Version: v1.16.8-eks-e16311

With all the prerequisites in place, the next step is to set up the EKS cluster. To do this, run the eksctl command and pass the following options:

  • create cluster tells eksctl to create the EKS cluster for you.
  • --name gives the EKS cluster a name; if you omit it, eksctl automatically generates a random name for your cluster.
  • --version lets you specify the Kubernetes version (valid options: 1.14, 1.15, 1.16, and 1.17, the default).
  • --region is the AWS region.
  • --nodegroup-name is the name of the node group.
  • --node-type is your node instance type (the default is m5.large).
  • --nodes is the total number of worker nodes (the default is 2).
  • --ssh-access enables SSH access to the worker nodes.
  • --ssh-public-key is the key used to access the worker nodes.
  • --managed creates an EKS-managed node group.
$ eksctl create cluster --name my-newtest-cluster --version 1.17 --region us-east-2 --nodegroup-name linux-nodes --node-type t2.xlarge --nodes 2 --ssh-access --ssh-public-key eks-kp-east --managed

[ℹ] eksctl version 0.29.2

[ℹ] using region us-east-2

[ℹ] setting availability zones to [us-east-2c us-east-2a us-east-2b]

[ℹ] subnets for us-east-2c - public:192.168.0.0/19 private:192.168.96.0/19

[ℹ] subnets for us-east-2a - public:192.168.32.0/19 private:192.168.128.0/19

[ℹ] subnets for us-east-2b - public:192.168.64.0/19 private:192.168.160.0/19

[ℹ] using Kubernetes version 1.17

[ℹ] creating EKS cluster "my-newtest-cluster" in "us-east-2" region with managed nodes

[ℹ] will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup

[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-2 --cluster=my-newtest-cluster'

[ℹ] CloudWatch logging will not be enabled for cluster "my-newtest-cluster" in "us-east-2"

[ℹ] you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-east-2 --cluster=my-newtest-cluster'

[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "my-newtest-cluster" in "us-east-2"

[ℹ] 2 sequential tasks: { create cluster control plane "my-newtest-cluster", 2 sequential sub-tasks: { no tasks, create managed nodegroup "linux-nodes" } }

[ℹ] building cluster stack "eksctl-my-newtest-cluster-cluster"

[ℹ] deploying stack "eksctl-my-newtest-cluster-cluster"

[ℹ] building managed nodegroup stack "eksctl-my-newtest-cluster-nodegroup-linux-nodes"

[ℹ] deploying stack "eksctl-my-newtest-cluster-nodegroup-linux-nodes"

[ℹ] waiting for the control plane availability...

[✔] saved kubeconfig as "/home/ubuntu/.kube/config"

[ℹ] no tasks

[✔] all EKS cluster resources for "my-newtest-cluster" have been created

[ℹ] nodegroup "linux-nodes" has 2 node(s)

[ℹ] node "ip-192-168-0-47.us-east-2.compute.internal" is ready

[ℹ] node "ip-192-168-33-177.us-east-2.compute.internal" is ready

[ℹ] waiting for at least 2 node(s) to become ready in "linux-nodes"

[ℹ] nodegroup "linux-nodes" has 2 node(s)

[ℹ] node "ip-192-168-0-47.us-east-2.compute.internal" is ready

[ℹ] node "ip-192-168-33-177.us-east-2.compute.internal" is ready

[ℹ] kubectl command should work with "/home/ubuntu/.kube/config", try 'kubectl get nodes'

[✔] EKS cluster "my-newtest-cluster" in "us-east-2" region is ready

Under the hood, eksctl uses CloudFormation, creating one stack for the EKS control plane and another stack for the worker nodes.
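You can also inspect these stacks from the command line, using the helper that eksctl itself suggests in its output:

# Describe the CloudFormation stacks eksctl created for this cluster
$ eksctl utils describe-stacks --region=us-east-2 --cluster=my-newtest-cluster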

To verify the CloudFormation stack, go to the CloudFormation console:

 

Figure 1: CloudFormation stack for eksctl

Also, eksctl uses an EKS-optimized AMI, provided by AWS and based on Amazon Linux 2. This AMI comes pre-configured with Docker, the kubelet, and the AWS IAM Authenticator. It also runs an EC2 user-data bootstrap script, which automatically takes care of joining worker nodes to the EKS cluster.
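With managed node groups, this bootstrapping is invisible to you; for reference, on a self-managed node the AMI's script would be invoked from EC2 user data roughly as follows (a sketch, with our cluster name assumed):

#!/bin/bash
# Join this instance to the EKS cluster using the script baked into the EKS-optimized AMI
/etc/eks/bootstrap.sh my-newtest-cluster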

To retrieve information about your EKS cluster:

$ eksctl get cluster -n my-newtest-cluster --region us-east-2
NAME   VERSION    STATUS    CREATED   VPC   SUBNETS   SECURITYGROUPS

my-newtest-cluster    1.17    ACTIVE    2020-10-12T20:08:28Z    vpc-0d2906604b24de2cf    subnet-0082550fc650fe1dc,subnet-0659284c126dace40,subnet-0880481a15ba2c034,subnet-0b6502f09b219f233,subnet-0c815157445a42166,subnet-0de531638ba3cf94f    sg-08003683f61932a03

To verify the status of worker nodes:

$ kubectl get nodes
NAME                                           STATUS   ROLES    AGE   VERSION

ip-192-168-0-47.us-east-2.compute.internal     Ready    <none>   34h   v1.17.11-eks-cfdc40

ip-192-168-33-177.us-east-2.compute.internal   Ready    <none>   34h   v1.17.11-eks-cfdc40

Note: Make sure the worker nodes are in Ready status before proceeding.

Set Up CloudWatch Container Insights for AWS EKS

At this stage, you have your EKS cluster up and running; the next step is to install CloudWatch Container Insights. Before doing that, though, you need to attach an additional IAM policy to the worker nodes so that they can push the necessary metrics and logs to CloudWatch.

Go to the EC2 console, and click on the IAM role attached to your selected worker node:

 

Figure 2: Choosing the IAM Role for your worker node

On the IAM role’s page, open the “Permissions” tab, and click on “Attach policies”:


Figure 3: Attaching an IAM Policy to the worker node

In the search box, search for CloudWatchAgentServerPolicy, select the policy, and click on “Attach policy”:

Figure 4: Attaching CloudWatchAgentServerPolicy to the worker IAM role
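If you prefer the command line, the console steps above amount to a single IAM call. This is a sketch; substitute the actual IAM role name attached to your worker nodes:

# Attach the CloudWatch agent policy to the worker node IAM role (role name assumed)
$ aws iam attach-role-policy \
    --role-name <your-worker-node-role> \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy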

 

Deploying Container Insights on EKS

To deploy Container Insights, run the following command:

$ curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/cluster-name/;s/{{region_name}}/cluster-region/" | kubectl apply -f -

Here, cluster-name is the name of your EKS cluster (in this example, my-newtest-cluster), and cluster-region is the region where the logs should be published (in this example, us-east-2).

So, your final command should look like this:

$ curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/my-newtest-cluster/;s/{{region_name}}/us-east-2/" | kubectl apply -f -

When you execute this command, you will see output like this:

namespace/amazon-cloudwatch created

serviceaccount/cloudwatch-agent created

clusterrole.rbac.authorization.k8s.io/cloudwatch-agent-role created

clusterrolebinding.rbac.authorization.k8s.io/cloudwatch-agent-role-binding created

configmap/cwagentconfig created

daemonset.apps/cloudwatch-agent created

configmap/cluster-info created

serviceaccount/fluentd created

clusterrole.rbac.authorization.k8s.io/fluentd-role created

clusterrolebinding.rbac.authorization.k8s.io/fluentd-role-binding created

configmap/fluentd-config created

daemonset.apps/fluentd-cloudwatch created

This command sets up the CloudWatch agent and Fluentd. It creates the amazon-cloudwatch namespace and a service account, creates a ConfigMap for the CloudWatch agent, and deploys the agent as a DaemonSet; the same pattern is then repeated for Fluentd.
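You can see everything the manifest created at a glance:

# List the DaemonSets, ConfigMaps, and service accounts in the new namespace
$ kubectl get daemonsets,configmaps,serviceaccounts -n amazon-cloudwatch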

To verify that the CloudWatch agent and Fluentd pods were created in the amazon-cloudwatch namespace, run the following command:

$ kubectl get pods -n amazon-cloudwatch
NAME                       READY   STATUS    RESTARTS   AGE

cloudwatch-agent-44krf     1/1     Running   0          34h

cloudwatch-agent-f5wkf     1/1     Running   0          34h

fluentd-cloudwatch-lrf2r   1/1     Running   0          34h

fluentd-cloudwatch-qtkdx   1/1     Running   0          34h

 

Key Metrics and Logs Monitored via CloudWatch Container Insights

Once Container Insights is configured, you can access the CloudWatch dashboard, which displays numerous metrics like CPU, memory utilization, disk space, and aggregated network statistics across all the EKS clusters in your account. 

Go to the CloudWatch dashboard. Under Container Insights, click on “Performance monitoring,” and here you can see the various statistics that will help you keep track of your application’s infrastructure.

 

Figure 5: Container Insights for your EKS cluster

 

From the drop-down menu, you can even select metrics at the pod level:

 

Figure 6: Container Insights for EKS Pods

Pod-level metrics help you understand if your application is working as expected, allowing you to catch an issue before it impacts your customers.

 

Figure 7: Container Insights with different metrics for EKS Pods

Similar to the pod level, you can even monitor metrics at the node level. Node-level metrics help you understand if the pods are placed as desired across your EKS cluster, which is helpful for optimal resource utilization.  

Figure 8: Container Insights for EKS Nodes

 

To view all the logs collected from your container environment, go to the CloudWatch dashboard and click on “Container Insights” from the drop-down menu. Under “Actions,” you will see all the logs collected from your container environment.

  • Application logs are your containers’ standard output (stdout) and standard error (stderr) logs, for example, access.log.
  • Control plane logs consist of scheduler logs, API server logs, and audit logs.
  • Data plane logs consist of kubelet and container runtime engine logs.
  • Host logs come from the host level, such as dmesg and secure logs.
  • Performance logs hold the structured metric events that Container Insights collects; the sample queries later in this article run against them. (The sketch just after this list shows how to find the corresponding log groups from the CLI.)
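Each of these categories maps to a CloudWatch log group named after the cluster (for example, /aws/containerinsights/<cluster-name>/performance). A quick way to list them, assuming the cluster name used in this article:

# List the Container Insights log groups created for the cluster
$ aws logs describe-log-groups \
    --log-group-name-prefix /aws/containerinsights/my-newtest-cluster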

Figure 9: Container Insights with different logs

 

You can click any of these logs to get more detailed information as well. 

Figure 10: Container Logs Insights offers detailed information about your logs

You can even run a customized query on top of these logs, for example, to find out the node’s failure count in the EKS cluster. From the drop-down menu, select “performance logs,” as seen in Figure 9, then run the query below:

stats avg(cluster_failed_node_count) as CountOfNodeFailures

| filter Type="Cluster"

| sort @timestamp desc

There are no failed nodes in this example, but if there were any, they would be displayed at the bottom of the screen:

Figure 11: Container Logs Insights provides information about failed nodes

You can also find the error count by container name by selecting “application logs.” This gives you information on the behavior of the containers running in your environment:

stats count() as countoferrors by kubernetes.container_name

| filter stream="stderr"

| sort countoferrors desc

Figure 12: Container Logs Insights provides an error count by container name

Additionally, you can run a query to find out the top 10 pods restarted in the environment based on the performance logs:

stats max(pod_number_of_container_restarts) as Restarts by PodName, kubernetes.pod_name as PodID

| filter Type="Pod"

| sort Restarts desc

| limit 10

Figure 13: Container Logs Insights offers information about restarted pods

Or, you can get information like the number of errors that occur in a specific container:

stats count() as CountOfErrors by kubernetes.namespace_name as Namespace, kubernetes.container_name as ContainerName

| sort CountOfErrors desc

| filter Namespace like "<name_of_the_namespace>" and ContainerName like "<name_of_container>"

You can even find out the amount of data sent and received in KBs by a specific pod every five minutes:

fields pod_interface_network_rx_bytes as Network_bytes_received, pod_interface_network_tx_bytes as Network_bytes_sent

| filter kubernetes.pod_name like 'applyboard-website'

| filter (Network_bytes_received) > 0

| stats sum(Network_bytes_received/1024) as KB_received, sum(Network_bytes_sent/1024) as KB_sent by bin(5m)

| limit 100

As you can see, these queries give you insights into what is going on inside your EKS cluster by retrieving data from CloudWatch logs. 
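If you want to script these checks rather than use the console, the same Logs Insights queries can be run through the AWS CLI. A minimal sketch, reusing the failed-node query against the performance log group (log group name and one-hour window assumed):

# Start a Logs Insights query over the last hour of performance logs
$ aws logs start-query \
    --log-group-name /aws/containerinsights/my-newtest-cluster/performance \
    --start-time $(date -d '1 hour ago' +%s) --end-time $(date +%s) \
    --query-string 'stats avg(cluster_failed_node_count) as CountOfNodeFailures | filter Type="Cluster" | sort @timestamp desc'

# Retrieve the results with the queryId returned above
$ aws logs get-query-results --query-id <query-id>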

Wrapping Up

In modern DevOps, Kubernetes is an indispensable tool for managing your container environment, but it also brings architectural complexity. To address this, most cloud providers offer their own managed solutions. AWS’s offering is EKS, which takes care of all the heavy lifting, such as managing the control plane (Kubernetes master nodes, API servers, and the etcd layer), backups, snapshots, and auto-scaling of master nodes.

You can then integrate EKS with CloudWatch Container Insights to collect key metrics and logs, and run queries on top of them to gain deeper insight into your environment. In this way, you can monitor the behavior of your pods and nodes and take proactive action before they fail.

Read More:

 

Distributed Tracing vs Logging

 

A Complete Guide to Monitoring EKS

 

AWS CloudWatch: Logs and Insights