Collecting traces from EKS with ADOT

EDRANS Stories
Jan 25, 2022

By Mike Bendolini, Cloud Engineer

This article explains how to use the built-in SDK instrumentation in AWS Distro for OpenTelemetry to collect traces from an application running in EKS, and how to use AWS X-Ray to analyze and understand those traces. It is based on the experience of the “Observability and beyond” EKS workshop at AWS re:Invent 2021.

OpenTelemetry

OpenTelemetry is an open-source project: a collection of tools, APIs, and SDKs used to instrument services and measure their performance by collecting and exporting telemetry data (metrics, logs and traces). It replaces OpenTracing and OpenCensus, the two projects that merged to form it, under the sponsorship of the Cloud Native Computing Foundation (CNCF).

In summary, it defines a centralized collector service that you can use to collect telemetry data from your applications and services, and it includes exporters to send that data to the observability platform of your choice.

AWS Distro for OpenTelemetry (ADOT)

AWS Distro for OpenTelemetry is a secure, production-ready, AWS-supported distribution of the OpenTelemetry project. AWS Distro for OpenTelemetry also collects metadata from your AWS resources and managed services, so you can correlate application performance data with underlying infrastructure data.

Image from official ADOT documentation

AWS X-Ray

AWS X-Ray helps you analyze and debug distributed applications such as those built using a microservices architecture. It is useful to understand how your application and its underlying services are performing, to identify and troubleshoot the root cause of performance issues and errors.

Image from official AWS X-Ray documentation

OTEL collector permissions

In this section we will explain how to allow ADOT to use Container Insights, which collects container metrics and analyzes them along with other metrics in Amazon CloudWatch.

For this walkthrough, we will be using eksctl, Weaveworks’ CLI for EKS (see the eksctl documentation for installation instructions).

Configure permissions

The most common issues in AWS implementations are often related to permissions and how to handle them. That is why this section focuses on what the collector actually needs in order to work securely and interact with the cloud services involved.

If you haven’t done this yet, save your cluster’s region and account ID as environment variables:

export AWS_REGION=<CLUSTER_REGION>
export ACCOUNT_ID=<ACCOUNT_ID>

Where <CLUSTER_REGION> is the name of the AWS region that hosts the EKS cluster you would like to trace, and <ACCOUNT_ID> is the ID of the AWS account that owns it.
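If one of these variables is empty, later commands can silently produce malformed ARNs, so it may help to check them up front. The following is a minimal sketch and not part of the original workshop: `require_env` is a hypothetical helper name, and the closing comment shows one way to look up the account ID with the AWS CLI.

```shell
# Hypothetical helper (not from the workshop): returns non-zero and names
# every variable in the argument list that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# If you do not know the account ID, the AWS CLI can report it:
#   export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
```

Running `require_env AWS_REGION ACCOUNT_ID` before continuing stops early with a readable message instead of failing halfway through the IAM setup.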

Use this command to list all your EKS clusters:

aws eks --region $AWS_REGION list-clusters --output=json

Copy the name of the cluster you plan to trace and export it as an environment variable as well:

export CLUSTER_NAME=<SELECTED_CLUSTER>

Where <SELECTED_CLUSTER> references the name of the cluster selected in the previous step

Enable the IAM OIDC provider on the EKS cluster:

eksctl utils associate-iam-oidc-provider --region=$AWS_REGION \
--cluster=$CLUSTER_NAME \
--approve
export OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --region=$AWS_REGION \
--query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
echo -e "OIDC_PROVIDER: $OIDC_PROVIDER"
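The `sed` expression in that command only strips the `https://` scheme from the issuer URL, because IAM expects the provider as a bare host and path. A small offline sketch of the transformation, using a made-up issuer URL (the ID is a placeholder, not a real cluster):

```shell
# Made-up issuer URL in the shape EKS returns; the ID is a placeholder.
issuer="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Same sed expression as in the command above: drop the leading "https://".
OIDC_PROVIDER=$(printf '%s\n' "$issuer" | sed -e "s/^https:\/\///")
echo "OIDC_PROVIDER: $OIDC_PROVIDER"
```

The resulting value is what later ends up inside the `Federated` principal and the `:sub` condition key of the trust policy.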

Create the IAM policy for the collector:

# Write the policy document with the permissions the collector needs
cat << EOF > AWSDistroOpenTelemetryPolicy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:PutLogEvents",
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:DescribeLogStreams",
                "logs:DescribeLogGroups",
                "xray:PutTraceSegments",
                "xray:PutTelemetryRecords",
                "xray:GetSamplingRules",
                "xray:GetSamplingTargets",
                "xray:GetSamplingStatisticSummaries",
                "cloudwatch:PutMetricData",
                "ec2:DescribeVolumes",
                "ec2:DescribeTags",
                "ssm:GetParameters"
            ],
            "Resource": "*"
        }
    ]
}
EOF
# Create the IAM policy from the document
aws iam create-policy --region=${AWS_REGION} \
    --policy-name AWSDistroOpenTelemetryPolicy \
    --policy-document file://AWSDistroOpenTelemetryPolicy.json
# Look up the ARN of the new policy (--scope Local lists only customer managed policies)
export ADOT_IAMPOLICY_ARN=$(aws iam list-policies --scope Local --region=${AWS_REGION} | jq '.Policies | .[] | select(.PolicyName == "AWSDistroOpenTelemetryPolicy").Arn' --raw-output)
echo -e "Created ADOT IAM Policy: $ADOT_IAMPOLICY_ARN"
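Listing policies and filtering with jq works, but the ARN of a customer managed policy is also predictable from the account ID and the policy name. The sketch below builds it directly. It assumes the policy was created with the default path and in the standard `aws` partition, and the account ID fallback is only a placeholder.

```shell
# Placeholder account ID, used only if ACCOUNT_ID is not already exported.
ACCOUNT_ID="${ACCOUNT_ID:-123456789012}"

# Assumes the default policy path ("/") and the standard "aws" partition
# (not aws-cn or aws-us-gov).
ADOT_IAMPOLICY_ARN="arn:aws:iam::${ACCOUNT_ID}:policy/AWSDistroOpenTelemetryPolicy"
export ADOT_IAMPOLICY_ARN
echo "ADOT IAM Policy: $ADOT_IAMPOLICY_ARN"
```

This avoids the dependency on jq, at the cost of not confirming that the policy actually exists.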

Now, create an IAM role for the collector and attach the previous policy to it:

# Trust policy: only the aws-otel-sa service account in the aws-otel-eks
# namespace may assume the role through the cluster's OIDC provider
read -r -d '' TRUST_RELATIONSHIP <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "${OIDC_PROVIDER}:sub": "system:serviceaccount:aws-otel-eks:aws-otel-sa"
                }
            }
        }
    ]
}
EOF
echo "${TRUST_RELATIONSHIP}" > trust.json
# Create the role with the trust policy
aws iam create-role --role-name AWSDistroOpenTelemetryRole \
    --assume-role-policy-document file://trust.json \
    --description "IAM Role for ADOT"
# Attach the collector policy created in the previous step
aws iam attach-role-policy --role-name AWSDistroOpenTelemetryRole \
    --policy-arn=$ADOT_IAMPOLICY_ARN
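The trust policy only lets the service account `aws-otel-sa` in the namespace `aws-otel-eks` assume the role, and EKS links a service account to an IAM role through the `eks.amazonaws.com/role-arn` annotation. The collector manifest you deploy may already take care of this. If it does not, a sketch along these lines could generate the missing pieces. The file name and the fallback account ID are assumptions made for illustration.

```shell
# Placeholder account ID, used only if ACCOUNT_ID is not already exported.
ACCOUNT_ID="${ACCOUNT_ID:-123456789012}"

# Hypothetical file name. Namespace and service account name must match the
# "sub" condition in the trust policy.
cat > aws-otel-sa.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: aws-otel-eks
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: aws-otel-sa
  namespace: aws-otel-eks
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::${ACCOUNT_ID}:role/AWSDistroOpenTelemetryRole
EOF
echo "Wrote aws-otel-sa.yaml; apply it with: kubectl apply -f aws-otel-sa.yaml"
```

Pods that run under this service account receive temporary credentials for the role, so no access keys need to be stored in the cluster.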

Now that we have all the permissions needed for the collector to work, you can go ahead and deploy the collector itself. There is a manifest example in this next link.
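To give an idea of what the collector is configured to do, here is a minimal sketch of a traces pipeline that receives OTLP data and exports it to X-Ray. It is an illustration under assumptions, not the workshop's manifest: the file name is hypothetical, the OTLP receiver is assumed as the entry point, and the region falls back to a placeholder.

```shell
# Placeholder region, used only if AWS_REGION is not already exported.
AWS_REGION="${AWS_REGION:-us-east-1}"

# Hypothetical file name; in Kubernetes this content would typically be
# carried in a ConfigMap mounted by the collector.
cat > otel-traces-config.yaml <<EOF
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch:
exporters:
  awsxray:
    region: ${AWS_REGION}
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsxray]
EOF
echo "Wrote otel-traces-config.yaml"
```

The `service.pipelines` section is what wires receivers, processors and exporters together. A Container Insights setup adds a separate metrics pipeline next to this one.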

Container insights and tracing

Once the OTEL collector is installed, it will start sending Container Insights metrics. They can be visualized in CloudWatch as shown in this image:

As shown in the image above, the metrics can be filtered by cluster name, Kubernetes namespace and pod name.

All of these metrics can be used to generate dashboards and even alarms to keep granular track of everything going on inside the EKS cluster.
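As an example of such an alarm, the sketch below wraps `aws cloudwatch put-metric-alarm` in a small function. It assumes the metrics land in the `ContainerInsights` namespace with a `ClusterName` dimension and that `pod_cpu_utilization` is among them, so check the metric names in your own account first. The function name, alarm name and default threshold are made up for illustration.

```shell
# Hypothetical helper: alarm when the average pod CPU utilization of a
# cluster stays above a threshold (percent) for three 5-minute periods.
# Usage: create_pod_cpu_alarm <cluster-name> [threshold]
create_pod_cpu_alarm() {
  cluster="$1"
  threshold="${2:-80}"
  aws cloudwatch put-metric-alarm \
    --region "$AWS_REGION" \
    --alarm-name "${cluster}-pod-cpu-high" \
    --namespace ContainerInsights \
    --metric-name pod_cpu_utilization \
    --dimensions "Name=ClusterName,Value=${cluster}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --threshold "$threshold" \
    --comparison-operator GreaterThanThreshold
}
```

Calling `create_pod_cpu_alarm "$CLUSTER_NAME" 75` would create the alarm. Adding `--alarm-actions` with an SNS topic ARN is the usual way to get notified when it fires.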

The example microservices installed are configured to generate traces and send them to Jaeger, an open-source, cloud-native, end-to-end distributed tracing system designed to monitor and troubleshoot transactions in applications. The traces are generated following the OpenTelemetry specification and use X-Ray compatible IDs and propagators, which allows X-Ray to store and visualize each trace across the services.

As soon as all the pieces mentioned above are configured, we can view the traces that the collector received and exported to X-Ray. This next image shows how the information is visualized on AWS:
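“X-Ray compatible” here means that the first 32 bits of the 128-bit trace ID encode the start time in epoch seconds, which is the layout X-Ray expects. The sketch below shows how such an ID maps to the X-Ray notation. The function name is hypothetical and the sample ID is purely illustrative.

```shell
# Convert a 32-hex-character OpenTelemetry trace ID into X-Ray notation:
# version "1", the first 8 hex characters (epoch seconds), then the other 24.
to_xray_trace_id() {
  otel_id="$1"
  epoch_hex=$(printf '%s' "$otel_id" | cut -c1-8)
  unique_hex=$(printf '%s' "$otel_id" | cut -c9-32)
  printf '1-%s-%s\n' "$epoch_hex" "$unique_hex"
}

to_xray_trace_id 5759e988bd862e3fe1be46a994272793
# prints 1-5759e988-bd862e3fe1be46a994272793
```

An ID generated purely at random would put arbitrary bits where X-Ray expects a timestamp, which is why the instrumentation has to use an X-Ray aware ID generator.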

CloudWatch ServiceLens

In addition to what was already mentioned, there is yet another tool to use to your advantage. CloudWatch ServiceLens ties together CloudWatch metrics and logs with traces from AWS X-Ray to give a complete view of your applications and their dependencies.

This next image provides an example of what can be visualized for both Container Insights metrics and X-Ray traces generated by the OpenTelemetry Collector:

Wrapping up

Monitoring application traces is possibly one of the best ways to discover application issues, improve performance and even spark ideas for new features. In the modern technological world that we work in, where shared responsibilities are part of any successful team, we should never forget that modern infrastructure setups should always include strong observability over the applications that run on them.

The solution addressed in this article shows how to easily keep an eye on traces with a set of maintained services and open source projects: ADOT + Jaeger + AWS X-Ray. This is also a clear commitment from EDRANS and AWS to help our customers exceed their own expectations by introducing cutting-edge systems that allow cross-platform monitoring for every single resource that interacts with the application and the infrastructure on which it is running.

References and more

AWS Distro for OpenTelemetry official documentation: https://aws-otel.github.io/docs/introduction

AWS X-Ray official documentation: https://docs.aws.amazon.com/es_es/es_es/xray/latest/devguide/aws-xray.html

Observability & Beyond With Container-Based Services on AWS, the re:Invent 2021 workshop that (heavily) inspired this article: https://con204.github.io/advanced/350_opentelemetry/introduction/
