Series: Kubernetes Services Deployment, Autoscaling and Monitoring, Part 1

At NearForm we are always learning and investigating new ways to work with the open source projects we use in our client projects. The nature of open source projects means that they are constantly being upgraded and improved, with new features and platforms appearing regularly. For this reason it is crucial for our developers to stay abreast of the latest trends, technologies and best practices.

Our DevOps community recently did some research on Kubernetes services deployment, autoscaling and monitoring, and we are sharing the results of their investigation here.

  1. Deploying and Autoscaling Kubernetes with Knative
  2. Autoscaling Kubernetes Services with Keda
  3. Monitoring and Tracing Kubernetes Services with Otel

Kubernetes Overview

Kubernetes (K8s) is an open source platform for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience running production workloads at Google, combined with best-of-breed ideas and practices from the community.

Kubernetes provides a powerful API that enables third-party tools to provision, secure, connect and manage workloads on any cloud or infrastructure platform. It also includes features that help you automate control plane operations such as rollout cadence, versioning and rollback.

Kubernetes is ideal for organizations that want to run containerized applications at scale.

Many organizations now run entire business functions in containers instead of traditional applications. This shift requires changes in how IT operates – from managing virtual machines to managing container orchestrators like Kubernetes. Business leaders are demanding more agility from their IT departments to support fast-moving projects with new technologies like microservices architecture, serverless computing and cloud native computing platforms like OpenShift Container Platform or Azure Kubernetes Service (AKS).

What is Knative?

Knative is a platform that enables serverless cloud native applications to run inside Kubernetes clusters. To do that, Knative provides tools that make application management, build and deployment as easy as possible, so developers can focus on the code without needing to worry about setting up complex infrastructure.

It was originally created by Google, with contributions from several other companies, and is now an open source project hosted by the Cloud Native Computing Foundation (CNCF).

Exploring Knative

Knative offers two main solutions for serverless Kubernetes-based applications:

  • Knative Serving:
    enables serverless workloads for applications inside Kubernetes
  • Knative Eventing:
    enables you to use an event-driven architecture with your serverless application

This article will focus only on the Knative Serving solution.


First of all, follow the steps described in the Knative installation documentation to install Knative on your cluster. Choose the installation option that best suits your needs. Be aware that, if you are installing in a pre-existing Kubernetes cluster, Knative needs a networking layer, such as a service mesh, in order to function properly. So, if you don’t have one configured, you will need to install one (the documentation has a few examples that you can use).


Knative Serving is the solution used to enable serverless workloads. In order for the solution to be able to define and control how serverless workloads behave and manage the underlying Kubernetes objects, Knative defines a set of Kubernetes Custom Resource Definitions (CRDs).

The Custom Resource Definitions are:
  • Services:
    Manages the entire lifecycle of your workload and is the main resource, since it also controls the creation of the other resources and ensures that they are working properly
  • Routes:
    Maps a network endpoint to one or more revisions
  • Configurations:
    Resource used to maintain the desired state of your deployment by creating a new revision when the configuration changes
  • Revisions:
    Point-in-time snapshot of the code and the configuration

You can find more detailed information regarding the CRDs managed by Knative in its documentation.

Benefits of Knative

Knative comes with a collection of solutions that aim to give you a more robust way to manage your cloud native applications out of the box, while reducing the complexity of doing so. Below we list a few benefits of using Knative in your infrastructure.


Knative provides autoscaling for the K8s pods managed by Knative Services (CRD).

Knative implements an autoscaling solution called Knative Pod Autoscaler (KPA) that you can use with your applications, providing the features below:

  • Scale-To-Zero:
    Knative uses the Knative Pod Autoscaler (KPA) by default. With KPA you can scale your application down to zero pods when it is not receiving any traffic.
  • Concurrency:
    You can use the Concurrency configuration to determine how many simultaneous connections each pod can process at any given time. If the number of requests exceeds the threshold for each pod, Knative will scale up the number of pods.
  • Requests Per Second:
    You can also use Knative to define how many requests per second each pod can handle. If the number of requests per second exceeds the threshold, Knative will scale up the number of pods.
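To make the concurrency setting above more concrete, here is a minimal sketch of a Knative Service that sets both a soft concurrency target and a hard per-pod limit. The service name `hello`, the image and the numeric values are placeholders chosen for illustration, not values from our investigation.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Soft target: the autoscaler aims for about 10 in-flight requests per pod.
        autoscaling.knative.dev/target: "10"
    spec:
      # Hard limit: a single pod is never sent more than 50 concurrent requests.
      containerConcurrency: 50
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```

With no further configuration, a service like this is still allowed to scale to zero when it receives no traffic, since that is the KPA default.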

You can also use the Horizontal Pod Autoscaler (HPA) with Knative, but HPA and KPA can’t be used together for the same service. HPA support is not installed by Knative; if you want to use it, you need to install it separately. KPA is used by default, but you can control which type of autoscaler to use through annotations in the service definitions.

For HPA:

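A minimal sketch of selecting HPA, assuming a hypothetical service called `hello` and a placeholder image. The annotation goes on the revision template, not on the Service metadata.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Use the Horizontal Pod Autoscaler for this service's revisions.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```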

For KPA:

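A minimal sketch of selecting KPA explicitly, again with a hypothetical service name and placeholder image. Since KPA is the default, this annotation is only needed if the cluster default has been changed.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Use the Knative Pod Autoscaler (the default class).
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```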

Using a similar approach, you can define the type of metric you want to use to autoscale your service and also determine the target to be reached in order to trigger it.

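As a sketch of the metric and target annotations, the service below asks the autoscaler to scale on requests per second and to aim for roughly 150 requests per second per pod. The service name, image and target value are illustrative assumptions.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Metric the autoscaler watches ("concurrency" is the KPA default).
        autoscaling.knative.dev/metric: "rps"
        # Value per pod that the autoscaler tries to maintain for that metric.
        autoscaling.knative.dev/target: "150"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```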

See the Knative documentation to learn more about autoscaling in Knative.

Traffic Management

With this feature, you can manage how traffic is routed to different revisions of your configuration by making only a few changes in a yaml file. Thanks to this, you can use deployment strategies that would be hard to manage if you were only using plain Kubernetes objects, like:

Blue/Green Deployments:
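A minimal sketch of what a blue/green setup could look like in the traffic section of a Knative Service. The revision names `hello-00001` and `hello-00002`, the tag and the image are hypothetical. The "green" revision receives no live traffic, but the tag gives it its own URL so it can be tested before the switch.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
  traffic:
    - revisionName: hello-00001    # "blue": currently serving all traffic
      percent: 100
    - revisionName: hello-00002    # "green": deployed but receiving no live traffic
      percent: 0
      tag: green                   # exposes a dedicated URL for testing
```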

You can use the kn CLI command `kn service update <service-name> --traffic <revision-name>=<percent>` to split traffic between revisions, where:

  • <service-name> is the name of the Knative Service that you are configuring traffic routing for.
  • <revision-name> is the name of the revision that you want to configure to receive a percentage of traffic.
  • <percent> is the percentage of traffic that you want to send to the revision specified by <revision-name>.



Alternatively, you can perform canary deployments by editing the traffic section of the service's yaml configuration file.

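A minimal sketch of a canary split, assuming a hypothetical service `hello` with two existing revisions. The revision names, image and percentages are placeholders; the percentages must add up to 100.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
  traffic:
    - revisionName: hello-00001    # stable revision keeps most of the traffic
      percent: 90
    - revisionName: hello-00002    # canary revision receives a small share
      percent: 10
```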

You can gradually update the percent values by applying the yaml file changes with the `kubectl apply` command.

With this approach you can rotate the revision versions using Canary and Blue/Green deployments.

You can also achieve the same behavior by managing another Knative resource: routes.

This approach will avoid changing the Knative Service yaml file.

Route example:

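A minimal sketch of a Route that sends all traffic to the first revision. The angle-bracket values are placeholders that you must replace before applying the file; the `default` namespace is an assumption.

```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: <route-name>
  namespace: default               # assumed namespace
spec:
  traffic:
    - revisionName: <first-revision-name>
      percent: 100                 # all traffic goes to the first revision
```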
  • <route-name> is the name you choose for your route.
  • <first-revision-name> is the name of the initial Revision from the previous step.


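As a sketch of the next step, the Route can list a candidate revision with no live traffic so that it can be validated first. `<second-revision-name>` and the tag `v2` are hypothetical names introduced here for illustration.

```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: <route-name>
  namespace: default               # assumed namespace
spec:
  traffic:
    - revisionName: <first-revision-name>
      percent: 100                 # live traffic stays on the first revision
    - revisionName: <second-revision-name>
      percent: 0                   # candidate receives no live traffic yet
      tag: v2                      # tagged URL for validating the candidate
```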

Once the candidate revision is validated, you can rotate all the traffic to it and perform a Blue/Green deployment:

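A minimal sketch of the cutover, under the same assumptions (placeholder revision names, hypothetical tags, assumed namespace). The old revision stays listed at 0% so that rolling back is a matter of swapping the percentages again.

```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: <route-name>
  namespace: default               # assumed namespace
spec:
  traffic:
    - revisionName: <first-revision-name>
      percent: 0                   # old revision kept for a quick rollback
      tag: v1
    - revisionName: <second-revision-name>
      percent: 100                 # all traffic now goes to the new revision
```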

Simpler Configuration

Last but not least, Knative enables you to provision a fairly complex setup for your application using only a few lines of configuration.

In this first example we define all the needed components to deploy a simple demonstration app in Kubernetes.

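The manifest below is an illustrative sketch rather than the exact file from our investigation. It assumes a hypothetical app called `hello`, a placeholder image listening on port 8080, and CPU-based autoscaling, and it declares a Deployment, a Service and a HorizontalPodAutoscaler.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello                      # hypothetical app name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: ghcr.io/knative/helloworld-go:latest   # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m            # needed for CPU utilization based scaling
---
apiVersion: v1
kind: Service
metadata:
  name: hello
spec:
  selector:
    app: hello
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```

Even this sketch leaves out external routing (for example an Ingress), which would be one more object to declare and maintain.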

As you can see, all the components need to be declared. As you add more configurations you may need to split this into multiple files, and as you add more services you need to manage and create multiple files for them as well.

In this second example, we are using Knative to achieve the same result as above.

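A minimal sketch of the Knative equivalent, using the same hypothetical app name and placeholder image. Knative derives the Configuration, Revision, Route and the underlying Kubernetes objects from this single resource, and autoscaling is handled by KPA by default.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical app name
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
          ports:
            - containerPort: 8080
```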

As you can see, in only a few lines, all the resources can be created. This is only possible because Knative creates the CRDs that are used to deploy and manage these underlying resources.


Knative supports different popular tools for collecting metrics:

Grafana dashboards are available for metrics collected directly with Prometheus.

You can also set up the OpenTelemetry Collector to receive metrics from Knative components and distribute them to other metrics providers that support OpenTelemetry.

You can’t use the OpenTelemetry Collector and Prometheus at the same time. The default metrics backend is Prometheus. See “Understanding the Collector” in the Knative documentation for more information.

Knative comes with pre-configured monitoring components.

In an environment with Prometheus and Grafana, the metrics can be exported to Prometheus and presented in a Grafana Dashboard.


If you want to use serverless applications but don’t know how to manage them properly without adding more complexity to your setup, Knative could be your answer. It provides solutions that help you implement and manage these applications and improve your deployments without much effort.
