Deploying Serverless Services on Kubernetes using Knative

Using Knative to deploy serverless applications to Kubernetes


Thanks for visiting this written workshop on deploying serverless services on Kubernetes using Knative.
This lab is designed to give you an idea of what Knative does, how you use the Knative API to deploy applications, and how it relates to Kubernetes.

Assumptions and Prerequisites

  • Basic hands-on experience with Kubernetes.
  • An up-and-running Kubernetes cluster (e.g. GKE, AKS, EKS, …) v1.15 or newer.
  • kubectl installed on your local machine.
  • Go installed on your local machine.

Objectives: what will you learn?

  1. ✅ How to install Knative on a Kubernetes cluster
  2. ✅ Deploy a web application from source to Knative
  3. ✅ Autoscale applications from 0-to-1, 1-to-N, and back to 0
  4. ✅ Knative Serving API types and the relationship between them
  5. ✅ Roll out new versions (blue/green deployments) with Knative Serving API

Before starting

Knative is installed as a set of custom APIs and controllers on Kubernetes. You can easily create a managed Kubernetes cluster with Google Kubernetes Engine (GKE) or Amazon Elastic Kubernetes Service (EKS) and have the cloud provider operate the cluster and autoscaling for you.
In this lab, to deploy serverless Services on Kubernetes using Knative, we’ll be using Azure AKS -Azure’s Managed Kubernetes Service- as our backing K8S cluster.

To create an Azure AKS cluster, preferably use an Infrastructure as Code tool like Terraform. Cluster creation isn’t covered by this lab, but you can follow my IaC lab to deploy AKS with Terraform.

Azure AKS K8S Cluster

Connect to cluster using kubectl

To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. The following example gets credentials for the AKS cluster named myAKSCluster in the myResourceGroup resource group:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

To verify the connection to your cluster, run the kubectl get nodes command to return a list of the cluster nodes.

Now you have a fully-provisioned Kubernetes cluster running in Azure, and you’re ready to install Knative on it!

1- Introduction to Knative

Knative (pronounced kay-nay-tiv) extends Kubernetes to provide a set of middleware components that are essential to build modern, source-centric, and container-based applications that can run anywhere: on premises, in the cloud, or even in a third-party data center.

Knative makes it possible to:

  1. Deploy and serve applications with a higher-level and easier to understand API. These applications automatically scale from zero-to-N, and back to zero, based on requests.
  2. Build and package your application code inside the cluster.
  3. Deliver events to your application. You can define custom event sources and declare subscriptions between event buses and your applications.

Developers on Knative can use familiar idioms, languages, and frameworks to deploy functions, applications, or container workloads.

This is why Knative provides developer experiences similar to serverless platforms. You can read more in the Knative documentation.

Knative is still Kubernetes!

If you deployed applications with Kubernetes before, Knative will feel familiar to you. You will still write YAML manifest files and deploy container images on a Kubernetes cluster.

2- Who is Knative for?

Knative APIs

Kubernetes offers a feature called Custom Resource Definitions (CRDs). With CRDs, third party Kubernetes controllers like Istio or Knative can install more APIs into Kubernetes.

Knative installs two families of custom resource APIs:

  • Knative Serving: Set of APIs that help you host applications that serve traffic. Provides features like custom routing and autoscaling.
  • Knative Eventing: Set of APIs that let you declare event sources and event delivery to your applications. (Not covered in this codelab due to time constraints.)

Together, Knative Serving and Eventing APIs provide a common set of middleware for Kubernetes applications. We will use these APIs to build and run applications.

Is Knative for me?

Knative Audiences (Knative Doc)

Knative serves two main audiences:

1. I want to deploy on Kubernetes more easily:

  • Knative makes it easy to declare an application that auto-scales, without worrying about container parameters like CPU and memory, or concerns like activation/deactivation.
  • You can go from code in a repo to an app running on Knative very easily.

2. I want to build my own PaaS/FaaS on Kubernetes:

  • You can use these Knative components and APIs to build a custom deployment platform that looks like Heroku or AWS Lambda at your company.
  • Knative Serving has many valuable “plumbing” components like the autoscaler, request-based activation, and telemetry.
  • Knative Build lets you declare transformations on the source code, like converting functions to apps, and apps to containers.
  • You don’t have to reinvent the wheel; you can reuse the plumbing components offered by Knative.

Knative principles

  • Knative is native to Kubernetes (APIs are hosted on Kubernetes, deployment unit is container images)
  • You can install/use parts of Knative independently (e.g. only Knative Build, to do in-cluster builds)
  • Knative components are pluggable (e.g. don’t like the autoscaler? write your own)

3- Installing Knative

This guide walks you through the installation of the latest version of Knative. Knative has two components, which can be installed and used independently or together.

Knative also has an Observability plugin (deprecated as of v0.14) which provides standard tooling that can be used to get visibility into the health of the software running on Knative.

Installing the Serving component

# install the CRDs (URL from the Knative Serving release page)
kubectl apply --filename <serving-crds.yaml URL>

# install the core components of Serving (URL from the Knative Serving release page)
kubectl apply --filename <serving-core.yaml URL>

Pick a networking layer: Istio

1- Installing Istioctl

curl -sL https://istio.io/downloadIstioctl | sh -
export PATH=$PATH:$HOME/.istioctl/bin

2- Choosing an Istio installation

When you install Istio, there are a few options depending on your goals. For a basic Istio installation suitable for most Knative use cases, follow the Installing Istio without sidecar injection instructions. If you’re familiar with Istio and know what kind of installation you want, read through the options and choose the installation that suits your needs.

In the context of this lab, we want to get up and running with Knative quickly, so we’ll install Istio without automatic sidecar injection. This install is also recommended for users who don’t need the Istio service mesh, or who want to enable the service mesh by manually injecting the Istio sidecars.

3- Installing Istio

Enter the following command to install Istio:

cat << EOF > ./istio-minimal-operator.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: the Istio security docs on third-party service account tokens.
      jwtPolicy: first-party-jwt

  addonComponents:
    pilot:
      enabled: true
    prometheus:
      enabled: false

  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
      - name: cluster-local-gateway
        enabled: true
        label:
          istio: cluster-local-gateway
          app: cluster-local-gateway
        k8s:
          service:
            type: ClusterIP
            ports:
              - port: 15020
                name: status-port
              - port: 80
                name: http2
              - port: 443
                name: https
EOF

istioctl manifest apply -f istio-minimal-operator.yaml
Istio installation

4- Install the Knative Istio controller:

kubectl apply --filename <net-istio.yaml URL from the Knative release page>

Fetch the External IP or CNAME:

kubectl --namespace istio-system get service istio-ingressgateway

NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                                      AGE
istio-ingressgateway   LoadBalancer   <cluster-ip>   <external-ip>   15021:30344/TCP,80:30876/TCP,443:30002/TCP,15443:32265/TCP   15h

Configure DNS

To configure DNS for Knative, take the External IP or CNAME from setting up networking, and configure it with your DNS provider as follows:

  • If the networking layer produced an External IP address, configure a wildcard A record for the domain:
    # Here knative.<your-domain> is the domain suffix for your cluster
    *.knative.<your-domain> == A <external-ip>
  • If the networking layer produced a CNAME, configure a wildcard CNAME record for the domain:
    # Here knative.<your-domain> is the domain suffix for your cluster
    *.knative.<your-domain> == CNAME <cname>
Create knative record set on Azure

Once your DNS provider has been configured, direct Knative to use that domain:

# Replace with your domain suffix
kubectl patch configmap/config-domain \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"<your-domain-suffix>":""}}'
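The patch above maps your domain suffix to an empty string in the config-domain ConfigMap. A quick way to sanity-check the JSON before applying it (the domain below is a hypothetical example, not from this lab):

```shell
# Build the config-domain patch for a hypothetical domain suffix.
DOMAIN="knative.example.com"   # hypothetical; replace with your own suffix
PATCH="{\"data\":{\"${DOMAIN}\":\"\"}}"
echo "$PATCH"
```

Building the payload in a variable and passing it as `--patch "$PATCH"` avoids shell-quoting mistakes inside the JSON.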

Monitor the Knative components until all of the components show a STATUS of Running or Completed:

|--⫸  kubectl get pods --namespace knative-serving                                                                                                         
NAME                               READY   STATUS    RESTARTS   AGE
activator-76984478f7-2trhj         1/1     Running   0          16h
autoscaler-598d974c99-p42h7        1/1     Running   0          16h
controller-9b998cd47-mbn2l         1/1     Running   0          16h
istio-webhook-69cd874949-nqh7f     1/1     Running   0          15h
networking-istio-df55795c6-n6tsb   1/1     Running   0          15h
webhook-658874f97-x8zd4            1/1     Running   0          16h

Bingo 🎶 Knative is now installed on your cluster!

4- Your first Knative application

To run an application with Knative on a Kubernetes cluster and expose it to the public internet, you need:

  • an application packaged as a container image
  • a Knative Service manifest file

Service definition

To expose an application on Knative, you need to define a Service object. (This is different from the Kubernetes Service type, which helps you set up load balancing for Pods.)

cat << EOF > ./helloworld.yaml
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: "helloworld"
spec:
  runLatest:
    configuration:
      revisionTemplate:
        spec:
          container:
            image: "gcr.io/knative-samples/helloworld-go"
            env:
              - name: "TARGET"
                value: "world"
EOF

kubectl apply -f helloworld.yaml

This Knative Service example uses the gcr.io/knative-samples/helloworld-go container image, which is a Go web application listening on port 8080 (currently the port number required by Knative).

Verify it’s deployed by querying “ksvc” (Knative Service) objects:

|--⫸  kubectl get ksvc
NAME         URL                                         LATESTCREATED      LATESTREADY        READY   REASON
helloworld   http://helloworld.default.<your-dns-zone>   helloworld-2vcx5   helloworld-2vcx5   True

Make a request

External requests to Knative applications in a cluster go through a single public load balancer called istio-ingressgateway which has a public IP address.

Find the external IP: kubectl --namespace istio-system get service istio-ingressgateway

The hostname of the application should be helloworld.default.knative.<your dns zone>.
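Knative composes that hostname from the Service name, its namespace, and the configured domain suffix ({service}.{namespace}.{domain-suffix}). A small sketch, using a hypothetical domain suffix:

```shell
# Knative route hostnames follow {service}.{namespace}.{domain-suffix}.
service="helloworld"
namespace="default"
domain="knative.example.com"   # hypothetical domain suffix
host="${service}.${namespace}.${domain}"
echo "$host"
```

This is why patching config-domain earlier matters: every Service you deploy gets a public hostname under that suffix automatically.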

Now, use curl to make the first request to this function (replace the IP_ADDRESS below with the gateway’s external IP address you found earlier):

|~/run-it-on-cloud/knative 💻
|--⫸  curl -H "Host: helloworld.default.<your-dns-zone>" http://IP_ADDRESS

After you make a request to the helloworld Service, you will see that a Pod is created on the Kubernetes cluster to serve the request. Query the list of Pods deployed:

|--⫸  kubectl get pods                                                                                                                                     
NAME                                           READY   STATUS    RESTARTS   AGE
helloworld-2vcx5-deployment-6b865d74f7-vs8pp   2/2     Running   0          29s

Congratulations! 👏 You’ve just deployed a simple working application to Kubernetes with Knative! The next section explains what happened under the covers😄

5- Introduction to Knative Serving API

When you deploy the helloworld Service to Knative, it creates three kinds of objects: Configuration, Route, and Revision:

|--⫸  kubectl get configuration,revision,route
NAME                                           LATESTCREATED      LATESTREADY        READY   REASON
configuration.serving.knative.dev/helloworld   helloworld-2vcx5   helloworld-2vcx5   True

NAME                                            CONFIG NAME   K8S SERVICE NAME   GENERATION   READY   REASON
revision.serving.knative.dev/helloworld-2vcx5   helloworld    helloworld-2vcx5   1            True

NAME                                   URL                                         READY   REASON
route.serving.knative.dev/helloworld   http://helloworld.default.<your-dns-zone>   True

Here’s what each of these Serving APIs do:

  • Service: Describes an application on Knative.
  • Revision: Read-only snapshot of an application’s image and other settings (created by Configuration).
  • Configuration: Created by Service (from its spec.configuration field). It creates a new Revision when the revisionTemplate field changes.
  • Route: Configures how the traffic coming to the Service should be split between Revisions.
Knative Service’s objects workflow: Google CodeLab

6- Serving multiple versions simultaneously

The helloworld Service had a spec.runLatest field, which sends all the traffic to the latest revision created from the service’s revisionTemplate field. To test out the effects of a new version of your application, you will need to run multiple versions of your application and route a portion of your traffic to the new “canary” version you are testing. This practice is called “blue-green deployment”.
Knative Serving offers the Revision API, which tracks the changes to application configuration, and the Route API, which lets you split the traffic to multiple revisions.

In this exercise you will:

  1. Deploy a “blue” Service version in runLatest mode.
  2. Update Service with “green” configuration and change mode to release to split traffic between two revisions.

Deploying the v1

To try out a blue-green deployment, first you will need to deploy a “blue” version.
Services in runLatest mode will send all the traffic to the Revision specified in the Service manifest. In the earlier helloworld example, you’ve used a service in runLatest mode.
First, deploy the v1 (blue) version of the Service with runLatest mode by saving manifest to a file named v1.yaml, and apply it to the cluster:

cat << EOF > ./v1.yaml
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: canary
spec:
  runLatest:
    configuration:
      revisionTemplate:
        spec:
          container:
            image: gcr.io/knative-samples/knative-route-demo:blue
            env:
            - name: T_VERSION
              value: "blue"
EOF

kubectl apply -f v1.yaml

Query the deployed revision name (should be canary-00001):

kubectl get revisions

NAME               CREATED AT
canary-00001       39s

Make a request and observe the blue version by replacing the IP_ADDRESS below with the gateway’s IP address (the first request may take some time to complete as it starts the Pod):

|--⫸  curl -H "Host: canary.default.<your-dns-zone>" http://IP_ADDRESS

            <div class="blue">App v1</div>

Deploying the v2

The Knative Service API has a release mode that lets you roll out changes to new revisions with traffic splitting.

Make a copy of v1.yaml named v2.yaml

  • cp v1.yaml v2.yaml

Make the following changes to v2.yaml.

  • change runLatest mode to release
  • change blue to green in “image” and “env” fields
  • add a revisions field with the [current, current+1] revision names
  • specify a rolloutPercent field, routing 20% of traffic to the candidate (“green”) revision

The resulting v2.yaml should look like the following snippet. Save and apply this to the cluster:

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: canary
spec:
  release:
    revisions: ["canary-00001", "canary-00002"] # [current, candidate]
    rolloutPercent: 20                          # 20% to green revision
    configuration:
      revisionTemplate:
        spec:
          container:
            image: gcr.io/knative-samples/knative-route-demo:green
            env:
            - name: T_VERSION
              value: "green"

kubectl apply -f v2.yaml

You should now see the new revision created, while the old one is still around:

kubectl get revisions

NAME               CREATED AT
canary-00001       6m
canary-00002       3m

If you see different revision numbers than the output above, change the revisions field with their names in v2.yaml and re-apply the manifest file.
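If you script your rollouts, editing that field by hand gets tedious. A small sketch of automating the edit with sed, using hypothetical revision names and a stand-in file (in the lab, your real v2.yaml already exists and the names come from kubectl get revisions):

```shell
# Create a stand-in v2.yaml fragment (in the lab, the real file already exists).
cat > v2.yaml <<'EOF'
    revisions: ["old-00001", "old-00002"] # [current, candidate]
EOF

# Hypothetical revision names, e.g. taken from `kubectl get revisions`.
current="canary-00001"
candidate="canary-00002"

# Rewrite the revisions list in place.
sed -i "s/revisions: \[[^]]*\]/revisions: [\"${current}\", \"${candidate}\"]/" v2.yaml
cat v2.yaml
```

After rewriting the file, re-applying it with kubectl apply is all that’s needed to update the traffic split.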

Now, make a few requests and observe the response is served from the new “green” version roughly 20% of the time (replace IP_ADDRESS below):

|--⫸  while true; do
  curl -s -H "Host: canary.default.<your-dns-zone>" http://IP_ADDRESS | grep -E 'blue|green';
done
            <div class="blue">App v1</div>
            <div class="blue">App v1</div>
            <div class="blue">App v1</div>
            <div class="green">App v2</div>
            <div class="blue">App v1</div>
            <div class="green">App v2</div>
            <div class="green">App v2</div>
            <div class="blue">App v1</div>
            <div class="green">App v2</div>
            <div class="blue">App v1</div>
            <div class="blue">App v1</div>
            <div class="blue">App v1</div>

The rolloutPercent determines what portion of the traffic the candidate revision gets. If you set this field to 0, the candidate revision will not get any traffic. If you want to play with the percentages, you can edit the v2.yaml and re-apply it to the cluster.

With the Service configured in release mode, you can also connect to specific revisions through their dedicated addresses:

  • current.canary.<your-dns-zone> (the “current” Revision, receiving 80% of the traffic)
  • candidate.canary.<your-dns-zone> (the “candidate” Revision, receiving 20% of the traffic)
  • latest.canary.<your-dns-zone> (most recently deployed Revision, even if it’s not specified on the revisions field.)

After the Service is configured with the release mode, you should see the Route object configured with the traffic splitting (20% to “candidate”, 80% to “current”):

kubectl describe route canary        
    Latest Revision:  false
    Percent:          80
    Revision Name:    canary-xnvvq
    Tag:              current
    Latest Revision:  false
    Percent:          20
    Revision Name:    canary-dwf4g
    Tag:              candidate
    Latest Revision:  true
    Revision Name:    canary-dwf4g
    Tag:              latest

As you roll out changes to the Service, you need to repeat finding the Revision name and specify it in the revisions field as the candidate.
Great j😎b! 🥳 You just used the Knative Serving API to create a blue-green deployment.

7- Autoscaling applications with Knative

In this example, we will deploy an application, send some artificial request load to it, observe how Knative scales up the number of Pods serving the traffic, and look at the monitoring dashboard to see why the autoscaling happened.

Deploy the application

The following manifest describes an application on Knative, where we can configure how long each request takes. Save it to autoscale-go.yaml, and apply to the cluster:

cat << EOF > ./autoscale-go.yaml
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: autoscale-go
spec:
  runLatest:
    configuration:
      revisionTemplate:
        spec:
          container:
            image: "gcr.io/knative-samples/autoscale-go:0.1"
EOF

kubectl apply -f autoscale-go.yaml

Now, find the public IP address of Istio gateway and save it to IP_ADDRESS variable on your shell:

 IP_ADDRESS="$(kubectl get service --namespace=istio-system istio-ingressgateway --output jsonpath="{.status.loadBalancer.ingress[*].ip}")"

Make a request to this application to verify you can connect (note that the response indicates the request took 1 second):

curl --header "Host: autoscale-go.default.<your-dns-zone>" \
  "http://${IP_ADDRESS?}?sleep=1000"

Launch monitoring dashboard

Knative comes with a set of observability features to enable logging, metrics, and request tracing in your Serving and Eventing components.
In this context, we will use Prometheus and Grafana to collect metrics about requests, applications, and autoscaling, and export these metrics to Grafana dashboards for viewing.

kubectl apply --filename <monitoring-core.yaml URL from the Knative release page>
kubectl apply --filename <monitoring-metrics-prometheus.yaml URL from the Knative release page>

To connect to the Grafana dashboard on your cluster, open a new separate terminal and keep the following command running:

kubectl port-forward --namespace knative-monitoring \
  $(kubectl get pods --namespace knative-monitoring --selector=app=grafana \
    --output=jsonpath="{.items[0].metadata.name}") \
  8080:3000

This command exposes the Grafana server at localhost:8080 in your browser.

To view the autoscaling dashboard, follow the steps:

  1. Click “Home” on top right to view dashboards.
  2. Choose “Knative Serving – Scaling Debugging” dashboard.
  3. Click the time settings on the top right, choose “last 5 minutes”, then choose “refresh every 10 seconds”, then click Apply.
  4. In the main panel, choose “Configuration” as “autoscale-go”.
  5. Expand the autoscaler metrics.
  6. You should be seeing graphs for “Pod Counts” and “Observed Concurrency”.
Knative Grafana Dashboard

Keep this dashboard window and the kubectl port-forward command running; now you will send some request load to the application.

Triggering autoscaling

In this step, we will send some artificial load through a load generator. Download the load generator named hey using the go tool.

go get -u github.com/rakyll/hey

Use hey to send 150,000 requests (with 500 requests in parallel), each taking 1 second (leave this command running, as it will take a while to complete).

hey -host "autoscale-go.default.<your-dns-zone>" -c 500 -n 150000 \
  "http://${IP_ADDRESS?}?sleep=1000"

Meanwhile, open a new terminal window and keep an eye on the number of pods.

watch kubectl get pods

Knative Serving, by default, has a concurrent-requests target of 100 per Pod. Sending 500 concurrent requests causes the autoscaler to determine that it needs to run 5 Pods to satisfy this level.
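The sizing decision is roughly a ceiling division of observed concurrency by the per-Pod target; a sketch of the arithmetic:

```shell
# Autoscaler sizing, roughly: pods = ceil(observed_concurrency / target).
concurrency=500   # concurrent requests generated by hey
target=100        # Knative's default concurrency target per Pod
pods=$(( (concurrency + target - 1) / target ))
echo "$pods"
```

With 500 concurrent requests and a target of 100, this yields 5 Pods, matching what the autoscaler aims for.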
Go back to the Grafana dashboard and observe that the number of Pods has increased from 1 to 4:

Number of Pods has increased from 1 to 4

Similarly, on Grafana dashboard, you can see that the observed concurrency level briefly peaks, and as Knative created more Pods, it comes back down to below 100 (the default concurrency target):

Concurrency peak

Now, you can close the Grafana window, and stop the hey and kubectl port-forward commands after observing the autoscaling.

What’s Next!

Knative is a fairly new project, released in July 2018. Most parts of the project, such as the API and the documentation, are changing frequently. To stay up to date, join one of the community forums and visit the documentation for the most recent instructions.

We have not covered the Knative Eventing APIs during this codelab. If you are interested in getting events from external sources and having them delivered to your applications, read about it and play with it if you have time.

That’s all folks!

That’s all for this lab, thanks for reading 🙏
