Make an App Resilient
In this section of the workshop, we will deploy an application to a ROSA cluster, ensure the application is resilient to node failure, and scale the application when it is under load.
Deploy an application
First, let's deploy an application. To do so, run the following set of commands:
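The exact deployment commands depend on the sample source repository provided in your workshop materials; as a minimal sketch (assuming the project is named resilience-ex, the application is named frontend-js, and using a placeholder Git URL), it would look something like this:

```bash
# Create a new project (namespace) for this exercise
oc new-project resilience-ex

# Build and deploy the frontend-js application from source
# (replace the placeholder URL with the repository given in your workshop materials)
oc new-app <frontend-js-repository-url> --name frontend-js
```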
While the application is being built from source, you can watch the rollout status of the deployment object to see when it has finished.
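For example (assuming the deployment is named frontend-js in the resilience-ex project):

```bash
# Watch the deployment rollout until it completes
oc rollout status deployment/frontend-js -n resilience-ex
```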
You can now use the route to view the application in your web browser. To get the route, run the following command:
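For example (assuming the route is named frontend-js in the resilience-ex project):

```bash
# Show the route; the HOST/PORT column contains the hostname to visit
oc get route frontend-js -n resilience-ex
```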
Then visit the URL presented in a new tab in your web browser (using HTTP). For example, if your route's hostname is frontend-js-resilience-ex.apps.user1-mobbws.2ep4.p1.openshiftapps.com, you'd visit http://frontend-js-resilience-ex.apps.user1-mobbws.2ep4.p1.openshiftapps.com in your browser.
Initially, this application is deployed with only one pod. If a worker node goes down or the pod crashes, the application will suffer an outage. To prevent that, let's scale the number of instances of our application up to three. To do so, run the following command:
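For example, using the standard oc scale syntax (assuming the frontend-js deployment in the resilience-ex project):

```bash
# Scale the deployment to three replicas
oc scale deployment/frontend-js -n resilience-ex --replicas=3
```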
Next, check that the application has scaled. To do so, run the following command to list its pods:
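For example (the deployment=frontend-js label matches the selector used later in the pod disruption budget):

```bash
# List the application's pods
oc get pods -n resilience-ex -l deployment=frontend-js
```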
You should see three frontend-js pods listed, each in a Running state.
Pod Disruption Budget
A pod disruption budget (PDB) allows you to limit the disruption to your application when its pods need to be rescheduled for upgrades or routine maintenance work on ROSA nodes. In essence, it lets developers define the minimum tolerable operational requirements for a deployment so that it remains stable even during a disruption.
For example, frontend-js, deployed as part of the last step, contains three replicas distributed evenly across three nodes. We can tolerate losing two pods, but not all three, so we create a PDB that requires a minimum of one available replica.
A PodDisruptionBudget object’s configuration consists of the following key parts:
- A label selector, which is a label query over a set of pods.
- An availability level, which specifies the minimum number of pods that must be available simultaneously, expressed as either:
  - minAvailable: the number of pods that must always be available, even during a disruption.
  - maxUnavailable: the number of pods that can be unavailable during a disruption.
Danger
A maxUnavailable of 0% or 0, or a minAvailable of 100% or equal to the number of replicas, can be used but will block nodes from being drained and can result in application instability during maintenance activities.
Let's create a Pod Disruption Budget for our `frontend-js` application. To do so, run the following command:

```bash
cat <<EOF | oc apply -f -
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-js-pdb
  namespace: resilience-ex
spec:
  minAvailable: 1
  selector:
    matchLabels:
      deployment: frontend-js
EOF
```
After creating the PDB, the OpenShift API will ensure that at least one `frontend-js` pod is running at all times, even while maintenance is being performed on the cluster.
Next, let's check the status of the Pod Disruption Budget. To do so, run the following command:
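For example (using the pdb short name and the budget created above):

```bash
# Check the pod disruption budget's status
oc get pdb frontend-js-pdb -n resilience-ex
```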
The output should show the frontend-js-pdb budget with a minimum availability of 1.
Horizontal Pod Autoscaler (HPA)
As a developer, you can use a horizontal pod autoscaler (HPA) to specify how ROSA clusters should automatically increase or decrease the scale of a replication controller or deployment configuration, based on metrics collected from the pods that belong to it. You can create an HPA for any deployment, replica set, replication controller, or stateful set.
In this exercise, we will scale the frontend-js application based on CPU utilization:
- Scale out when average CPU utilization is greater than 50% of the CPU limit
- Allow a maximum of 4 pods
- Scale down to the minimum number of replicas when utilization stays below the threshold for 60 seconds
First, we should create the HorizontalPodAutoscaler. To do so, run the following command:
```bash
cat <<EOF | oc apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-js-cpu
  namespace: resilience-ex
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend-js
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
EOF
```
Next, check the status of the HPA. To do so, run the following command:
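For example (assuming the autoscaler created above):

```bash
# Check the HPA's current targets and replica count
oc get hpa frontend-js-cpu -n resilience-ex
```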
The output should show the frontend-js-cpu autoscaler targeting the frontend-js deployment, with a minimum of 2 and a maximum of 4 pods.
Next, let's generate some load against the `frontend-js` application. To do so, run the following command:
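The workshop uses the siege load-testing tool; as a sketch (assuming siege is installed in your environment, and substituting your own route URL from earlier; the concurrency level here is illustrative):

```bash
# Drive concurrent requests at the application's route to raise CPU utilization
siege -c 60 http://frontend-js-resilience-ex.apps.user1-mobbws.2ep4.p1.openshiftapps.com
```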
Wait for a minute, then kill the siege command (by pressing CTRL+C on your keyboard) and immediately check the status of the Horizontal Pod Autoscaler. To do so, run the following command:
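For example, the same status command as before:

```bash
oc get hpa frontend-js-cpu -n resilience-ex
```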
Your output should look similar to this:
```
NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
frontend-js-cpu   Deployment/frontend-js   113%/50%   2         4         4          3m13s
```
This means you are now running four replicas, instead of the three we scaled to earlier.
Once you've killed the siege command, traffic to the `frontend-js` service will cool down, and after a 60-second cool-down period your application's replica count will drop back down to two. To demonstrate this, run the following command; after a minute or two, the REPLICAS column should show 2 again:
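As a sketch (assuming the same autoscaler name and project as above):

```bash
# Watch the HPA until the replica count settles back at the minimum
oc get hpa frontend-js-cpu -n resilience-ex --watch
```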