This article explains how to use HPA (Horizontal Pod Autoscaler) in the ArvanCloud Container. The concept of HPA in the ArvanCloud Container is similar to that of Kubernetes.
Prerequisites
The only prerequisite is an ArvanCloud account with access to the Cloud Container. First, log in to your account. Then go to the profile section, create a new API key in the Machine User section, and save it somewhere safe.
To follow the steps in this article, you need the ArvanCloud command-line tool. If necessary, add it to your PATH and give it administrative access, then log in from the command line:
arvan login
Then, when the command asks for it, paste the API key you received from the site.
What Is HPA?
In some systems, a rise in the number of requests can increase the load so much that the application can no longer respond, and some requests fail with errors. In this case, you can handle the load either by giving the application more resources or by running additional instances of the application and load balancing between them.
In the ArvanCloud Container, you increase the number of instances of an application by increasing the replicas parameter of its Deployment.
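As a minimal sketch, the fragment below shows where the replicas parameter sits in a Deployment manifest. The value 3 is only an illustrative assumption, and the rest of the manifest is omitted.

```yaml
# Fragment of a Deployment manifest (all other fields omitted).
spec:
  replicas: 3   # illustrative value: run three identical Pods of the application
```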
If the increase in load is temporary, or occurs at certain times of the day and then returns to normal, changing the number of instances manually becomes tedious, and forgetting to do so can cause the service to return errors. On the other hand, if you permanently allocate a large amount of resources to the application, some of them will sit unused during periods of low load, and you will be charged for resources you do not need.
HPA, or Horizontal Pod Autoscaler, solves this problem. Based on a few initial settings, it automatically increases the number of replicas when the load on your application exceeds a specific value, so that your service does not run short of resources when responding to requests. When the load decreases, it automatically reduces the number of replicas to avoid wasted resources and additional costs.
Note: HPA in the ArvanCloud Container can only be applied to deployments that do not have a persistent disk (so-called stateless services).
What Is a Readiness Probe?
Before explaining how to use HPA, we need to introduce the concept of the Readiness Probe. A running program may fail for unpredictable reasons. In the ArvanCloud Container Service, if the program encounters certain errors and a Deployment is used, the affected Pod is restarted automatically. However, some error situations are not detected by the ArvanCloud Container Service: the container is still up but cannot serve requests properly. The Readiness Probe can be used to solve this problem.
Note: In addition to the Readiness Probe, there is another concept called the Liveness Probe. Both concepts will be discussed in detail in another article; only a brief description of the Readiness Probe is given here.
By defining a Readiness Probe, you specify conditions that the ArvanCloud Container Service checks automatically. If the conditions are not met, the service stops traffic from reaching the Pod by removing the Pod's IP from the Endpoints of all services.
Note: To use HPA, it is mandatory to define Readiness Probe in Deployment.
Using HPA
To use HPA, you must first define a Readiness Probe for the Deployment. For example, the following file defines an Nginx Deployment together with its Readiness Probe.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: nginx
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 2368
              protocol: TCP
          readinessProbe:
            exec:
              command:
                - nginx
                - -v
            initialDelaySeconds: 15
            timeoutSeconds: 1
          resources:
            limits:
              cpu: '1'
              ephemeral-storage: 0.5G
              memory: 1G
            requests:
              cpu: '1'
              ephemeral-storage: 0.5G
              memory: 1G
Note: Indentation is vital in YAML files, and the slightest shift can cause an error or apply unwanted settings.
spec.template.spec.containers.readinessProbe: This section contains the Readiness Probe definition. A Readiness Probe can be defined in three ways: executing a command (as in the example above), checking an HTTP endpoint, or checking a TCP socket. Each of these methods is described in another article. In this example, the ArvanCloud Container Service checks the container's health by regularly running the nginx -v command inside the container and inspecting its exit code.
spec.template.spec.containers.readinessProbe.initialDelaySeconds: Sometimes a container needs time to become fully operational, and during this period the check may not return the expected result. This setting tells the ArvanCloud Container Service how long to wait after the container starts before it runs the first check.
spec.template.spec.containers.readinessProbe.timeoutSeconds: The amount of time the ArvanCloud Container Service waits for the probe's response before it considers the check failed.
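For comparison with the command-based probe, the other two probe types use the standard Kubernetes fields httpGet and tcpSocket. The fragment below is only a sketch: the path /healthz, the port 80, and the timing values are assumptions chosen for illustration, not values taken from this article. A readiness probe uses a single check method, so the two alternatives are shown as separate fragments.

```yaml
# Alternative 1 (assumed values): the container is ready when an HTTP GET
# returns a status code from 200 up to, but not including, 400.
readinessProbe:
  httpGet:
    path: /healthz   # hypothetical health endpoint served by the application
    port: 80         # assumed port the application listens on
  initialDelaySeconds: 15
  timeoutSeconds: 1
---
# Alternative 2 (assumed values): the container is ready when a TCP
# connection to the port can be opened.
readinessProbe:
  tcpSocket:
    port: 80         # assumed port the application listens on
  initialDelaySeconds: 15
  timeoutSeconds: 1
```

Either fragment would take the place of the readinessProbe section under the container definition, at the same indentation level.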
Save the above lines in a file called nginx-deployment.yaml. Then submit your Deployment to the ArvanCloud Container Service from the command line with the following command.
arvan paas apply -f nginx-deployment.yaml
Then, with the following command, you can check the status of your Deployment on the ArvanCloud Container Service.
arvan paas get deployment nginx-deployment
Defining HPA
HPA in the ArvanCloud Container Service uses CPU consumption as its autoscaling metric. You specify a limit for the CPU consumption of the Pods, and the ArvanCloud Container Service increases the number of Pods in the Deployment whenever that limit is exceeded.
To define HPA for a Deployment, enter the following command.
arvan paas autoscale deploy nginx-deployment --max 10 --min=1 --cpu-percent=50
The above command enables HPA for the deployment we defined before.
In this command, --max sets the maximum number of replicas the Deployment can reach when the load increases.
--min sets the minimum number of replicas the Deployment keeps when the load decreases.
--cpu-percent sets the target for average CPU consumption. If the average CPU consumption of the current Pods exceeds this number, the ArvanCloud Container Service automatically adds Pods until the average falls below the limit or the number of Pods reaches the --max value. If the load decreases, the number of Pods is automatically reduced for as long as the average CPU consumption stays below the limit, or until the number of Pods reaches the --min value.
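The effect of these three flags can be worked through with the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler. This is a sketch under the assumption that the ArvanCloud Container Service follows the same rule. The utilization figures are illustrative, and the real controller also applies a tolerance and a stabilization delay that are left out here.

```latex
% Replica calculation as documented for the Kubernetes HPA (assumed to apply here)
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentCPUUtilization}}{\text{targetCPUUtilization}} \right\rceil
\]

% Scale up: 1 replica at 80% average CPU, target 50% (illustrative numbers)
\[
\left\lceil 1 \times \frac{80}{50} \right\rceil = \lceil 1.6 \rceil = 2
\]

% Scale down: 2 replicas at 20% average CPU, target 50% (illustrative numbers)
\[
\left\lceil 2 \times \frac{20}{50} \right\rceil = \lceil 0.8 \rceil = 1
\]
```

The result is then kept within the range given by --min and --max, so with the command above it never drops below 1 or rises above 10.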
Executing the above command activates HPA for the Deployment. You can view the status of the defined HPA with the command below.
$ arvan paas get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   1%/50%    1         10        1          1h
Now, if the amount of load (CPU consumption) on the Pod increases, the number of replicas will automatically increase, as shown below.
$ arvan paas get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   50%/50%   1         10        2          1h
As the load decreases, the number of replicas returns to the previous value.
You can also view the details of the HPA with the following command.
$ arvan paas describe hpa nginx-deployment
Name:                                                  nginx-deployment
Namespace:                                             example-project
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Sat, 30 May 2020 11:08:21 +0430
Reference:                                             Deployment/nginx-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  50% (251m) / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       2 current / 2 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                        Age                From                       Message
  ----     ------                        ----               ----                       -------
  Normal   SuccessfulRescale             1h (x3 over 1h)    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale             57m (x4 over 58m)  horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
  Normal   SuccessfulRescale             52m                horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
  Warning  FailedGetResourceMetric       51m (x2 over 51m)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
  Warning  FailedComputeMetricsReplicas  51m (x2 over 51m)  horizontal-pod-autoscaler  failed to get cpu utilization: did not receive metrics for any ready pods
  Normal   SuccessfulRescale             46m (x2 over 1h)   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
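The HPA created by the autoscale command can also be written as a manifest, which is convenient when you keep your configuration in files. The sketch below uses the standard Kubernetes autoscaling/v1 API with the same values as the command in this article. It assumes that the ArvanCloud Container Service accepts HorizontalPodAutoscaler manifests through arvan paas apply in the same way it accepts Deployments, which this article does not confirm.

```yaml
# Declarative equivalent of:
#   arvan paas autoscale deploy nginx-deployment --max 10 --min=1 --cpu-percent=50
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:                       # the Deployment this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1                        # same as --min
  maxReplicas: 10                       # same as --max
  targetCPUUtilizationPercentage: 50    # same as --cpu-percent
```

Under that assumption, you would save the manifest in a file (for example nginx-hpa.yaml, a hypothetical name) and submit it with the same apply command used for the Deployment.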
For more information, refer to the OKD and Kubernetes documentation.