Kubernetes Horizontal Pod Autoscaling – What is it and how does it work?

Shivam Chauhan December 12, 2023

Autoscaling is one of the most prominent features of Kubernetes. Once configured correctly, it saves administrators’ time, prevents performance bottlenecks, and helps avoid financial waste. With autoscaling, the cluster increases the number of pods as demand for a service rises and decreases it as demand falls.

One of the ways in which Kubernetes enables autoscaling is through Horizontal Pod Autoscaling (HPA). HPA helps applications scale out to meet increased demand and scale in when resources are no longer needed. This type of autoscaling does not apply to objects that can’t be scaled, such as a DaemonSet.

In this article, we will take a deep dive into Horizontal Pod Autoscaling in Kubernetes. We’ll define HPA, explain how it works, and walk through a tutorial to configure it. But before that, let’s first understand what Kubernetes is.

So, without further ado, let’s get started!

What is Kubernetes?

Kubernetes is an open-source container management tool that automates container deployment, container scaling, and load balancing. It schedules, runs, and manages isolated containers running on virtual, physical, and cloud machines.

Kubernetes Horizontal Pod Autoscaling (HPA)

Kubernetes Horizontal Pod Autoscaling automatically scales the number of pods in a replication controller, deployment, or replica set based on the observed CPU utilization of its pods.

Scaling can be done only for scalable objects such as replication controllers, deployments, or replica sets. HPA is implemented as a Kubernetes Application Programming Interface (API) resource and a controller.

The controller periodically adjusts the number of replicas in a deployment or replication controller so that the observed average CPU utilization matches the target specified by the user.


How does a Horizontal Pod Autoscaler work?

In simple terms, HPA works in a ‘check, update, check again’ style loop. Here’s how each step in that loop works:

1. The Horizontal Pod Autoscaler keeps querying the metrics server for resource usage.

2. HPA calculates the required number of replicas on the basis of the collected resource usage.

3. Then, HPA decides whether to scale the application up or down to the required number of replicas.

4. After that, HPA changes the desired number of replicas on the target object.

5. Since HPA monitors on a continuous basis, the process repeats from Step 1.
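
Step 2 of the loop above can be sketched in a few lines of shell. The Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The function name and argument order below are hypothetical, chosen only for illustration, and the sketch deliberately ignores the controller's tolerance and stabilization behavior.

```shell
#!/bin/sh
# Hypothetical helper, not part of kubectl or Kubernetes.
# Usage: desired_replicas CURRENT_REPLICAS OBSERVED_CPU_PERCENT TARGET_CPU_PERCENT MIN MAX
desired_replicas() {
  current=$1; observed=$2; target=$3; min=$4; max=$5
  # ceil(current * observed / target) using integer arithmetic only
  desired=$(( (current * observed + target - 1) / target ))
  # Clamp the result to the configured minimum and maximum replica counts
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 2 50 20 1 10    # prints 5  (2 pods at 50% against a 20% target)
desired_replicas 4 5 20 1 10     # prints 1  (load dropped, scale in)
desired_replicas 5 100 20 1 10   # prints 10 (25 wanted, capped at max)
```

Because the result is rounded up, even a small overshoot above the target adds a pod in this simplified version; the real controller skips scaling when the ratio is close enough to 1.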


Configuring Horizontal Pod Autoscaling

Let’s create a simple deployment:

kind: Deployment                  # creates a Deployment object
apiVersion: apps/v1
metadata:
  name: mydeploy                  # deployment name
spec:
  replicas: 2                     # number of pods you want
  selector:                       # manage any pods that carry this label
    matchLabels:
      name: deployment
  template:
    metadata:
      name: testpod8              # pod name
      labels:
        name: deployment
    spec:
      containers:
        - name: c00               # container name
          image: httpd
          ports:
            - containerPort: 80   # container port exposed
          resources:
            limits: {cpu: 500m}   # upper bound on CPU per container
            requests: {cpu: 200m} # CPU reserved per container
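
The CPU values in the deployment matter for autoscaling, because an HPA utilization target is expressed as a percentage of the CPU the container requests. The sketch below assumes the 200m value is the container's CPU request and uses a 20% target as an example; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: converts a utilization target into millicores per pod.
# Usage: target_millicores REQUEST_MILLICORES TARGET_PERCENT
target_millicores() {
  request_m=$1; percent=$2
  echo $(( request_m * percent / 100 ))
}

target_millicores 200 20   # prints 40
```

Under these assumptions, the autoscaler would aim for an average of about 40m of CPU per pod, adding pods when the average rises above that.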

Now, create the autoscaler:

  • kubectl autoscale deployment mydeploy --cpu-percent=20 --min=1 --max=10
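
The kubectl autoscale command above can also be expressed declaratively, which is convenient when manifests are kept in version control. The sketch below writes a roughly equivalent HorizontalPodAutoscaler manifest using the autoscaling/v2 API. The file name hpa.yaml is an assumption, and reusing the deployment name for the HPA object mirrors what kubectl autoscale does by default.

```shell
#!/bin/sh
# Write a HorizontalPodAutoscaler manifest to hpa.yaml (file name is illustrative).
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mydeploy
spec:
  scaleTargetRef:            # the object whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: mydeploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20   # percent of the requested CPU
EOF
```

The file can then be applied with kubectl apply -f hpa.yaml in place of the imperative command.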

Let’s check the HPA entries.

  • kubectl get hpa


Final thoughts

We hope this blog was helpful in understanding how Kubernetes Horizontal Pod Autoscaling works and how it can be configured. HPA allows you to scale your applications based on different metrics. By scaling to the correct number of pods dynamically, you can run the application in a performant and cost-efficient manner.

In case you still need help with the working of Horizontal Pod Autoscaling or want to know more about it, you can contact a trusted and reliable cloud solutions services provider like Appinventiv.
