Autonomous Container Scaling in Kubernetes via Reinforcement Learning

Authors

  • Imran Qureshi, Independent Researcher, Jinnah Colony, Faisalabad, Pakistan (PK) – 38000

DOI:

https://doi.org/10.63345/1mymsn26

Keywords:

Autonomous Container Scaling, Kubernetes, Reinforcement Learning, Deep Q-Learning, Autoscaling Performance, Service-Level Objectives

Abstract

Autonomous container scaling within Kubernetes environments has emerged as a crucial mechanism to guarantee both application performance and cost‐effective resource utilization under highly dynamic workloads. Traditional autoscaling solutions—most notably Kubernetes’s native Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA)—operate on static threshold‐based rules that monitor CPU and memory utilization. While straightforward to configure, these mechanisms frequently underperform in the presence of bursty traffic patterns or sudden workload shifts, resulting in oscillatory scaling behavior, frequent SLA (Service Level Agreement) violations, and unnecessary overprovisioning. Reinforcement Learning (RL), by contrast, offers a data‐driven, adaptive approach: an RL agent continuously interacts with the cluster environment, observes multidimensional system metrics, and learns an optimal scaling policy through trial and error, balancing performance objectives against resource costs. In this work, we present the design, implementation, and experimental evaluation of the “Multidimensional Pod Autoscaler” (MPA), a Deep Q‐Learning–based autoscaler integrated into Kubernetes as a custom controller. MPA’s state representation comprises percentile‐based CPU and memory metrics, request arrival rates, error rates, and current replica counts. Its action space supports both horizontal scaling (incrementing or decrementing pod replicas) and vertical adjustments (tuning CPU/memory limits), plus a no‐op option for stability. The reward function penalizes SLA breaches—defined as requests exceeding a 200 ms latency threshold—and resource overprovisioning, weighted to reflect business priorities.
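
To make the formulation concrete, the following minimal Python sketch encodes a state vector, a discrete action set, and a reward of the shape described above. Only the 200 ms latency threshold and the state/action dimensions come from the abstract; the metric keys, the weights W_SLA and W_COST, and the normalization are illustrative assumptions, not the authors' implementation.

    import numpy as np

    LATENCY_SLO_MS = 200.0    # SLA threshold stated in the abstract
    W_SLA, W_COST = 1.0, 0.3  # assumed weights reflecting business priorities

    # Discrete action set: horizontal +/- one replica, vertical CPU/memory
    # steps, and a no-op for stability, as described above.
    ACTIONS = ("scale_out", "scale_in", "cpu_up", "cpu_down",
               "mem_up", "mem_down", "noop")

    def build_state(m: dict) -> np.ndarray:
        """Assemble the multidimensional state vector: percentile CPU/memory,
        request arrival rate, error rate, and current replica count."""
        return np.array([m["cpu_p95"],     # 95th-percentile CPU utilization, 0..1
                         m["mem_p95"],     # 95th-percentile memory utilization, 0..1
                         m["req_rate"],    # normalized request arrival rate
                         m["error_rate"],  # fraction of failed requests
                         m["replicas"]],   # normalized replica count
                        dtype=np.float32)

    def reward(frac_over_slo: float, provisioned: float, used: float) -> float:
        """Negative reward: weighted fraction of requests breaching the 200 ms
        SLO plus weighted overprovisioned capacity (e.g., idle CPU cores)."""
        return -(W_SLA * frac_over_slo + W_COST * max(provisioned - used, 0.0))

A Deep Q-Network trained against this interface simply selects the index into ACTIONS that maximizes the predicted Q-value for the current state vector.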

We trained MPA offline on historical workload traces and then deployed it for online fine‐tuning under live traffic, comparing its performance against HPA and a heuristics‐driven Smart HPA. Experiments using both the Bookinfo microservices benchmark and a synthetic Poisson‐arrival workload generator demonstrate that MPA can increase average CPU utilization from 65% to 85%, reduce 99th‐percentile request latency by 40%, cut SLA violation rates from 5% to 1%, and achieve a 25% reduction in cloud resource costs. We provide a statistical analysis table summarizing these gains. This manuscript details the full system architecture, state and action definitions, neural network design, training methodology, and deployment strategy. We discuss practical considerations—such as safe exploration policies, integration with Prometheus metrics, and fallback mechanisms—and conclude with an in‐depth look at future research directions, including multi‐agent RL, meta‐RL transfer learning, workload forecasting integration, explainability, and extension to GPU‐aware and edge‐cloud scenarios.
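
The deployment considerations above (Prometheus integration, safe exploration, fallback mechanisms) might look like the following sketch in practice: an instant-query helper against Prometheus's HTTP API plus a bounded horizontal action applied through the official Kubernetes Python client. The Prometheus address, target Deployment, and replica bounds are hypothetical placeholders; only the general pattern is implied by the abstract.

    import requests
    from kubernetes import client, config

    PROM_URL = "http://prometheus:9090/api/v1/query"  # assumed in-cluster address
    NAMESPACE, DEPLOYMENT = "default", "productpage"  # hypothetical Bookinfo target
    MIN_REPLICAS, MAX_REPLICAS = 1, 20                # safety bounds on exploration

    def prom_scalar(query: str) -> float:
        """Run an instant PromQL query against Prometheus's HTTP API and return
        the first sample's value (0.0 if the result set is empty)."""
        resp = requests.get(PROM_URL, params={"query": query}, timeout=5)
        resp.raise_for_status()
        result = resp.json()["data"]["result"]
        return float(result[0]["value"][1]) if result else 0.0

    def apply_horizontal_action(delta: int) -> None:
        """Clamp the desired replica count into safe bounds before patching the
        Deployment's scale subresource; out-of-bounds requests become a no-op."""
        config.load_incluster_config()
        apps = client.AppsV1Api()
        scale = apps.read_namespaced_deployment_scale(DEPLOYMENT, NAMESPACE)
        desired = max(MIN_REPLICAS, min(MAX_REPLICAS, scale.spec.replicas + delta))
        if desired != scale.spec.replicas:
            scale.spec.replicas = desired
            apps.patch_namespaced_deployment_scale(DEPLOYMENT, NAMESPACE, scale)

Bounding actions this way is one simple form of the safe exploration the abstract mentions: the agent can never scale outside operator-approved limits, and a no-op remains available as a fallback.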

Published

2025-03-01

Section

Original Research Articles

How to Cite

Autonomous Container Scaling in Kubernetes via Reinforcement Learning. (2025). World Journal of Future Technologies in Computer Science and Engineering (WJFTCSE), 1(1), 1–9. https://doi.org/10.63345/1mymsn26
