012 HPA automatic horizontal Pod scaling.md
- First, prepare a Service and a Deployment
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app: web
  name: web
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: web
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - image: nginx:1.21.6
        name: nginx
        resources:
          limits: # This is a test environment, so the container is capped at 50 millicores (0.05 CPU) and 20Mi of memory
            cpu: "50m"
            memory: 20Mi
          requests: # Guarantees the Pod is allocated this much from the start
            cpu: "50m"
            memory: 20Mi
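- Create an HPA
# autoscale creates a HorizontalPodAutoscaler for the target
# deployment web is the target to scale; the HPA takes the same name (web) by default
# --max=3 sets the maximum number of replicas to 3
# --min=1 sets the minimum number of replicas to 1
# --cpu-percent=30 sets the target average CPU utilization to 30% of the requested CPU; replicas are added when usage goes above it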
kubectl autoscale deployment web --max=3 --min=1 --cpu-percent=30
kubectl get hpa -w
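
The imperative `kubectl autoscale` command above can also be expressed as a manifest. The following is a minimal sketch, assuming the cluster serves the `autoscaling/v2` API; the file name `hpa-web.yaml` is hypothetical and not part of the original notes.

```shell
# Write a declarative equivalent of:
#   kubectl autoscale deployment web --max=3 --min=1 --cpu-percent=30
# Assumption: the cluster serves autoscaling/v2; hpa-web.yaml is a made-up file name.
cat > hpa-web.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30
EOF
```

Running `kubectl apply -f hpa-web.yaml` instead of the `kubectl autoscale` command should produce an HPA with the same limits, and the file can be kept in version control.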
- Open a second terminal and start a temporary Pod
kubectl run -it --rm busybox --image=registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2 -- sh
/ # while :;do wget -q -O- http://web;done
- Go back to the first terminal
# Check the output of kubectl get hpa -w
NAME   REFERENCE        TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%      1         3         1          30s
web    Deployment/web   cpu: 58%/30%     1         3         1          107s
web    Deployment/web   cpu: 100%/30%    1         3         2          2m4s
web    Deployment/web   cpu: 100%/30%    1         3         3          2m22s
web    Deployment/web   cpu: 95%/30%     1         3         3          2m35s
# At this point you can exit the watch (Ctrl+C)
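
The replica counts in the watch output follow the documented HPA rule `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to the min/max range. The sketch below reproduces that arithmetic; the function name `desired_replicas` is hypothetical, and the sketch deliberately ignores the controller's tolerance band and stabilization logic.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL% TARGET_UTIL% MIN MAX
# Hypothetical helper: only the core HPA formula, no tolerance or stabilization.
desired_replicas() {
  hpa_current=$1; hpa_util=$2; hpa_target=$3; hpa_min=$4; hpa_max=$5
  # Integer ceiling of current * util / target
  hpa_want=$(( (hpa_current * hpa_util + hpa_target - 1) / hpa_target ))
  # Clamp the result to [min, max]
  if [ "$hpa_want" -lt "$hpa_min" ]; then hpa_want=$hpa_min; fi
  if [ "$hpa_want" -gt "$hpa_max" ]; then hpa_want=$hpa_max; fi
  echo "$hpa_want"
}

desired_replicas 1 58 30 1 3    # prints 2: ceil(1 * 58 / 30) = 2
desired_replicas 2 100 30 1 3   # prints 3: ceil(2 * 100 / 30) = 7, capped at max 3
desired_replicas 3 0 30 1 3     # prints 1: ceil(0) = 0, raised to min 1
```

The first two calls match the jumps from 1 to 2 and from 2 to 3 replicas seen in the output above.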
# View the description of the web HPA
kubectl describe hpa web
Name:                                                  web
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Thu, 10 Jul 2025 16:58:31 +0800
Reference:                                             Deployment/web
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):   76% (38m) / 30%
Min replicas:                                          1
Max replicas:                                          3
Deployment pods:                                       3 current / 3 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  101s  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  84s   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
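
The `76% (38m) / 30%` line reads as: the Pods use 38m of CPU on average, which is 76% of the 50m request set in the Deployment, against a 30% target. A small sketch of that percentage, using a hypothetical helper name `cpu_utilization_percent`:

```shell
# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Hypothetical helper: utilization is measured against the CPU request, not the limit.
cpu_utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

cpu_utilization_percent 38 50   # prints 76
```

Because the percentage is relative to `requests.cpu`, a container without a CPU request cannot be scaled on CPU utilization.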
- Stop the infinite loop in the temporary Pod and watch the HPA (the scale-down only takes effect roughly five minutes after the load stops)
kubectl get hpa -w
NAME   REFERENCE        TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 68%/30%     1         3         3          5m47s
web    Deployment/web   cpu: 83%/30%     1         3         3          5m54s
web    Deployment/web   cpu: 68%/30%     1         3         3          6m9s
web    Deployment/web   cpu: 0%/30%      1         3         3          6m24s   (CPU drops to 0% at 6m24s)
kubectl get hpa -w
NAME   REFERENCE        TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%      1         3         3          9m45s
kubectl get hpa -w
NAME   REFERENCE        TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%      1         3         3          11m
web    Deployment/web   cpu: 0%/30%      1         3         3          11m
web    Deployment/web   cpu: 0%/30%      1         3         1          11m     (scaled down to 1 replica at 11m)
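
The gap of about five minutes between CPU reaching 0% (6m24s) and the scale-down (11m) is consistent with the HPA's default scale-down stabilization window of 300 seconds. For a test environment the window can be shortened. The sketch below writes a merge patch for that; it assumes the HPA is served as `autoscaling/v2`, and both the file name `hpa-scaledown-patch.yaml` and the 60-second value are illustrative choices, not from the original notes.

```shell
# Write a merge patch that shortens the scale-down stabilization window.
# Assumptions: autoscaling/v2 HPA; file name and the 60s value are illustrative only.
cat > hpa-scaledown-patch.yaml <<'EOF'
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
EOF
```

It could then be applied with `kubectl patch hpa web --type merge --patch-file hpa-scaledown-patch.yaml`. A shorter window makes the demo faster, but in production it tends to cause replica flapping when load fluctuates.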