012 HPA: automatic horizontal pod scaling

First, prepare a Service and a Deployment

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app: web
  name: web
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: web
status:
  loadBalancer: {}

---
    
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - image: nginx:1.21.6
        name: nginx
        resources:
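          limits:   # test environment, so only 50 millicores (0.05 CPU) and 20Mi of memory are allowed
            cpu: "50m"
            memory: 20Mi
          requests: # amount reserved for the pod at scheduling time; HPA CPU percentages are computed against this request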
            cpu: "50m"
            memory: 20Mi

Create an HPA

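# autoscale creates a HorizontalPodAutoscaler
# deployment web is the scale target; the HPA is also named web by default
# --max=3 sets the maximum number of replicas to 3
# --min=1 sets the minimum number of replicas to 1
# --cpu-percent=30 scales out when average CPU utilization exceeds 30% of the requested CPU
kubectl autoscale deployment web --max=3 --min=1 --cpu-percent=30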
kubectl get hpa -w
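
For reference, the same autoscaler can also be written declaratively. The sketch below writes an `autoscaling/v2` manifest that should be roughly equivalent to the `kubectl autoscale` command above. The file name `hpa-web.yaml` is an assumption made for illustration, not part of the original setup.

```shell
# Write a declarative HPA manifest (the file name hpa-web.yaml is hypothetical).
cat > hpa-web.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30   # percent of the 50m CPU request, i.e. 15m per pod
EOF
# Apply it instead of running `kubectl autoscale`:
#   kubectl apply -f hpa-web.yaml
```

Keeping the manifest in a file makes the thresholds reviewable and repeatable, which the one-off command does not.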

Open another terminal and start a temporary pod

kubectl run -it --rm busybox --image=registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2 -- sh
/ # while :;do wget -q -O- http://web;done

Go back to the first terminal

# Check the output of kubectl get hpa -w
NAME   REFERENCE        TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%      1         3         1          30s
web    Deployment/web   cpu: 58%/30%     1         3         1          107s
web    Deployment/web   cpu: 100%/30%    1         3         2          2m4s
web    Deployment/web   cpu: 100%/30%    1         3         3          2m22s
web    Deployment/web   cpu: 95%/30%     1         3         3          2m35s
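
The replica counts in this output follow from the scaling rule documented for the HPA: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds. A minimal sketch of that arithmetic follows. The function name `desired_replicas` is made up for illustration, and the real controller additionally applies a tolerance and stabilization logic that are left out here.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Integer ceiling of current * util / target, clamped to [MIN, MAX].
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  local want=$(( (current * util + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

desired_replicas 1 58 30 1 3    # ceil(1.93) = 2
desired_replicas 2 100 30 1 3   # ceil(6.67) = 7, capped at the maximum of 3
desired_replicas 3 0 30 1 3     # 0, raised to the minimum of 1
```

The first two calls mirror the 1 to 2 and 2 to 3 steps seen in the watch output.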

# At this point you can exit the watch (Ctrl+C)

# Describe the web HPA
kubectl describe hpa web

Name:                                                  web
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Thu, 10 Jul 2025 16:58:31 +0800
Reference:                                             Deployment/web
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  76% (38m) / 30%
Min replicas:                                          1
Max replicas:                                          3
Deployment pods:                                       3 current / 3 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  101s  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  84s   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
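
The `76% (38m)` figure in the Metrics line is the average CPU usage per pod expressed as a percentage of the container's CPU request (`50m` in the Deployment above). A quick check of that arithmetic, using a hypothetical helper name and integer division:

```shell
# utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Prints usage as an integer percentage of the request (truncating division).
utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

utilization_percent 38 50   # 76
utilization_percent 15 50   # 30, so the 30% target corresponds to 15m per pod
```

Because the percentage is relative to the request, a small request such as `50m` makes the target easy to exceed under even light load.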

Stop the busy loop in the temporary pod and watch the HPA again (the scale-down only takes effect roughly five minutes after the load stops)

kubectl get hpa -w
NAME   REFERENCE        TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 68%/30%   1         3         3          5m47s
web    Deployment/web   cpu: 83%/30%   1         3         3          5m54s
web    Deployment/web   cpu: 68%/30%   1         3         3          6m9s
web    Deployment/web   cpu: 0%/30%    1         3         3          6m24s (utilization drops to 0% at 6m24s)

kubectl get hpa -w
NAME   REFERENCE        TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%   1         3         3          9m45s

kubectl get hpa -w
NAME   REFERENCE        TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
web    Deployment/web   cpu: 0%/30%   1         3         3          11m
web    Deployment/web   cpu: 0%/30%   1         3         3          11m
web    Deployment/web   cpu: 0%/30%   1         3         1          11m (scaled in to 1 replica at 11m)
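
The delay between the load dropping and the scale-in is consistent with the HPA's scale-down stabilization window, which defaults to 300 seconds. A rough check against the timestamps in the output above, assuming that default applies to this cluster:

```shell
# Utilization first reads 0% at AGE 6m24s; add the assumed 300 s default window.
zero_at=$(( 6 * 60 + 24 ))
window=300
scale_down_at=$(( zero_at + window ))
printf '%dm%ds\n' $(( scale_down_at / 60 )) $(( scale_down_at % 60 ))   # 11m24s
```

That estimate lines up with the row where REPLICAS drops to 1 at an AGE of 11m.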