005 Manually create a Pod that is guaranteed to fail, and track what happens to it

Deploy a Deployment named mytest with 10 replicas, then simulate a release that fails. We use a readiness probe to ensure unhealthy Pods receive no traffic.
1. First prepare two Deployment manifests

```yaml
# cat myapp-v1.yaml  -- v1 passes the readiness check

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytest
spec:
  replicas: 10     # run 10 Pods
  selector:
    matchLabels:
      app: mytest
  template:
    metadata:
      labels:
        app: mytest
    spec:
      containers:
      - name: mytest
        image: registry.cn-hangzhou.aliyuncs.com/acs/busybox:v1.29.2
        args:
        - /bin/sh
        - -c
        - sleep 10; touch /tmp/healthy; sleep 30000
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5

# cat myapp-v2.yaml  -- v2 cannot pass the check, simulating a failed release

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytest
spec:
  strategy:
    rollingUpdate:
      maxSurge: 35%        # cap on total replicas during the rollout (base 10): 10 + ceil(10 * 35%) = 14
      maxUnavailable: 35%  # cap on unavailable replicas (both default to 25%): floor(10 * 35%) = 3, so at least 7 must stay available
  replicas: 10
  selector:
    matchLabels:
      app: mytest
  template:
    metadata:
      labels:
        app: mytest
    spec:
      containers:
      - name: mytest
        image: registry.cn-hangzhou.aliyuncs.com/acs/busybox:v1.29.2
        args:
        - /bin/sh
        - -c
        - sleep 30000   # note: /tmp/healthy is never created, so the probe below is bound to fail
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5

```

2. Apply myapp-v1.yaml

```shell
kubectl apply -f myapp-v1.yaml
# don't forget to record the change cause
kubectl annotate deployment/mytest kubernetes.io/change-cause="kubectl apply --filename=myapp-v1.yaml"
# after a short while every Pod shows Running
root@k8s-192-168-0-17:~# kubectl get pod -o wide 
NAME                      READY   STATUS    RESTARTS   AGE    IP               NODE               NOMINATED NODE   READINESS GATES
mytest-59887f89f5-fq6hv   1/1     Running   0          112s   172.20.182.159   k8s-192-168-0-11   <none>           <none>
mytest-59887f89f5-gpsnx   1/1     Running   0          113s   172.20.182.157   k8s-192-168-0-11   <none>           <none>
mytest-59887f89f5-gwkmg   1/1     Running   0          113s   172.20.177.33    k8s-192-168-0-19   <none>           <none>
mytest-59887f89f5-ltdw9   1/1     Running   0          115s   172.20.182.156   k8s-192-168-0-11   <none>           <none>
mytest-59887f89f5-m4vkn   1/1     Running   0          112s   172.20.177.37    k8s-192-168-0-19   <none>           <none>
mytest-59887f89f5-m9z2t   1/1     Running   0          112s   172.20.182.160   k8s-192-168-0-11   <none>           <none>
mytest-59887f89f5-mq9n6   1/1     Running   0          113s   172.20.177.35    k8s-192-168-0-19   <none>           <none>
mytest-59887f89f5-nwsc9   1/1     Running   0          115s   172.20.177.34    k8s-192-168-0-19   <none>           <none>
mytest-59887f89f5-pzm68   1/1     Running   0          115s   172.20.177.36    k8s-192-168-0-19   <none>           <none>
mytest-59887f89f5-qd74c   1/1     Running   0          113s   172.20.182.158   k8s-192-168-0-11   <none>           <none>
```
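As an optional check before moving on (not part of the original steps), you can wait for the first rollout to finish:

```shell
# Blocks until all 10 replicas pass their readiness probes, then prints:
# deployment "mytest" successfully rolled out
kubectl rollout status deployment/mytest
```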

3. Apply myapp-v2.yaml

```shell
kubectl apply -f myapp-v2.yaml
# don't forget to record the change cause
kubectl annotate deployment/mytest kubernetes.io/change-cause="kubectl apply --filename=myapp-v2.yaml"
# after a while the deployment output stabilizes at:
root@k8s-192-168-0-17:~# kubectl get deployment mytest
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
mytest   7/10    7            7           3m43s
# READY: only 7 of the 10 desired Pods are ready (these are the surviving v1 Pods)
# UP-TO-DATE: 7 replicas have been updated to the new template (none of them ready)
# AVAILABLE: 7 replicas are currently in the READY state

# list the Pods
root@k8s-192-168-0-17:~# kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
mytest-59887f89f5-fq6hv   1/1     Running   0          5m9s
mytest-59887f89f5-gpsnx   1/1     Running   0          5m10s
mytest-59887f89f5-gwkmg   1/1     Running   0          5m10s
mytest-59887f89f5-ltdw9   1/1     Running   0          5m12s
mytest-59887f89f5-m9z2t   1/1     Running   0          5m9s
mytest-59887f89f5-pzm68   1/1     Running   0          5m12s
mytest-59887f89f5-qd74c   1/1     Running   0          5m10s
mytest-8586c6547d-6sqwt   0/1     Running   0          2m19s
mytest-8586c6547d-b9kql   0/1     Running   0          2m20s
mytest-8586c6547d-cgkrj   0/1     Running   0          2m7s
mytest-8586c6547d-dw6kv   0/1     Running   0          2m18s
mytest-8586c6547d-ht4dq   0/1     Running   0          2m19s
mytest-8586c6547d-v7rh9   0/1     Running   0          2m8s
mytest-8586c6547d-vqn6w   0/1     Running   0          2m7s

# inspect the deployment details
root@k8s-192-168-0-17:~# kubectl describe deployment mytest
...
Replicas:               10 desired | 7 updated | 14 total | 7 available | 7 unavailable
...
Events:
Type    Reason             Age    From                   Message
----    ------             ----   ----                   -------
Normal  ScalingReplicaSet  5m46s  deployment-controller  Scaled up replica set mytest-59887f89f5 from 0 to 10
Normal  ScalingReplicaSet  2m52s  deployment-controller  Scaled up replica set mytest-8586c6547d from 0 to 4
Normal  ScalingReplicaSet  2m50s  deployment-controller  Scaled down replica set mytest-59887f89f5 from 10 to 7
Normal  ScalingReplicaSet  2m45s  deployment-controller  Scaled up replica set mytest-8586c6547d from 4 to 7
```
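The two ReplicaSets behind the Deployment tell the same story; listing them should show counts that match the describe summary (7 old Pods ready, 7 new Pods that never become ready):

```shell
# Old ReplicaSet: 7 desired, 7 ready; new ReplicaSet: 7 desired, 0 ready
kubectl get rs -l app=mytest
```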

4. In this way the cluster is guaranteed to keep 7 available Pods

Now let's break down the whole process.

maxSurge:

Caps how far the total Pod count may exceed the desired replica count during a rolling update. It can be an absolute number or a percentage; a percentage is rounded up.

In our example the desired count is 10 and maxSurge is 35%, so the cap is 10 + ceil(10 * 35%) = 10 + 4 = 14.

That is why the deployment's replica summary reads: Replicas: 10 desired | 7 updated | 14 total | 7 available | 7 unavailable

10 desired, 7 updated, 14 at the surge cap, 7 available, 7 unavailable.
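The round-up rule can be reproduced with plain shell arithmetic (a sketch of the calculation, not a kubectl feature):

```shell
replicas=10
surge_pct=35
# a percentage maxSurge is rounded UP: ceil(10 * 35 / 100) = 4 extra Pods allowed
surge=$(( (replicas * surge_pct + 99) / 100 ))
echo $(( replicas + surge ))   # total cap: 14
```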

maxUnavailable:

Caps how many Pods may be unavailable during the rollout. Again either an absolute number or a percentage, but note that a percentage here is rounded down, not up.

In our example, maxUnavailable: 35% means floor(10 * 35%) = 3 Pods may be unavailable at once, so at least 10 - 3 = 7 Pods must stay available.
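The same kind of sketch for the round-down rule:

```shell
replicas=10
unavail_pct=35
# a percentage maxUnavailable is rounded DOWN: floor(10 * 35 / 100) = 3
unavail=$(( replicas * unavail_pct / 100 ))
echo $(( replicas - unavail ))   # minimum available: 7
```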

The complete sequence of this rolling update:

1) maxSurge sets the total cap at 14, so 4 new-version Pods are created first, bringing the total to 14.

2) maxUnavailable then requires at least 7 available Pods (14 total - 7 maximum unavailable = 7 minimum available), so 3 old-version Pods are destroyed.

3) Once those 3 old Pods are gone, 3 more new-version Pods are created to keep the total at 14.

4) As new Pods pass the readiness probe, the available count rises above 7,

5) so more old Pods are destroyed, keeping the available count at 7.

6) As old Pods are destroyed, replacement new Pods are created automatically, keeping the total at 14.

7) And so on until every replica runs the new version.

In our case the process got stuck at step 4: the new Pods never pass the readiness probe.
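You can observe the stall directly (an optional check; with a deadline the command exits non-zero instead of waiting forever, because the new Pods never become ready):

```shell
kubectl rollout status deployment/mytest --timeout=60s
```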

At this point, in a real production environment, we would use kubectl rollout undo to roll back to the previous revision and restore the cluster to a healthy state:

```shell
root@k8s-192-168-0-17:~# kubectl rollout history deployment mytest
deployment.apps/mytest 
REVISION  CHANGE-CAUSE
1         kubectl apply --filename=myapp-v1.yaml
2         kubectl apply --filename=myapp-v2.yaml

root@k8s-192-168-0-17:~# kubectl rollout undo deployment mytest --to-revision=1
deployment.apps/mytest rolled back

# then watch the cluster-wide Pod changes
kubectl get pod -w
```
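After the rollback you can confirm the Deployment settles again; note that the rolled-back state is recorded in the revision history as a new revision:

```shell
# Waits for the v1 Pods to be ready again, then shows the revision list
kubectl rollout status deployment/mytest
kubectl rollout history deployment/mytest
```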