Kubernetes pod pending when a new volume is attached (EKS)

Question

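Let me describe my scenario:

TL;DR

When I create a deployment on Kubernetes with one volume attached, everything works perfectly. When I create the same deployment with a second volume attached (two volumes in total), the pod gets stuck in "Pending" with these errors:

    pod has unbound PersistentVolumeClaims (repeated 2 times)
    0/2 nodes are available: 2 node(s) had no available volume zone.

I have already checked that the volumes are created in the correct availability zones.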

Detailed description

I have a cluster set up on Amazon EKS with two nodes. It has the following default storage class:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: gp2
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: kubernetes.io/aws-ebs
    parameters:
      type: gp2
    reclaimPolicy: Retain
    mountOptions:
      - debug

I also have a mongodb deployment that needs two volumes: one mounted at /data/db and the other mounted at an arbitrary directory I need. Below is the minimal yaml used to create the three components (I have intentionally commented out some lines):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      namespace: my-project
      creationTimestamp: null
      labels:
        io.kompose.service: my-project-db-claim0
      name: my-project-db-claim0
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      namespace: my-project
      creationTimestamp: null
      labels:
        io.kompose.service: my-project-db-claim1
      name: my-project-db-claim1
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    ---
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      namespace: my-project
      name: my-project-db
    spec:
      replicas: 1
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            name: my-db
        spec:
          containers:
            - name: my-project-db-container
              image: mongo
              imagePullPolicy: Always
              resources: {}
              volumeMounts:
              - mountPath: /my_dir
                name: my-project-db-claim0
              # - mountPath: /data/db
              #   name: my-project-db-claim1
              ports:
                - containerPort: 27017
          restartPolicy: Always
          volumes:
          - name: my-project-db-claim0
            persistentVolumeClaim:
              claimName: my-project-db-claim0
          # - name: my-project-db-claim1
          #   persistentVolumeClaim:
          #     claimName: my-project-db-claim1

That yaml works perfectly. The output for the volumes is:

    $ kubectl describe pv
    Name:            pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
    Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                     failure-domain.beta.kubernetes.io/zone=us-east-1c
    Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                     pv.kubernetes.io/bound-by-controller: yes
                     pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
    Finalizers:      [kubernetes.io/pv-protection]
    StorageClass:    gp2
    Status:          Bound
    Claim:           my-project/my-project-db-claim0
    Reclaim Policy:  Delete
    Access Modes:    RWO
    Capacity:        5Gi
    Node Affinity:   <none>
    Message:
    Source:
        Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
        VolumeID:   aws://us-east-1c/vol-xxxxx
        FSType:     ext4
        Partition:  0
        ReadOnly:   false
    Events:         <none>

    Name:            pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
    Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                     failure-domain.beta.kubernetes.io/zone=us-east-1b
    Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                     pv.kubernetes.io/bound-by-controller: yes
                     pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
    Finalizers:      [kubernetes.io/pv-protection]
    StorageClass:    gp2
    Status:          Bound
    Claim:           my-project/my-project-db-claim1
    Reclaim Policy:  Delete
    Access Modes:    RWO
    Capacity:        10Gi
    Node Affinity:   <none>
    Message:
    Source:
        Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
        VolumeID:   aws://us-east-1b/vol-xxxxx
        FSType:     ext4
        Partition:  0
        ReadOnly:   false
    Events:         <none>

And the pod output:

    $ kubectl describe pods
    Name:               my-project-db-7d48567b48-slncd
    Namespace:          my-project
    Priority:           0
    PriorityClassName:  <none>
    Node:               ip-192-168-212-194.ec2.internal/192.168.212.194
    Start Time:         Wed, 19 Dec 2018 15:55:58 +0100
    Labels:             name=my-db
                        pod-template-hash=3804123604
    Annotations:        <none>
    Status:             Running
    IP:                 192.168.216.33
    Controlled By:      ReplicaSet/my-project-db-7d48567b48
    Containers:
      my-project-db-container:
        Container ID:   docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
        Image:          mongo
        Image ID:       docker-pullable://mongo@sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
        Port:           27017/TCP
        Host Port:      0/TCP
        State:          Running
          Started:      Wed, 19 Dec 2018 15:56:42 +0100
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /my_dir from my-project-db-claim0 (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
    Conditions:
      Type           Status
      Initialized    True
      Ready          True
      PodScheduled   True
    Volumes:
      my-project-db-claim0:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  my-project-db-claim0
        ReadOnly:   false
      default-token-pf9ks:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-pf9ks
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason                  Age                    From                                      Message
      ----     ------                  ----                   ----                                      -------
      Warning  FailedScheduling        7m22s (x5 over 7m23s)  default-scheduler                         pod has unbound PersistentVolumeClaims (repeated 2 times)
      Normal   Scheduled               7m21s                  default-scheduler                         Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
      Normal   SuccessfulMountVolume   7m21s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "default-token-pf9ks"
      Warning  FailedAttachVolume      7m13s (x5 over 7m21s)  attachdetach-controller                   AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume \"vol-01a863d0aa7c7e342\"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
      Normal   SuccessfulAttachVolume  7m1s                   attachdetach-controller                   AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
      Normal   SuccessfulMountVolume   6m48s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
      Normal   Pulling                 6m48s                  kubelet, ip-192-168-212-194.ec2.internal  pulling image "mongo"
      Normal   Pulled                  6m39s                  kubelet, ip-192-168-212-194.ec2.internal  Successfully pulled image "mongo"
      Normal   Created                 6m38s                  kubelet, ip-192-168-212-194.ec2.internal  Created container
      Normal   Started                 6m37s                  kubelet, ip-192-168-212-194.ec2.internal  Started container

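Everything is created without any problem. However, if I uncomment the lines in the yaml so that both volumes are attached to the db deployment, the pv output is the same as before, but the pod stays stuck in Pending with the following output: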

    $ kubectl describe pods
    Name:               my-project-db-b8b8d8bcb-l64d7
    Namespace:          my-project
    Priority:           0
    PriorityClassName:  <none>
    Node:               <none>
    Labels:             name=my-db
                        pod-template-hash=646484676
    Annotations:        <none>
    Status:             Pending
    IP:
    Controlled By:      ReplicaSet/my-project-db-b8b8d8bcb
    Containers:
      my-project-db-container:
        Image:        mongo
        Port:         27017/TCP
        Host Port:    0/TCP
        Environment:  <none>
        Mounts:
          /data/db from my-project-db-claim1 (rw)
          /my_dir from my-project-db-claim0 (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
    Conditions:
      Type           Status
      PodScheduled   False
    Volumes:
      my-project-db-claim0:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  my-project-db-claim0
        ReadOnly:   false
      my-project-db-claim1:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  my-project-db-claim1
        ReadOnly:   false
      default-token-pf9ks:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-pf9ks
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason            Age                From               Message
      ----     ------            ----               ----               -------
      Warning  FailedScheduling  60s (x5 over 60s)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)
      Warning  FailedScheduling  2s (x16 over 59s)  default-scheduler  0/2 nodes are available: 2 node(s) had no available volume zone.

I have already read these two issues:

Dynamic volume provisioning creates EBS volume in the wrong availability zone

PersistentVolume on EBS can be created in availability zones with no nodes (Closed)

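But I have already checked that the volumes are created in the same zones as the cluster node instances. In fact, EKS creates two EBS volumes by default in the us-east-1b and us-east-1c zones, and those volumes work. The volumes created by the posted yaml are in those zones too.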

Jeff · Answer

See this article: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/

The gist is to update your StorageClass to include:

volumeBindingMode: WaitForFirstConsumer

This delays creation of the PV until the pod is scheduled. It fixed a similar problem for me.
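As a rough sketch of what this could look like, here is the StorageClass from the question with that one field added. It rests on two observations. In the question's `kubectl describe pv` output the two volumes were provisioned in us-east-1c and us-east-1b. An EBS volume can only be attached to an instance in its own zone, so no single node can mount both. With delayed binding, provisioning waits until the scheduler has picked a node, so both volumes should end up in that node's zone. This assumes a Kubernetes version that supports topology-aware dynamic provisioning for the kubernetes.io/aws-ebs provisioner (the linked article covers it as of 1.12).

```yaml
# Sketch only: the gp2 StorageClass from the question plus delayed binding.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
# Do not provision or bind a PV until a pod using the claim has been scheduled.
volumeBindingMode: WaitForFirstConsumer
```

As far as I know, volumeBindingMode cannot be changed on an existing StorageClass, so the class would have to be deleted and re-created. Claims that are already bound keep their volumes, so the two PVCs from the question would also need to be re-created to pick up the new behaviour.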

Rico · Answer

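It sounds like it is trying to create a volume in an availability zone where you don't have any nodes. You can try restricting your StorageClass to just the availability zones where you have nodes: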

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: gp2
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: kubernetes.io/aws-ebs
    parameters:
      type: gp2
    reclaimPolicy: Retain
    mountOptions:
      - debug
    allowedTopologies:
    - matchLabelExpressions:
      - key: failure-domain.beta.kubernetes.io/zone
        values:
        - us-east-1b
        - us-east-1c

This is very similar to this question and this answer, except that the issue described there is on GCP and in this case it is on AWS.
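One caveat: with the default immediate binding, a class that lists two zones can still provision the two claims in different zones, which is the situation shown in the question's `kubectl describe pv` output. A minimal sketch of a stricter variant follows, assuming you are willing to pin every volume of the class to a single zone. The class name `gp2-us-east-1b` is hypothetical and does not appear in the original post, and us-east-1b is simply one of the two zones mentioned in the question.

```yaml
# Sketch only: a single-zone variant of the class above.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-us-east-1b   # hypothetical name, not from the original post
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
allowedTopologies:
  - matchLabelExpressions:
      - key: failure-domain.beta.kubernetes.io/zone
        values:
          - us-east-1b   # assumption: at least one worker node runs in this zone
```

Both PVCs would then need `storageClassName: gp2-us-east-1b` so that they are provisioned from this class. The trade-off is that the pod can only run on nodes in that one zone.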

Sarasa Gunawardhana · Answer

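In this case you need to check the availability zones of your worker nodes (EC2 instances).

For example:

worker node 1 = eu-central-1b
worker node 2 = eu-central-1c

Then create the volume in one of those availability zones (do not create the volume in eu-central-1a).

After creating the volume, create a PersistentVolume and a PersistentVolumeClaim like the ones below to attach the newly created volume to the cluster: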

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      labels:
        failure-domain.beta.kubernetes.io/region: eu-central-1
        failure-domain.beta.kubernetes.io/zone: eu-central-1b
      name: mongo-pv
      namespace: default
    spec:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 100Gi
      awsElasticBlockStore:
        fsType: ext4
        volumeID: aws://eu-central-1b/vol-063342ab9be5d2929
      storageClassName: gp2
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mongo-pvc
      namespace: default
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
      storageClassName: gp2
      volumeName: mongo-pv