In Kubernetes, persistent storage is critical for stateful applications like databases and message queues that require consistent data access across pod restarts and rescheduling. Kubernetes supports dynamic volume provisioning through StorageClasses, which automatically create Persistent Volumes (PVs) when applications request storage via Persistent Volume Claims (PVCs); the volumes are then attached and mounted when the pods that use them are scheduled.
Because of this dynamic behavior, errors can occur while volumes are being attached and detached. ‘Unable to attach or mount volumes’ is one such error: it surfaces as a FailedMount or FailedAttachVolume event in the pod’s events and prevents the pod from starting.
This error commonly occurs in stateful workloads like databases or message queues that use PVCs to request storage. It happens when the attach/detach controller in the control plane or the kubelet on the node fails to complete its volume operations, leaving pods stuck in a Pending or ContainerCreating state.
Situations in which the error most commonly appears
The above error generally occurs in multiple scenarios; some of these are:
- Node affinity conflicts and topology constraints: When using topology-aware volume provisioning, volume attachment fails if the pod is scheduled to a node that doesn’t satisfy the volume’s topology requirements. For instance, with the WaitForFirstConsumer volume binding mode, node affinity rules can force the pod onto a node that doesn’t match the storage provider’s topology constraints (see the StorageClass sketch after this list).
- CSI driver issues: Container Storage Interface (CSI) drivers manage the lifecycle of storage volumes, and any malfunction or misconfiguration can lead to provisioning, attachment, or mounting failures. Common triggers include CSI node plugin pod failures, network connectivity issues between the CSI controller and the storage backend, and version mismatches between the CSI driver and the storage provider’s APIs.
- StorageClass misconfiguration: Volume operations can fail due to StorageClass misconfiguration, including invalid provisioner names, unsupported filesystem types, incompatible volumeBindingMode settings, and invalid provider-specific parameters.
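The topology scenario can be illustrated with a minimal StorageClass sketch. The provisioner name (ebs.csi.aws.com), topology key, and zone values below are assumptions for illustration; substitute the CSI driver and topology labels your cluster actually uses.
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-sc
provisioner: ebs.csi.aws.com             # illustrative CSI provisioner; use your driver's name
volumeBindingMode: WaitForFirstConsumer  # delay volume binding until a pod is scheduled
allowedTopologies:                       # only provision volumes in these zones
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone
        values:
          - us-east-1a
          - us-east-1b
```
If node affinity or other scheduling constraints place the consuming pod on a node outside the allowed zones, the volume cannot be attached and the pod reports the error above.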
Imagine you’re running a stateful application like a MySQL database on your Azure Kubernetes Service (AKS) cluster, and the pod uses a PVC to store its data on an Azure disk. Storage mount failures can occur when the underlying Azure disk becomes inaccessible due to accidental deletion, misconfiguration, or network issues.
The process starts with Kubernetes attempting to dynamically provision an Azure disk and bind it to the PVC. However, if the disk is missing or inaccessible when the pod is scheduled onto a node, the kubelet fails to attach and mount the volume, and the pod gets stuck in the Pending state with a FailedMount error.
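For reference, the claim in this scenario might look like the minimal sketch below. The managed-csi name refers to AKS’s built-in Azure Disk CSI StorageClass, and the claim name mysql-data matches the volume referenced in the events that follow; both are assumptions for illustration.
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data              # claim mounted by the MySQL pod
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi # AKS built-in Azure Disk CSI StorageClass (assumed)
  resources:
    requests:
      storage: 20Gi
```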
You may observe the following events when this error occurs:
```
Type     Reason       Age                From               Message
----     ------       ----               ----               -------
Normal   Scheduled    2m                 default-scheduler  Successfully assigned default/mysql-pod to aks-nodepool-12345678-vmss000000
Warning  FailedMount  1m (x5 over 2m)    kubelet            Unable to attach or mount volumes: unmounted volumes=[mysql-data], unattached volumes=[mysql-data kube-api-access-xyz]: timed out waiting for the condition
Warning  FailedMount  30s (x10 over 2m)  kubelet            MountVolume.SetUp failed for volume "pvc-12345678-90ab-cdef-0123-4567890abcde" : rpc error: code = NotFound desc = Volume resource not found
```
What Causes This Error
This error could have multiple reasons, including:
- Volume management failures: Failures in the volume attachment controller in the Kubernetes control plane can affect the attachment process. This controller is responsible for attaching volumes to nodes by creating and reconciling the VolumeAttachment objects that storage provisioning depends on (see the sketch after this list). Loss of connectivity to the Kubernetes API server, controller pod crashes, or resource exhaustion in the control plane can prevent it from managing these objects, leaving volumes unavailable for new pods.
- Volume plugin issues: Volume plugins can be implemented as CSI drivers or in-tree volume plugins. These plugins act as a bridge between the Kubernetes cluster and the underlying storage system, transforming the Kubernetes volume operations into storage-specific API calls. Any bugs or misconfiguration in these plugins can lead to volume mounting or attaching failures.
- Node resource contention: Nodes run and manage pods, relying on system resources like CPU, memory, and disk space to do so. When a node runs low on these resources, the kubelet may fail to complete volume operations such as attaching or mounting volumes.
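For reference, the VolumeAttachment objects the controller manages look roughly like the sketch below; the object name and the Azure Disk CSI attacher (disk.csi.azure.com) are illustrative and continue the earlier AKS example.
```
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-0a1b2c3d4e5f                  # generated by the attach/detach controller
spec:
  attacher: disk.csi.azure.com            # CSI driver expected to perform the attach
  nodeName: aks-nodepool-12345678-vmss000000
  source:
    persistentVolumeName: pvc-12345678-90ab-cdef-0123-4567890abcde
status:
  attached: false                         # reported by the controller; stays false while the attach fails
```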
Potential impacts of the error
Volume attachment and mount failures can impact application availability and cluster operations.
- Application Downtime: When pods fail to deploy because volumes cannot be mounted, application lifecycle operations are delayed. Pod startup failures prevent application initialization, leave filesystems inaccessible, and have a cascading effect on dependent services.
- Deployment rollbacks: Due to this error, pods frequently fail to run or transition to the desired state, which can force rollbacks. Common contributing factors include:
- Misconfigured PVC/PV and incorrect storage class settings
- StatefulSet controller unable to progress with ordered pod creation
- Rolling updates blocked by volume attachment limits
- Increased operational overhead: This type of error can lead to increased operational overhead because it requires troubleshooting and investigating various components (such as kubelet, CSI driver, storage backend, etc.). Increased monitoring and alerting requirements lead to manual interventions and cross-team coordination, which adds to the overall overhead.
- Resource utilization impact: Unmounted volumes in Kubernetes waste resources. PVs remain provisioned even when not mounted, which can increase cloud storage costs (AWS EBS, GCP Persistent Disks) for space that is allocated but unused. Orphaned volume resources caused by misconfiguration, scheduling issues, or delayed finalizers add to this waste. These unused volumes consume storage capacity, limit node disk utilization, and hinder garbage collection, which degrades cluster performance, increases costs, and hurts application performance in shared environments.
How to Troubleshoot and Resolve
Follow these common troubleshooting strategies to detect and fix volume attachment and mount errors in Kubernetes:
1. Check the pod for errors and scheduling details.
Use the kubectl describe command to check the pod’s status and view events for error messages related to volume attachment or mounting failures.
```
# Examine pod events and volume mount status
kubectl describe pod <pod-name> -n <namespace>

# Check pod conditions and container states
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 conditions
```
Check the Events section for any errors or warnings.
2. Verify node status and confirm that the node where the pod is scheduled is in healthy condition:
```
# View the node status
kubectl get nodes
```
3. Inspect the kubelet’s log output for errors, warnings, or messages related to the scheduled pod.
```
# View kubelet volume operations
journalctl -u kubelet -f | grep -i "volume"

# Filter for mount-specific operations (kubelet logs messages such as MountVolume.SetUp)
journalctl -u kubelet -f | grep -i "mountvolume"
```
4. Verify PV and PVC status.
```
# Check PVC status and events
kubectl get pvc -n <namespace>
kubectl describe pvc <pvc-name> -n <namespace>

# Verify PV binding and reclaim status
kubectl get pv
kubectl describe pv <pv-name>

# Examine VolumeAttachment objects
kubectl get volumeattachment
kubectl describe volumeattachment <name>
```
5. If the volume remains stuck, you may need to remove finalizers from VolumeAttachment:
```
# Get the VolumeAttachment name for the PV
VOLUME_ATTACHMENT=$(kubectl get volumeattachment | grep <pv-name> | awk '{print $1}')

# Remove finalizers from the VolumeAttachment
kubectl patch volumeattachment $VOLUME_ATTACHMENT -p '{"metadata":{"finalizers":null}}' --type=merge

# Force-remove stuck PVC finalizers (use with caution)
kubectl patch pvc <pvc-name> -n <namespace> \
  -p '{"metadata":{"finalizers":null}}' --type=merge
```
If you encounter a PVC in a Lost or Pending status, you can either rebind it or delete and recreate it; a sketch of rebinding follows below.
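One way to rebind is to recreate the claim pre-bound to the existing PV by setting spec.volumeName, as in the sketch below. This assumes the PV still exists; if it is in the Released state, you may also need to clear its spec.claimRef first. The names continue the earlier MySQL example and are illustrative.
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data                # recreated claim from the earlier example
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi   # must match the existing PV's storageClassName
  volumeName: pvc-12345678-90ab-cdef-0123-4567890abcde  # pre-bind the claim to the existing PV
  resources:
    requests:
      storage: 20Gi               # must be satisfiable by the PV's capacity
```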
6. Verify that the node where the pod is scheduled has adequate CPU, memory, and disk space:
```
# Monitor the resource usage
kubectl top node <node-name>
```
Verify that the volume plugin and CSI driver are correctly installed and configured. Check the logs of the CSI driver for any errors:
```
# Logs of the CSI driver
kubectl logs -n kube-system -l app=csi-driver
```
7. Retry the operation, as the failure may be temporary or transient.
```
# Delete and recreate the pod
kubectl delete pod <pod-name>
kubectl apply -f <pod-name>.yaml
```
While the troubleshooting steps listed above will help you resolve this error, more robust and comprehensive data management solutions, such as Portworx, can help you reduce volume attachment and mounting errors by ensuring high availability and consistency through dynamic provisioning, automated recovery and multi-node volume replication.
Portworx’s backup and restore solution, PX-Backup, together with STORK, automates the recovery process and offers real-time visibility into storage health, making it easier to detect and resolve issues such as unmounted or stuck volumes.
To dynamically provision storage using Portworx, install Portworx and then define a StorageClass like the following:
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: portworx-sc
provisioner: kubernetes.io/portworx-volume  # Portworx provisioner
parameters:
  repl: "3"                    # Number of replicas for high availability
  io_profile: "db_remote"      # I/O profile for the volume
  io_priority: "high"          # High I/O priority for performance
  secure: "true"               # Enable encryption (optional)
allowVolumeExpansion: true     # Allow volume expansion
reclaimPolicy: Delete          # Delete volume when PVC is deleted
volumeBindingMode: Immediate   # Consider WaitForFirstConsumer for better placement
```
Then update your PVC configuration to use this StorageClass:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: portworx-sc
  resources:
    requests:
      storage: 10Gi
```
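A pod can then reference this claim as in the sketch below. The MySQL image, the inline password, and the optional schedulerName: stork line (which assumes STORK is installed) are illustrative choices, not requirements.
```
apiVersion: v1
kind: Pod
metadata:
  name: mysql-px
spec:
  schedulerName: stork               # optional: storage-aware scheduling via STORK, if installed
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: changeme            # for illustration only; use a Secret in practice
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: px-pvc            # the PVC defined above
```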
Using Portworx can assist in the prevention of this error by:
- Clearing up unused or orphaned volumes and resources
- Managing multiple volume replicas on different nodes to achieve high availability.
- Delivering real-time monitoring of volume health and status.
- Using STORK to build storage-aware scheduling.
How to avoid this error in the future
Volume mounting and attaching issues can be avoided in the future with the help of operational strategies, best practices, and advanced storage solutions such as Portworx. Follow these best practices:
- Implement robust monitoring: Set up monitoring tools like Prometheus, Grafana, or the ELK Stack to track critical volume attachment data, such as the number of pending volume attachments, the time it takes to mount volumes, and the CSI driver’s error rates. This gives you a comprehensive view of volume and pod health and performance, so you can detect and resolve attach or mount issues quickly and minimize downtime; see the alert sketch below.
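As one example, assuming the Prometheus Operator and kube-state-metrics are installed, a rule like the sketch below flags claims that stay unbound (kube_persistentvolumeclaim_status_phase is a standard kube-state-metrics series); adapt it to whatever kubelet or CSI driver metrics your stack exposes.
```
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: volume-health-alerts
  namespace: monitoring                 # assumes the Prometheus Operator watches this namespace
spec:
  groups:
    - name: volume-health
      rules:
        - alert: PersistentVolumeClaimNotBound
          # Fire when a PVC sits in the Pending or Lost phase for more than 10 minutes
          expr: max by (namespace, persistentvolumeclaim) (kube_persistentvolumeclaim_status_phase{phase=~"Pending|Lost"}) > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.namespace }}/{{ $labels.persistentvolumeclaim }} is not Bound"
```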
- Use a proper grace period for pod termination: This allows enough time for pods to shut down cleanly, so Kubernetes can unmount and release volume resources before the pod object is removed.
```
# Delete the pod with a 30-second grace period
kubectl delete pod <pod-name> --grace-period=30
```
This gives the kubelet 30 seconds to safely terminate the pod, unmount volumes, and release resources. Avoid the --force flag unless absolutely necessary, as it may lead to incomplete resource cleanup.
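The grace period can also be set declaratively in the pod spec, as in this minimal sketch; the image, command, and claim name are illustrative assumptions.
```
apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown-example
spec:
  terminationGracePeriodSeconds: 30   # time the kubelet allows for shutdown and volume cleanup
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: px-pvc             # illustrative; any bound PVC works here
```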
- Implement Pod Disruption Budgets (PDBs): PDBs ensure that volume resources are not overwhelmed by concurrent operations, preventing issues that can arise during pod rescheduling, such as volume attachment or detachment failures. By guaranteeing that a minimum number of replicas stays available during voluntary disruptions, PDBs reduce the likelihood of volume-related errors and keep the application stable; a sketch follows below.
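A minimal PDB sketch for the MySQL example might look like the following; the name and label selector are assumptions and must match the labels on your workload’s pods.
```
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  minAvailable: 1            # keep at least one replica running during voluntary disruptions
  selector:
    matchLabels:
      app: mysql             # assumed label; must match your pods' labels
```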
- Use an advanced volume management solution: Portworx provides distributed storage management, automated volume provisioning, and data protection capabilities that help you avoid volume mounting and attachment errors.
- Optimize storage class configurations: Configure storage classes so they are compatible with your storage infrastructure and provide the performance and features your applications require.
- Update Kubernetes and CSI drivers: Keeping the Kubernetes cluster and CSI drivers up to date delivers advantages such as improved performance and bug fixes.
Conclusion
The “Unable to attach or mount volumes” error is a common but complex issue in the Kubernetes ecosystem. It can happen for several reasons, including issues with the CSI driver, incorrect storage class configuration, or problems cleaning up volumes, to name a few.
Implementing PDBs, optimizing storage class configurations, streamlining pod termination, monitoring the storage infrastructure, and using advanced storage solutions like Portworx help you prevent this error and ensure the seamless operation of storage orchestration.
Regular monitoring and troubleshooting can help to make the Kubernetes environment more reliable and efficient.
