In Kubernetes, pods often require temporary data storage for non-critical operations such as caching, logging, or intermediate processing files. Such operations would be inefficient using persistent storage because it stores data beyond the lifecycle of a pod, resulting in additional costs and complexities.
Ephemeral storage can be the ideal solution for such cases. In this blog post, we will explore ephemeral storage, a volatile and non-persistent storage solution. We will discuss its importance, characteristics, types, operational mechanisms, and how it differs from persistent storage.
What is Ephemeral Storage?
Ephemeral storage is temporary storage that exists only during the pod’s life cycle and is removed when the pod terminates (a container’s writable layer is also reset whenever that container restarts). It is used for non-critical, transient data such as build artifacts during CI/CD pipelines, session data, and temporary config files. This differs from Persistent Volumes (PVs), which persist data beyond the pod’s life cycle.
It typically includes the writable layer of containers, emptyDir volumes, directories holding node-level logs, and other transient data generated during pod execution.
What is the Importance of Ephemeral Storage in Kubernetes?
Ephemeral storage offers short-term storage required for pod operation, ensuring efficient resource utilization and optimizing the cluster’s storage resources. It plays several critical roles in Kubernetes environments, which include:
- Temporary data processing: Enables transient data storage like cached files, intermediate computation results, or runtime logs.
- Resource limits: Uses the ephemeral-storage resource to specify requests and limits, ensuring fair resource allocation across all containers.
- Container runtime operations: Offers a writable layer, enabling the containers to modify files stored in runtime partitions (e.g., /var/lib/containers) and managed by overlay filesystems like overlay2 or aufs without corrupting the base image.
- Storage log management: Allows deployment of logging agents such as sidecar containers or daemonsets to transmit logs to Kubernetes persistent storage or external systems for analysis and debugging.
- Communication between containers: Uses emptyDir volumes to share data between containers within the same pod.
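As an illustration of the last point, the following is a minimal sketch of two containers exchanging data through one emptyDir volume. The pod name, container names, and file path are hypothetical, and busybox stands in for real application images.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch            # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox
    # Appends a timestamp to a file in the shared volume every 5 seconds
    command: ["/bin/sh", "-c", "while true; do date >> /shared/out.log; sleep 5; done"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: reader
    image: busybox
    # Follows the same file through its own mount of the volume
    command: ["/bin/sh", "-c", "touch /shared/out.log; tail -f /shared/out.log"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  volumes:
  - name: shared
    emptyDir: {}
```

Both containers mount the volume named shared, so anything the writer produces is visible to the reader for as long as the pod exists.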
What are Ephemeral Volumes?
Ephemeral volumes in Kubernetes are short-lived volumes attached to a pod’s life cycle. When a pod is created, Kubernetes creates these volumes automatically and removes them when the pod is deleted.
Characteristics of Ephemeral Volumes
Ephemeral volumes have several crucial characteristics that distinguish them from persistent storage volumes.
- Local storage: It uses local storage such as emptyDir on the node, offering lower latency and higher throughput than Network-Attached Storage (NAS).
- Non-persistence: Data is not retained across pod restarts or reschedules to a different node.
- No explicit provisioning: Unlike PVs, ephemeral storage is automatically provisioned by the kubelet, simplifying the temporary provisioning of storage resources.
Different Types of Ephemeral Volumes
Kubernetes supports several types of ephemeral volumes:
- emptyDir: These are initially created as an empty directory and use the node disk’s storage but can also be configured to use memory (RAM) for improved performance. It can store any temporary data, such as logs or caches.
- ConfigMap Volumes: These store non-sensitive string data (environment and configuration values) as key-value pairs and can reflect updates without pod restarts.
- Secret Volumes: These store sensitive data like passwords and API tokens. Secret values are base64-encoded rather than encrypted by default, so encryption at rest has to be enabled separately. Similar to ConfigMaps, changes made to Secrets are reflected in the volume without pod restarts.
- CSI Ephemeral Volumes: These use CSI drivers to integrate with third-party storage solutions. They extend the built-in volume types with advanced capabilities such as encryption and snapshots.
- DownwardAPI Volumes: These enable containers in a pod to obtain pod-specific metadata, such as pod name, resource request/limit, etc., without direct API interaction. This is useful for configuring applications based on their deployment environment, e.g., modifying behavior based on resource limits or pod-specific labels.
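To make two of these types concrete, here is a hedged sketch of a pod that mounts a ConfigMap volume and a downwardAPI volume side by side. The pod name, the label, and the ConfigMap name app-config are assumptions for illustration; the ConfigMap is presumed to exist already.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-types-demo         # hypothetical name
  labels:
    app: demo
spec:
  containers:
  - name: app
    image: busybox
    # Prints the pod name and the storage limit exposed by the downwardAPI volume
    command: ["/bin/sh", "-c", "cat /etc/podinfo/name /etc/podinfo/storage-limit; sleep 3600"]
    resources:
      limits:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - name: config
      mountPath: /etc/app-config
      readOnly: true
    - name: podinfo
      mountPath: /etc/podinfo
      readOnly: true
  volumes:
  - name: config
    configMap:
      name: app-config            # assumed to exist in the same namespace
  - name: podinfo
    downwardAPI:
      items:
      - path: name
        fieldRef:
          fieldPath: metadata.name
      - path: labels
        fieldRef:
          fieldPath: metadata.labels
      - path: storage-limit
        resourceFieldRef:
          containerName: app
          resource: limits.ephemeral-storage
```

The application reads plain files under /etc/podinfo, so it can adapt to its own ephemeral storage limit without calling the Kubernetes API.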
How Ephemeral Storage Works in Kubernetes
In Kubernetes, ephemeral storage is managed by kubelet along with container runtime and node-level resources. When a pod is scheduled on a node, kubelet allocates ephemeral storage resources based on the pod’s specification. Ephemeral storage is divided into two categories:
- Root filesystem (rootfs): This is the primary filesystem that stores OS files, binaries, logs, etc., with limited disk space shared by all the pods on a node.
- Resource quota-based storage (like emptyDir volume): For these, administrators can set specific requests and limits to prevent resource contention and node instability.
Ephemeral Storage vs. Persistent Storage
| Feature | Ephemeral Storage | Persistent Storage |
| --- | --- | --- |
| Lifecycle Management | Bound with the pod lifecycle; data is lost when the pod terminates | Independent of the pod lifecycle; data persists even after pod termination |
| Storage Architecture & Implementation | Implemented as temporary filesystem storage within the node’s local storage pool, managed by kubelet. Examples: emptyDir, ConfigMap, Secret, downwardAPI volumes | Implemented through CSI or volume plugins that connect to storage provisioned outside the Kubernetes node lifecycle. Examples: PVs backed by NFS, iSCSI, cloud storage (AWS EBS, Google PDs, Azure Disk), or Local PV |
| Configuration | Defined directly in the pod specification without requiring separate storage resources | Requires PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects for provisioning |
| Resource Management | Managed through container resource specifications using resources.limits.ephemeral-storage and resources.requests.ephemeral-storage, enforced by kubelet | Capacity defined in the PVC spec’s resources.requests.storage field and enforced through the StorageClass’s provisioner |
| Access Types | Traditionally ReadWriteOnce, but CSI-Ephemeral volumes can support ReadWriteMany | Supports multiple access modes (ReadWriteOnce, ReadWriteMany, ReadOnlyMany) depending on the storage provider |
| Performance Characteristics | Direct access to local node storage without additional abstraction layers, generally eliminating network hops | May introduce additional abstraction layers and potential network latency depending on the storage backend (Local PV implementations can match ephemeral storage performance) |
| Use Cases | Scratch space, caching, temporary processing files, non-critical runtime data | Databases, stateful applications, shared configuration, user-generated content, critical business data |
Lifecycle of Ephemeral Storage in Pods
Ephemeral storage provides temporary storage that remains active during a pod’s operation. The following are the phases of how ephemeral storage operates through the lifecycle of a pod:
- Pod creation: When a pod is scheduled to a node, the kubelet automatically initializes its ephemeral volumes, either on the node’s local filesystem or, for CSI ephemeral volumes, through the relevant CSI driver.
- Volume mounting: Ephemeral volumes are mounted into the pod’s containers.
- Container runtime: Containers within the pod read from and write to the ephemeral volumes during execution.
- Resource monitoring: The kubelet constantly monitors the storage usage and enforces resource limits.
- Clean up volumes: The kubelet cleans up the pod’s ephemeral volumes when the pod terminates or is rescheduled.
- Pod rescheduling: When the pod is recreated on another node, new, empty storage is created there.
Configuring Ephemeral Storage
Proper ephemeral storage configuration ensures performance and stability. Setting resource limits and monitoring prevents pod evictions and optimizes resource use in Kubernetes.
Setting Resource Requests and Limits
Kubernetes uses a declarative approach for resource management that lets you specify the minimum required resources (requests) and the maximum allowed resources (limits) for ephemeral storage:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
```
In this configuration:
- Requests (2Gi) reserves a minimum storage allocation; the scheduler only places the pod on a node with that much allocatable ephemeral storage
- Limits (4Gi) defines the maximum storage the container may consume
When a container exceeds its ephemeral storage limit, the kubelet evicts the pod. When the node as a whole runs short of local storage, the kubelet selects pods for eviction based upon:
- Pods with the lowest Priority Class are evicted first
- Within the same Priority Class, BestEffort pods are evicted before Burstable pods
- Within the same QoS class, pods with the highest percentage of usage above requests are evicted first
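Node-level eviction is triggered by thresholds in the kubelet configuration. The fragment below is a sketch of how such thresholds can be declared; the values are illustrative and should be tuned per cluster rather than copied as-is.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"    # illustrative value
  nodefs.available: "10%"      # free space on the node's main filesystem
  nodefs.inodesFree: "5%"      # free inodes on the node's main filesystem
  imagefs.available: "15%"     # free space on the container image filesystem
```

When any of these signals drops below its threshold, the kubelet starts evicting pods in the order described above until the node recovers.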
Pod Configuration Examples
Workloads have different ephemeral storage needs. Below are examples that demonstrate common patterns for Kubernetes storage.
Configuring Ephemeral Storage Limits
For a multi-container application that requires temporary storage for data processing:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-processor
spec:
  containers:
  - name: processor
    image: data-processor:latest
    resources:
      requests:
        ephemeral-storage: "1Gi"
      limits:
        ephemeral-storage: "2Gi"
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  - name: log-collector
    image: log-collector:latest
    resources:
      requests:
        ephemeral-storage: "500Mi"
      limits:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - mountPath: /logs
      name: log-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
  - name: log-volume
    emptyDir: {}
```
This example shows important configuration patterns:
- Each container has independent resource constraints
- Each container mounts its own emptyDir volume; mounting the same volume in both containers would let them share data
- The volumes are mounted at different paths for the separation of concerns
For high-performance needs, you can configure emptyDir with memory backing:
```yaml
volumes:
- name: high-speed-cache
  emptyDir:
    medium: Memory
    sizeLimit: 1Gi
```
This configuration uses RAM instead of disk, offering much faster I/O at the cost of consuming node memory.
For more advanced configuration options, check out our guide on how to use ephemeral volumes.
Ephemeral Storage Monitoring Tools
Effective monitoring helps you maintain healthy Kubernetes clusters by providing visibility into storage usage and identifying potential issues before they affect workloads.
Kubectl Commands
Basic monitoring can be done directly through kubectl with commands like:
```bash
kubectl describe node <node-name>
```
This displays node-level information, including the capacity and allocatable ephemeral storage and the requests and limits already allocated to pods on the node.
For pod-specific metrics:
```bash
kubectl describe pod <pod-name>
```
This shows container-level ephemeral storage requests and limits, along with events such as evictions.
Metrics Server
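The Kubernetes Metrics Server collects resource metrics from kubelets, providing real-time data through kubectl top commands and integration with Horizontal Pod Autoscaler for dynamic scaling based on resource utilization. Note that it reports CPU and memory only, so ephemeral storage usage has to come from other sources such as the kubelet’s Summary API or Prometheus.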
Prometheus
Prometheus is a comprehensive monitoring solution that offers custom metrics collection for detailed filesystem statistics, long-term storage for historical analysis, advanced querying capabilities with PromQL, and custom alerting rules based on storage thresholds.
A Prometheus query for ephemeral storage might look like this:
```txt
sum(container_fs_usage_bytes{pod=~"app-.*"}) by (pod) / 1024 / 1024 / 1024
```
This returns ephemeral storage usage in GiB for all pods matching the “app-” prefix.
Grafana
Grafana enhances Prometheus data as a visualization layer with customizable dashboards, time-series graphs showing usage trends, and threshold-based color coding for quick issue identification. You can also configure Kubernetes, node, and application-level metrics as sources in Grafana and create dashboards.
Kubernetes Dashboard
The official Kubernetes Dashboard also provides a web-based UI that shows various metrics, including node storage capacity and utilization, production resource usage, ephemeral storage, namespace-level aggregations, and alert indicators. The dashboard refreshes approximately every 30 seconds and offers a convenient overview of cluster health.
Use Cases and Best Practices
Understanding ephemeral storage usage is key to optimizing Kubernetes. This section covers best use cases and implementation strategies.
Common Use Cases for Ephemeral Storage
Ephemeral storage shines in scenarios requiring temporary, high-speed data access without the need for persistence across pod restarts or node failures.
Caching and Buffering
Application-level caching significantly improves performance by temporarily storing frequently accessed data or computation results. In-memory databases like Redis can use ephemeral storage for snapshot files. Web applications can cache API responses, reducing backend load. This technique allows systems to respond more quickly to repeated requests without repeatedly generating the same content.
Data Processing Pipelines
Multi-stage data workflows benefit from using ephemeral storage for intermediate results. ETL jobs can use this approach to store data between transformation stages, machine learning pipelines cache preprocessed datasets, and video processing applications store intermediate frames during encoding. This pattern improves efficiency by avoiding both repeated computation of intermediate results and unnecessary consumption of persistent storage.
Log Collection
Temporarily storing logs before forwarding them to centralized systems offers several benefits. By buffering logs during network outages, preprocessing and filtering before transmission, and aggregating logs from multiple containers, applications can ensure reliable log management while reducing transmitted data volume. This approach balances immediate logging needs with efficient centralized storage.
A logging sidecar pattern example:
```yaml
# Pod spec fragment: a log-collector sidecar and the volumes it uses
containers:
- name: log-collector
  image: fluent/fluent-bit:latest
  volumeMounts:
  - mountPath: /var/log/containers
    name: container-logs
    readOnly: true
  - mountPath: /var/log/buffer
    name: log-buffer
volumes:
- name: container-logs
  hostPath:
    path: /var/log/containers
- name: log-buffer
  emptyDir:
    sizeLimit: 1Gi
```
This configuration provides temporary storage for logs during collection and processing.
CI/CD Workloads
Modern build systems depend on temporary storage throughout the workflow. Artifact storage enables multi-stage processing without constant rebuilding. Test environments leverage temporary space for fixtures and outputs, enabling complex verification without permanent storage commitment.
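A minimal sketch of this pattern, assuming a two-stage pipeline, is a Job whose build stage writes an artifact into an emptyDir workspace that the test stage then reads. The Job name, the artifact path, and the shell commands are hypothetical, and busybox stands in for real build and test images.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: build-and-test            # hypothetical name
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      initContainers:
      - name: build
        image: busybox            # stand-in for a real build image
        # Writes a placeholder artifact into the shared workspace
        command: ["/bin/sh", "-c", "echo 'compiled output' > /workspace/artifact.txt"]
        volumeMounts:
        - name: workspace
          mountPath: /workspace
      containers:
      - name: test
        image: busybox            # stand-in for a real test image
        # Succeeds only if the build stage produced the expected artifact
        command: ["/bin/sh", "-c", "grep -q 'compiled output' /workspace/artifact.txt"]
        resources:
          requests:
            ephemeral-storage: "1Gi"
          limits:
            ephemeral-storage: "2Gi"
        volumeMounts:
        - name: workspace
          mountPath: /workspace
      volumes:
      - name: workspace
        emptyDir:
          sizeLimit: 2Gi
```

The workspace disappears with the Job’s pod, so nothing has to be cleaned up after the run.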
Best Practices for Managing Ephemeral Storage
Effective ephemeral storage management requires balancing resource efficiency with application stability. The following practices help optimize storage utilization while preventing resource conflicts.
Avoiding Storage Overcommitment
Overcommitting ephemeral storage can lead to cascading failures through pod evictions. Implement these strategies to maintain stability:
Define Appropriate Limits
Set realistic storage limits based on the actual application usage plus a 20-30% buffer to accommodate growth during peak operations. This balanced approach prevents both resource waste and unexpected evictions.
Implement Graceful Handling
Design applications to handle storage limitations by implementing cleanup routines triggered at specific thresholds and employing cache eviction policies like Least Recently Used (LRU) or Time To Live (TTL) to manage growth proactively.
Use Namespace Quotas
Set ResourceQuotas at the namespace level to prevent any single team or application from consuming excessive resources:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
spec:
  hard:
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi
```
This creates a “hard ceiling” that prevents the deployment of additional pods once the quota is reached.
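A quota works best when every container declares ephemeral storage values, since pods that omit them may be rejected in a namespace with such a quota. One way to supply defaults is a LimitRange. The sketch below uses a hypothetical name and illustrative sizes.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: ephemeral-storage-defaults   # hypothetical name
spec:
  limits:
  - type: Container
    defaultRequest:
      ephemeral-storage: 500Mi       # applied when a container sets no request
    default:
      ephemeral-storage: 1Gi         # applied when a container sets no limit
    max:
      ephemeral-storage: 4Gi         # largest limit a single container may set
```

Containers that declare their own values keep them as long as they stay within the max setting.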
Monitoring Ephemeral Storage Usage
Proactive monitoring prevents unexpected disruptions and helps optimize resource allocation.
Set Up Alerts
Configure alerts for when pods approach storage limits to prevent disruptions. Setting early warning alerts gives teams time to investigate and address growing storage needs, while critical warnings signal that urgent intervention is needed.
A Prometheus alerting rule:
```yaml
- alert: EphemeralStorageNearLimit
  expr: container_fs_usage_bytes / container_fs_limit_bytes > 0.85
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} approaching ephemeral storage limit"
    description: "Container is using {{ $value | humanizePercentage }} of its ephemeral storage limit for over 10 minutes."
```
Here we have set the warning threshold at 85% of the limit, sustained for 10 minutes. Well-crafted alerts include context about which containers and pods are approaching limits, along with usage percentages to help prioritize responses.
Regular Auditing
Periodically review pod configurations to identify overprovisioned or underprovisioned containers. Have mechanisms to compare actual usage against configured requests and identify gaps that can degrade performance over time. Regular audits can help bring down costs and optimize resource utilization.
Implement Garbage Collection
Regular maintenance routines prevent storage accumulation from impacting application stability. Scheduled cleanup jobs can run during off-peak hours to remove obsolete files based on age or other criteria. Size-based triggers provide dynamic cleanup when storage reaches certain thresholds, regardless of timing. The most effective approach combines time-based scheduling for predictability with size-triggered operations for safety.
A simple cleanup strategy might use a Kubernetes CronJob:
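```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-tmp
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cleanup
            image: busybox
            # Delete regular files under /data that are more than a day old
            command: ["/bin/sh", "-c", "find /data -type f -mtime +1 -delete"]
            volumeMounts:
            - mountPath: /data
              name: data-volume
          volumes:
          # An emptyDir is private to the pod that declares it, so this one
          # starts empty on every run. Replace it with a volume that actually
          # holds the files to clean.
          - name: data-volume
            emptyDir: {}
          restartPolicy: OnFailure
```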
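Because an emptyDir belongs to a single pod, a cleanup process can only see the files if it runs inside that pod. The following is a hedged sketch of a sidecar variant of the same idea. The pod name, the app:latest image, and the six-hour interval are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-cleanup          # hypothetical name
spec:
  containers:
  - name: app
    image: app:latest             # placeholder application image
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: cleanup
    image: busybox
    command:
    - /bin/sh
    - -c
    - |
      # Every 6 hours, delete regular files older than one day
      while true; do
        find /data -type f -mtime +1 -delete
        sleep 21600
      done
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi
```

The sidecar shares the scratch volume with the application, so it removes the same files the application wrote.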
Creating a clear strategy for monitoring and alerting helps prevent unexpected disruptions to your services due to storage issues.
Looking for a complete approach? Our Kubernetes storage solutions guide provides comprehensive insights for both ephemeral and persistent storage needs.
Challenges and Considerations
Limitations of Ephemeral Storage
While ephemeral storage provides valuable functionality for temporary data needs, several constraints must be considered during system design.
Data Loss Risk
All data in ephemeral storage is lost when pods terminate due to crashes, node failures, or Kubernetes events like evictions or scaling. Mitigation strategies include regular checkpointing of important data to persistent storage and designing applications with idempotent processing capabilities.
One way to avoid this is to implement work queues with persistent backing stores.
For example, a data processing pod might implement:
```yaml
volumeMounts:
- name: work-dir
  mountPath: /work
- name: checkpoint-vol
  mountPath: /checkpoints
volumes:
- name: work-dir
  emptyDir: {}
- name: checkpoint-vol
  persistentVolumeClaim:
    claimName: checkpoint-pvc
```
This configuration uses ephemeral storage for processing but preserves critical progress in persistent storage.
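The checkpoint-pvc claim referenced above has to exist before the pod starts. A minimal sketch of such a claim might look like this; the 5Gi size and the access mode are assumptions, and the cluster’s default StorageClass is used because none is named.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: checkpoint-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                # assumed size for checkpoint data
```

Unlike the emptyDir, this claim and its data outlive the pod, so a replacement pod can resume from the last checkpoint.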
Node-Local Only
Ephemeral storage is restricted to a single node and cannot be shared across multiple nodes, limiting scalability and high availability. Alternatives include persistent volumes with ReadWriteMany access or application-level replication.
No Built-in Redundancy
Unlike persistent storage, ephemeral storage lacks data protection features like cross-node replication and backups. For critical data, use persistent storage or implement application-level redundancy.
Impact on Application Performance
Access Speed
Local ephemeral storage, such as node-local SSDs, typically provides significantly faster I/O than network-attached persistent storage due to reduced network overhead. This makes it well-suited for high-throughput data processing and caching frequently accessed data. Memory-backed volumes, like tmpfs, offer even higher performance for speed-critical operations by utilizing system RAM.
Eviction Impact
When storage limits are exceeded, pods are quickly evicted, and all containers are terminated simultaneously. Applications should implement checkpoint mechanisms and use stateless designs to handle these disruptions.
Memory vs. Disk Tradeoffs
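When configuring emptyDir volumes, selecting memory or disk backing impacts performance. Memory-backed volumes can offer much faster I/O (up to 10x, depending on the workload and hardware) but reduce available RAM. Files written to them count against the container’s memory limit, potentially affecting overall system performance under high load.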
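The tradeoff can be made explicit in the pod specification. The sketch below pairs a memory-backed emptyDir with a memory limit sized to cover both the application and the cached files. The pod name and all sizes are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-cache-demo         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "1Gi"
      limits:
        memory: "2Gi"             # must cover the app's own memory plus files in /cache
    volumeMounts:
    - name: high-speed-cache
      mountPath: /cache
  volumes:
  - name: high-speed-cache
    emptyDir:
      medium: Memory
      sizeLimit: 512Mi            # caps how much RAM the cache can consume
```

Setting sizeLimit keeps the cache from growing until it crowds out the application’s own memory.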
When scaling applications, be aware of potential CSI driver issues affecting storage performance.
Strategies for Storage Management
Effective management of ephemeral storage requires thoughtful planning and implementation of proven strategies.
Implement Tiered Storage
Identify different types of data and use separate volumes accordingly. Separate cache data from processing workspaces and use different volumes for different retention requirements. Combine ephemeral storage for high-IOPS temporary data with persistent storage for valuable intermediate results. This approach balances performance needs with data safety requirements.
Stateless Design
Design applications to externalize state to dedicated services and implement idempotent processing for graceful recovery after disruptions. You could implement a work queue architecture with persistent backing stores or use external session storage for web applications. This architecture increases resilience against storage-related failures.
Use Init Containers
Leverage init containers for one-time data preparation and configuration tasks before application containers start. This pattern separates setup operations from runtime processes, improving overall application organization.
Init containers complete before application containers start, enabling the preparation of working environments:
```yaml
initContainers:
- name: setup-data
  image: data-prep:latest
  command: ['sh', '-c', 'curl -L https://example.com/dataset | tar -xz -C /data']
  volumeMounts:
  - name: data-volume
    mountPath: /data
containers:
- name: app
  image: app:latest
  volumeMounts:
  - name: data-volume
    mountPath: /data
volumes:
- name: data-volume
  emptyDir: {}
```
Ephemeral Storage Resources and Further Reading
To summarize, ephemeral storage provides temporary, node-local storage that exists only for the lifecycle of a pod. Resource limits and requests prevent contention and ensure stable operation. When used appropriately for caching, logging, data processing, and CI/CD workloads, ephemeral storage offers significant performance advantages despite its limitations.
For organizations with complex Kubernetes requirements, Portworx offers comprehensive solutions that extend beyond native Kubernetes capabilities. Our platform enhances ephemeral and persistent storage management with enterprise-grade features for mission-critical applications.
As you build your Kubernetes infrastructure, the right combination of storage solutions is critical. Whether you’re managing stateless applications or implementing StatefulSets in Kubernetes for stateful workloads, considering both ephemeral and persistent storage options will help optimize your container environments. Portworx helps organizations overcome these challenges with a comprehensive solution. Get started with a free Portworx trial today, or reach out to us for a discussion tailored to your organization’s needs.