Architect for Kubernetes revolutionizes Kubernetes cost optimization by enabling pods to hibernate in place when idle and wake instantly (<50ms) when needed. Unlike traditional autoscaling solutions that delete pods and cause cold starts, Architect keeps pods scheduled while reducing their resource consumption to zero during idle periods.
Key Benefits
- Zero idle costs: Hibernated pods consume no CPU or memory
- Instant wake times: Pods restore in <50ms vs 30-60+ seconds for cold starts
- No application changes: Works with existing workloads
- Pods stay scheduled: No delays from rescheduling, PVC mounting, or service registration
Quick Start
Want to see Architect in action immediately? Here's the fastest way to get started:
# 1. Label nodes and install Architect (get your command from https://console.preview.architect.io/)
kubectl label nodes <node-name> architect.loopholelabs.io/node=true
kubectl label nodes <node-name> architect.loopholelabs.io/critical-node=true
helm uninstall -n architect architect || true
helm install architect oci://ghcr.io/loopholelabs/architect-chart \
--namespace architect --create-namespace \
--set kubernetesDistro="eks" \
--set machineToken="mymachinetoken" \
--set clusterName="myclustername" --wait
# 2. Deploy the example Go application
helm uninstall example-go || true
helm install example-go oci://ghcr.io/loopholelabs/example-go-chart --wait
# 3. Watch the pod hibernate after 10 seconds of inactivity
kubectl get pods -w
# 4. Wake it up instantly with a request
kubectl exec -it <example-go-pod> -- curl localhost:8080
# 5. Observe the resource savings
kubectl top pods
Other example applications that you can deploy for testing Architect behaviour (these are already pre-configured to be managed by Architect):
helm upgrade example-valkey oci://ghcr.io/loopholelabs/example-valkey-chart --install --wait
helm upgrade example-python oci://ghcr.io/loopholelabs/example-python-chart --install --wait
helm upgrade example-ruby oci://ghcr.io/loopholelabs/example-ruby-chart --install --wait
helm upgrade example-rust-miniserve oci://ghcr.io/loopholelabs/example-rust-miniserve-chart --install --wait
helm upgrade example-kafka oci://ghcr.io/loopholelabs/example-kafka-chart --install --wait
helm upgrade example-spring-boot oci://ghcr.io/loopholelabs/example-spring-boot-chart --install --wait
helm upgrade example-php-wordpress oci://ghcr.io/loopholelabs/example-php-wordpress-chart --install --wait
helm upgrade example-postgres oci://ghcr.io/loopholelabs/example-postgres-chart --install --wait
How It Works
Architect continuously monitors your pods for activity. When a pod becomes idle (no network traffic for a configured duration), Architect:
- Creates a checkpoint of the complete pod state (memory, file descriptors, network connections)
- Hibernates the pod in place, reducing resource requests to zero
- Keeps the pod scheduled and registered with services
- Instantly restores the pod when traffic arrives or when accessed via kubectl exec
Wake Triggers
Pods automatically wake from hibernation when:
- Network traffic arrives - Any incoming network packet triggers immediate restoration (see the example below)
- kubectl exec commands - Running commands in the container wakes it instantly
- API calls (coming soon) - Programmatic wake/sleep control via Architect API
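For example, using the example-go application from the Quick Start, you can watch a hibernated pod wake as soon as traffic reaches it. The container name example-go and port 8080 are assumptions based on the Quick Start; the status label is described under Monitoring and Observability below:
# Terminal 1: watch the hibernation status label (container name "example-go" is assumed)
kubectl get pods -L status.architect.loopholelabs.io/example-go -w
# Terminal 2: forward a local port and send a request; the first packet wakes the pod
kubectl port-forward <example-go-pod> 8080:8080
curl localhost:8080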
Installation
Prerequisites
- Kubernetes cluster version 1.32 or higher (1.33 is required for pod sleeping)
- Helm 3 or higher
- Nodes where Architect workloads will run must be labeled
- For Amazon EKS: must use AL2023 AMI (AL2 is not supported)
Step 1: Install Architect
Sign into https://console.preview.architect.io/ and click on the + Add Cluster button, then follow the instructions.
Step 2: Verify Installation
# Check that all Architect components are running
kubectl get pods -n architect
# You should see:
# - architect-manager (admission controller)
# - architect-control-plane
# - architectd pods on each labeled node
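As an additional check, you can confirm that the runc-architect RuntimeClass referenced later in this guide exists (whether the chart registers it automatically is an assumption here):
# The runtime class used by managed workloads should be registered
kubectl get runtimeclass runc-architect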
Configuring Workloads for Architect
To enable Architect for your workloads, you need to:
- Set the runtime class to runc-architect
- Specify which containers to manage
- Configure idle timeouts
Basic Configuration
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 3
template:
metadata:
annotations:
# Specify which containers Architect should manage
architect.loopholelabs.io/managed-containers: '["my-app-container"]'
# Set idle timeout (optional, default is 10s)
architect.loopholelabs.io/scaledown-durations: '{"my-app-container":"30s"}'
spec:
runtimeClassName: runc-architect # Required
containers:
- name: my-app-container
image: my-app:latest
resources:
requests:
memory: "512Mi"
cpu: "250m"Configuration Options
Configuration Options
Runtime Class (Required)
spec:
runtimeClassName: runc-architect
This tells Kubernetes to use Architect's custom runtime for this pod. An additional runtime class powered by gVisor, runsc-architect, is also available.
Managed Containers Annotation
architect.loopholelabs.io/managed-containers: '["container-1", "container-2"]'
- Lists which containers in the pod should be managed by Architect
- Containers not in this list run normally without hibernation
- Useful for excluding sidecar containers (e.g., logging agents)
Scale-down Durations Annotation
architect.loopholelabs.io/scaledown-durations: '{"container-1":"30s", "container-2":"60s"}'
- Sets how long a container must be idle before hibernating
- Format: JSON object with container names as keys and durations as values
- Default: 10 seconds if not specified
- Minimum: 1s, Maximum: unlimited
Post-Migration Auto Scale Up Containers Annotation
architect.loopholelabs.io/postmigration-autoscaleup-containers: '["container-1", "container-2"]'
- Lists which containers should automatically scale back up after a migration (by default, containers stay scaled down so as not to cause a thundering herd on migrations)
Disable Auto Scale Down Containers Annotation
architect.loopholelabs.io/disable-autoscaledown-containers: '["container-1", "container-2"]'
- Lists which containers should not automatically scale down. By default, containers scale down after the duration set in the scale-down durations annotation; a container listed in this annotation will never scale down automatically
- Mostly useful for long-running background jobs that should still be migrated by default, but not scale down when there is no traffic
Scale-Up Timeout Containers Annotation
architect.loopholelabs.io/scaleup-timeout-containers: '{"container-1": "60s", "container-2": "2m"}'
- Sets how long a container should wait for a checkpoint to become available during scale-up
- Format: JSON object with container names as keys and durations as values
- Default: 30 seconds if not specified
- When a new pod starts and other pods with the same template hash exist, the container waits up to this timeout for a checkpoint CRD to be advertised before aborting the checkpoint download and starting a fresh container
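Putting these options together, a pod template using all of the annotations above might look like the following sketch (container names and durations are placeholders):
# Hypothetical pod template metadata combining the annotations documented above
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["api", "worker"]'
    architect.loopholelabs.io/scaledown-durations: '{"api":"30s"}'
    architect.loopholelabs.io/disable-autoscaledown-containers: '["worker"]'
    architect.loopholelabs.io/postmigration-autoscaleup-containers: '["api"]'
    architect.loopholelabs.io/scaleup-timeout-containers: '{"api":"60s", "worker":"2m"}'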
Usage Examples
Example 1: Web API Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-service
spec:
replicas: 10 # Can now overprovision without cost penalty
strategy:
type: Recreate
template:
metadata:
annotations:
architect.loopholelabs.io/managed-containers: '["api"]'
architect.loopholelabs.io/scaledown-durations: '{"api":"30s"}'
spec:
runtimeClassName: runc-architect
containers:
- name: api
image: mycompany/api:v2.1
ports:
- containerPort: 8080
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1000m"Example 2: Microservices with Sidecar
apiVersion: apps/v1
kind: Deployment
metadata:
name: order-service
spec:
replicas: 15
template:
metadata:
annotations:
# Only manage the main container, not the sidecar
architect.loopholelabs.io/managed-containers: '["order-service"]'
architect.loopholelabs.io/scaledown-durations: '{"order-service":"60s"}'
spec:
runtimeClassName: runc-architect
containers:
- name: order-service
image: mycompany/order-service:v1.5
ports:
- containerPort: 8080
- name: logging-agent
image: fluentd:latest
# This container is not managed by Architect
Example 3: Development Environment
apiVersion: apps/v1
kind: Deployment
metadata:
name: dev-environment
namespace: development
spec:
replicas: 50 # One per developer, most idle
template:
metadata:
annotations:
architect.loopholelabs.io/managed-containers: '["dev-container"]'
# Aggressive hibernation for dev environments
architect.loopholelabs.io/scaledown-durations: '{"dev-container":"5s"}'
spec:
runtimeClassName: runc-architect
containers:
- name: dev-container
image: mycompany/dev-env:latest
resources:
requests:
memory: "4Gi"
cpu: "2000m"Monitoring and Observability
Pod Status Labels
Architect adds specific labels to track container hibernation state:
# Check hibernation status for a specific container
kubectl get pods -l status.architect.loopholelabs.io/<container-name>=SCALED_DOWN
# Example: Check if the 'api' container is hibernated
kubectl get pods -l status.architect.loopholelabs.io/api=SCALED_DOWN
# List all pods with any hibernated containers
kubectl get pods -o json | jq '.items[] | select(.metadata.labels | to_entries[] | select(.key | startswith("status.architect.loopholelabs.io/")) | .value == "SCALED_DOWN") | .metadata.name'
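You can also surface the hibernation state as a column and watch transitions live (a container named api is assumed, matching the example above):
# Show the status label for the api container as a column and watch for changes
kubectl get pods -L status.architect.loopholelabs.io/api -w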
Resource Tracking Annotations
When a pod hibernates, Architect preserves the original resource requests in annotations:
# View original CPU requests for hibernated containers
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/cpu-requests}'
# Output: {"container-name":"250m"}
# View original memory requests
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/memory-requests}'
# Output: {"container-name":"6Gi"}Resource Consumption
Monitor actual resource usage to see savings (requires Kubernetes 1.33 or higher):
# View resource consumption of pods
kubectl top pods
# Hibernated pods will show zero CPU and memory usage
# Compare with original requests stored in annotations to calculate savings
Logs
Architect components log important events:
# View architectd logs on a specific node
kubectl logs -n architect -l app=architectd --tail=100
# View admission controller logs
kubectl logs -n architect -l app=architect-manager --tail=100
# Filter logs for specific pod events
kubectl logs -n architect -l app=architectd | grep <pod-name>
Known Limitations
- GPU Workloads: GPU state preservation is under development
Testing Your Application
Before deploying to production, test your application's compatibility:
# 1. Deploy with Architect in staging
# 2. Generate typical load
# 3. Let it hibernate (check status label)
kubectl get pod <pod> -o jsonpath='{.metadata.labels.status\.architect\.loopholelabs\.io/<container>}'
# 4. Wake it with traffic
kubectl exec <pod> -- curl localhost:<port>/health
# 5. Verify functionality and state preservation
# 6. Check logs if there are errors
kubectl logs -n architect -l app=architectd | grep <pod>
Best Practices
1. Node Configuration
- Label nodes appropriately: Only label nodes where you want Architect workloads to run
- Avoid preemptible nodes for Architect components: The architect-manager and architect-control-plane should run on stable nodes
- Separate control plane from workloads: Run Architect control components on different nodes than your workloads when possible
2. Application Suitability
Well-suited applications:
- Stateless web services and APIs
- Microservices with intermittent traffic
- Development and staging environments
- Batch processing jobs with idle periods
- Services with predictable traffic patterns
Applications requiring careful consideration:
- GPU workloads requiring CUDA state preservation (under development)
3. Configuration Guidelines
- Start with conservative timeouts: Begin with 30-60 second idle timeouts and decrease gradually
- Test in staging first: Always validate hibernation behavior in non-production environments
- Monitor wake times: Ensure your SLOs are met with the hibernation/wake cycle
4. Capacity Planning
With Architect, you can:
- Overprovision without cost penalty: Run more replicas for better availability
- Eliminate scaling buffers: No need for extra replicas to handle scale-up delays
- Simplify HPA configuration: Focus on actual capacity needs, not scaling delays
Updating and Managing Workloads
Adding Architect to Existing Workloads
- Add the runtime class:
spec:
runtimeClassName: runc-architect
- Add the managed containers annotation:
annotations:
architect.loopholelabs.io/managed-containers: '["your-container"]'- Apply the changes:
kubectl apply -f your-deployment.yaml
Removing Architect from Workloads
To disable Architect for a workload:
- Remove the container from the managed containers list:
annotations:
architect.loopholelabs.io/managed-containers: "[]"- Or remove the runtime class:
# Remove or comment out:
# runtimeClassName: runc-architect
- Apply changes (no need to delete the pod):
kubectl apply -f your-deployment.yaml
Troubleshooting
Pod Not Hibernating
Check idle timeout configuration:
# View configured timeout (default is 10s if not set)
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/scaledown-durations}'
Verify container is managed:
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/managed-containers}'
Check container status label:
# Check if container shows as scaled down
kubectl get pod <pod-name> -o jsonpath='{.metadata.labels.status\.architect\.loopholelabs\.io/<container-name>}'
Review architectd logs for hibernation events:
kubectl logs -n architect -l app=architectd | grep <pod-name>
Pod Not Waking
Test wake triggers:
# Wake via kubectl exec
kubectl exec -it <pod-name> -- /bin/sh -c "echo test"
# Wake via network traffic (if service exposed)
kubectl port-forward <pod-name> <port>:<port>
curl localhost:<port>
Check pod events:
kubectl describe pod <pod-name>
Verify architectd is running on the node:
# Find which node the pod is on
kubectl get pod <pod-name> -o wide
# Check architectd on that node
kubectl get pods -n architect -o wide | grep <node-name>
High Wake Times
If wake times exceed 50ms, work through the following checks (example commands below):
- Check node CPU and memory availability
- Verify no resource contention on the node
- Check checkpoint size (larger applications take longer)
- Review architectd logs for restore errors
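The first and last of these checks can be run directly with kubectl; this is a sketch, assuming the metrics server is installed and the app=architectd label used elsewhere in this guide:
# Check CPU and memory headroom on the node running the pod
kubectl top node <node-name>
# Look for restore errors reported by architectd
kubectl logs -n architect -l app=architectd --tail=200 | grep -iE "restore|error"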
Checkpoint Failures
Common causes and solutions:
- Application incompatibility:
  - Applications using GPUs are not currently supported
- Disk space issues:
# Check disk space on nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,DISK:.status.allocatable.ephemeral-storage
- Permission issues:
  - Ensure the runtime class is properly set
  - Verify node labels are correct
- Review detailed logs:
# Get detailed architectd logs
kubectl logs -n architect -l app=architectd --tail=500 | grep -E "checkpoint|restore|error"
Customizing the Helm Chart
The Helm chart supports additional configuration options for customizing its components. For example, you can choose to install a pre-release version:
--devel --version 0.0.0-pojntfx-arch-394-implement-p2p-evac-for-new-architect.1.9b433b9
Or add custom node selectors for components to further restrict pod placement:
--set 'architectdNodeSelector.custom-label=value' \
--set 'architectAdmissionControllerNodeSelector.zone=us-east-1a' \
--set 'architectControlPlaneNodeSelector.tier=critical'
Add tolerations for components to allow scheduling pods onto tainted nodes:
--set 'architectdTolerations[0].key=dedicated' \
--set 'architectdTolerations[0].operator=Equal' \
--set 'architectdTolerations[0].value=architect' \
--set 'architectdTolerations[0].effect=NoSchedule'
And set resource requests and limits for the different components:
--set 'architectAdmissionControllerResources.requests.cpu=100m' \
--set 'architectAdmissionControllerResources.requests.memory=128Mi' \
--set 'architectAdmissionControllerResources.limits.cpu=500m' \
--set 'architectAdmissionControllerResources.limits.memory=512Mi' \
--set 'architectControlPlaneResources.requests.cpu=200m' \
--set 'architectControlPlaneResources.requests.memory=256Mi'
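These flags are passed alongside the base install command from the Quick Start; a combined invocation might look like the following sketch (token, cluster name, and label values are placeholders):
# Hypothetical install combining the base chart with custom placement and resources
helm upgrade architect oci://ghcr.io/loopholelabs/architect-chart --install \
  --namespace architect --create-namespace \
  --set kubernetesDistro="eks" \
  --set machineToken="mymachinetoken" \
  --set clusterName="myclustername" \
  --set 'architectdNodeSelector.custom-label=value' \
  --set 'architectControlPlaneNodeSelector.tier=critical' \
  --set 'architectControlPlaneResources.requests.cpu=200m' \
  --wait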
FAQ
Q: How is this different from scale-to-zero solutions like KEDA or Knative?
A: Scale-to-zero solutions delete pods entirely, causing 30-60+ second cold starts when they're needed again. Architect keeps pods scheduled but hibernates them in place, enabling <50ms wake times. Your pods stay registered with services, keep their PVCs mounted, and maintain their network configuration.
Q: What triggers a pod to wake from hibernation?
A: Pods wake instantly (<50ms) when:
- Network traffic arrives at the pod
- You run kubectl exec commands on the container
- (Coming soon) API calls to programmatically wake pods
The wake process is automatic and transparent - your application doesn't need any modifications.
Q: Can I use Architect with HPA (Horizontal Pod Autoscaler)?
A: Yes! Architect complements HPA perfectly. HPA handles adding/removing replicas based on metrics, while Architect ensures idle replicas don't consume resources. You can now set more aggressive HPA policies without cost concerns.
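For illustration, an HPA targeting the api-service Deployment from Example 1 needs no Architect-specific settings; this is a sketch with placeholder replica bounds and a standard CPU target:
# Hypothetical HPA for the api-service Deployment shown earlier
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70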
Q: What applications are not compatible?
A: Currently, applications using GPUs are not compatible. We recommend thorough testing in staging environments.
Q: How much overhead does Architect add?
A: Architect adds minimal overhead - typically <1% CPU and <50MB memory per node for the architectd daemon. The checkpoint/restore process itself is highly optimized with near-zero impact on running workloads.
Q: Can I migrate hibernated pods between nodes?
A: Yes - pods that are deleted on one node will have their checkpoints moved to whichever node the replacement pod is scheduled to.
Q: What happens during Kubernetes upgrades?
A: Architect components should be upgraded first, followed by your workloads. Hibernated pods will be woken during node drains and can be safely rescheduled.
Q: Is there a limit to how many pods can be hibernated?
A: There's no hard limit. The practical limit depends on your node's disk space for storing checkpoints (typically 50-200MB per pod) and the architectd daemon's capacity.
Q: How do I know how much I'm saving?
A: Monitor the difference between provisioned resources and actual usage:
# Provisioned resources
kubectl get pods -o custom-columns=NAME:.metadata.name,CPU:.spec.containers[0].resources.requests.cpu,MEMORY:.spec.containers[0].resources.requests.memory
# Actual usage (hibernated pods show ~0)
kubectl top pods
A more concise breakdown will be available soon at https://console.preview.architect.io/
Q: What happens to in-flight requests?
A: Architect monitors network traffic and only hibernates pods that have been truly idle (no traffic) for the configured duration. If a request arrives while a pod is transitioning to hibernation or while it's hibernated, it's buffered and delivered once the pod wakes (typically within 50ms). No packets are dropped.
Q: Can I change the managed containers list without restarting pods?
A: While you can update the managed-containers annotation without restarting, it's not recommended. When you remove a container from the managed list, its checkpoint is deleted and it becomes unmanaged. For predictable behavior, use the Recreate deployment strategy or restart pods after changing the annotation.
Q: How much disk space do checkpoints require?
A: Checkpoint size varies by application but typically ranges from 50-200MB per pod. The size depends on the application's memory footprint and state. Monitor disk usage on nodes with:
kubectl exec -n architect <architectd-pod> -- du -sh /var/lib/architect/checkpoints/
Q: Does Architect work with StatefulSets?
A: Yes, Architect works with StatefulSets. Each pod maintains its own checkpoint and persistent volume claims remain mounted during hibernation. Use the same annotations and runtime class as with Deployments.
Q: What happens if the architectd daemon crashes?
A: If architectd crashes on a node, pods on that node continue running normally but won't hibernate or wake. The daemon automatically restarts via the DaemonSet controller. Existing checkpoints are preserved and operations resume once architectd is back online.
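To confirm the daemon has recovered, check the DaemonSet and its pods (the app=architectd label matches the log commands earlier in this guide):
# Verify the architectd DaemonSet is fully scheduled and its pods are Running
kubectl get daemonset -n architect
kubectl get pods -n architect -l app=architectd -o wide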
Q: Can I exclude certain pods from hibernation temporarily?
A: Yes, you can:
- Remove the container from the managed-containers annotation
- Set a very long timeout (e.g., "24h") - see the sketch below
- Remove the runc-architect runtime class (requires pod restart)
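For the long-timeout option, one approach is to update the annotation on the running pod directly, since annotations can be changed without a restart (whether Architect picks up live-pod annotation updates for timeouts is an assumption here; pod and container names are placeholders):
# Set a 24h idle timeout on a live pod (assumes Architect reads the updated pod annotation)
kubectl annotate pod <pod-name> \
  architect.loopholelabs.io/scaledown-durations='{"my-app-container":"24h"}' --overwrite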
Q: How do I calculate my actual cost savings?
A: Track these metrics:
# Provisioned requests per container
kubectl get pods -o json | jq '[.items[].spec.containers[] | {name, requests: .resources.requests}]'
# Approximate CPU actually in use, summed in millicores (hibernated pods contribute ~0)
kubectl top pods --no-headers | awk '{sum+=$2} END {print sum}'
# Savings = (Provisioned - Actual) * Cloud provider rates
Q: Where are checkpoints cached?
A: Checkpoints are cached on each node that architectd runs on, at /root/.local/state/architect/.
Support and Resources
- Documentation: This guide and architecture documentation
- Support: Contact Loophole Labs support team
- License: Contact admin@loopholelabs.io for enterprise installations
Conclusion
Architect for Kubernetes fundamentally changes the economics of running Kubernetes workloads. By eliminating idle resource consumption while maintaining instant availability, you can:
- Overprovision for peak capacity without cost penalties
- Reduce infrastructure spend by 30-80%
- Maintain or improve application performance
- Simplify capacity planning and autoscaling
Next Steps
- Start Small: Deploy Architect in a development or staging environment first
- Test Compatibility: Verify your applications checkpoint and restore correctly
- Monitor Savings: Track resource consumption before and after enabling Architect
- Optimize Timeouts: Fine-tune idle timeouts based on your traffic patterns
Getting Help
- Review the architecture documentation for deep technical details
- Contact support for assistance with specific use cases
- Join our community discussions for tips and best practices
Start with a small subset of workloads, measure the benefits, and gradually expand your Architect deployment for maximum savings.