Architect for Kubernetes Documentation

Architect for Kubernetes revolutionizes Kubernetes cost optimization by enabling pods to hibernate in place when idle and wake instantly (<50ms) when needed. Unlike traditional autoscaling solutions that delete pods and cause cold starts, Architect keeps pods scheduled while reducing their resource consumption to zero during idle periods.

Key Benefits

  • Zero idle costs: Hibernated pods consume no CPU or memory
  • Instant wake times: Pods restore in <50ms vs 30-60+ seconds for cold starts
  • No application changes: Works with existing workloads
  • Pods stay scheduled: No delays from rescheduling, PVC mounting, or service registration

Quick Start

Want to see Architect in action immediately? Here's the fastest way to get started:

# 1. Label nodes and install Architect (get your command from https://console.preview.architect.io/)
kubectl label nodes <node-name> architect.loopholelabs.io/node=true
kubectl label nodes <node-name> architect.loopholelabs.io/critical-node=true

helm uninstall -n architect architect || true
helm install architect oci://ghcr.io/loopholelabs/architect-chart \
  --namespace architect --create-namespace \
  --set kubernetesDistro="eks" \
  --set machineToken="mymachinetoken" \
  --set clusterName="myclustername" --wait

# 2. Deploy the example Go application
helm uninstall example-go || true
helm install example-go oci://ghcr.io/loopholelabs/example-go-chart --wait

# 3. Watch the pod hibernate after 10 seconds of inactivity
kubectl get pods -w

# 4. Wake it up instantly with a request
kubectl exec -it <example-go-pod> -- curl localhost:8080

# 5. Observe the resource savings
kubectl top pods

Other example applications you can deploy to test Architect behavior (all pre-configured to be managed by Architect):

helm upgrade example-valkey oci://ghcr.io/loopholelabs/example-valkey-chart --install --wait
helm upgrade example-python oci://ghcr.io/loopholelabs/example-python-chart --install --wait
helm upgrade example-ruby oci://ghcr.io/loopholelabs/example-ruby-chart --install --wait
helm upgrade example-rust-miniserve oci://ghcr.io/loopholelabs/example-rust-miniserve-chart --install --wait
helm upgrade example-kafka oci://ghcr.io/loopholelabs/example-kafka-chart --install --wait
helm upgrade example-spring-boot oci://ghcr.io/loopholelabs/example-spring-boot-chart --install --wait
helm upgrade example-php-wordpress oci://ghcr.io/loopholelabs/example-php-wordpress-chart --install --wait
helm upgrade example-postgres oci://ghcr.io/loopholelabs/example-postgres-chart --install --wait

How It Works

Architect continuously monitors your pods for activity. When a pod becomes idle (no network traffic for a configured duration), Architect:

  1. Creates a checkpoint of the complete pod state (memory, file descriptors, network connections)
  2. Hibernates the pod in place, reducing resource requests to zero
  3. Keeps the pod scheduled and registered with services
  4. Instantly restores the pod when traffic arrives or when accessed via kubectl exec
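
You can watch this lifecycle from the outside by displaying the status label Architect sets on the pod (see Monitoring and Observability below; the container name is a placeholder):

# Show the hibernation status label as a column and watch it change
# as the pod idles and wakes (replace <container-name> with a managed container)
kubectl get pods -w -L status.architect.loopholelabs.io/<container-name>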

Wake Triggers

Pods automatically wake from hibernation when:

  • Network traffic arrives - Any incoming network packet triggers immediate restoration
  • kubectl exec commands - Running commands in the container wakes it instantly
  • API calls (coming soon) - Programmatic wake/sleep control via Architect API
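
For example, either trigger can be exercised by hand (pod name and port are placeholders):

# Wake via network traffic: forward a local port and send a request;
# the first incoming packet restores the pod
kubectl port-forward <pod-name> 8080:8080 &
curl localhost:8080

# Wake via kubectl exec: running any command in the container restores it
kubectl exec -it <pod-name> -- /bin/sh -c "echo awake"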

Installation

Prerequisites

  • Kubernetes cluster version 1.32 or higher (1.33 or higher is required for pod sleeping)
  • Helm 3 or higher
  • Nodes where Architect workloads will run must be labeled (see the pre-flight check below)
  • For Amazon EKS: must use AL2023 AMI (AL2 is not supported)
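
A quick pre-flight check, using the node label from the Quick Start:

# Confirm client/server Kubernetes versions and Helm version
kubectl version
helm version

# Confirm the nodes intended for Architect workloads are labeled
kubectl get nodes -l architect.loopholelabs.io/node=true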

Step 1: Install Architect

Sign into https://console.preview.architect.io/ and click on the + Add Cluster button, then follow the instructions.

Step 2: Verify Installation

# Check that all Architect components are running
kubectl get pods -n architect

# You should see:
# - architect-manager (admission controller)
# - architect-control-plane
# - architectd pods on each labeled node
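
You can also confirm that the runtime classes referenced by workloads exist (assuming the chart registers them during installation):

# Both runtime classes should be listed after installation
kubectl get runtimeclass runc-architect runsc-architect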

Configuring Workloads for Architect

To enable Architect for your workloads, you need to:

  1. Set the runtime class to runc-architect
  2. Specify which containers to manage
  3. Configure idle timeouts

Basic Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Specify which containers Architect should manage
        architect.loopholelabs.io/managed-containers: '["my-app-container"]'
        # Set idle timeout (optional, default is 10s)
        architect.loopholelabs.io/scaledown-durations: '{"my-app-container":"30s"}'
    spec:
      runtimeClassName: runc-architect # Required
      containers:
        - name: my-app-container
          image: my-app:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"

Configuration Options

Runtime Class (Required)

spec:
  runtimeClassName: runc-architect

This tells Kubernetes to use Architect's custom runtime for this pod. An additional runtime class powered by gVisor, runsc-architect, is also available.

Managed Containers Annotation

architect.loopholelabs.io/managed-containers: '["container-1", "container-2"]'
  • Lists which containers in the pod should be managed by Architect
  • Containers not in this list run normally without hibernation
  • Useful for excluding sidecar containers (e.g., logging agents)

Scale-down Durations Annotation

architect.loopholelabs.io/scaledown-durations: '{"container-1":"30s", "container-2":"60s"}'
  • Sets how long a container must be idle before hibernating
  • Format: JSON object with container names as keys and durations as values
  • Default: 10 seconds if not specified
  • Minimum: 1s, Maximum: unlimited

Post-Migration Auto Scale Up Containers Annotation

architect.loopholelabs.io/postmigration-autoscaleup-containers: '["container-1", "container-2"]'
  • Lists which containers should automatically scale back up after a migration (by default, containers stay scaled down so as not to cause a thundering herd on migrations)

Disable Auto Scale Down Containers Annotation

architect.loopholelabs.io/disable-autoscaledown-containers: '["container-1", "container-2"]'
  • Lists which containers should never scale down automatically. By default, containers scale down after the duration set in the scale-down durations annotation; containers listed in this annotation are never scaled down automatically
  • Mostly useful for long-running background jobs that should still be migrated by default but should not scale down when there is no traffic

Scale-Up Timeout Containers Annotation

architect.loopholelabs.io/scaleup-timeout-containers: '{"container-1": "60s", "container-2": "2m"}'
  • Sets how long a container should wait for a checkpoint to become available during scale-up
  • Format: JSON object with container names as keys and durations as values
  • Default: 30 seconds if not specified
  • When a new pod starts and other pods with the same template hash exist, the container waits up to this timeout for a checkpoint CRD to be advertised before aborting the checkpoint download and starting a fresh container
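
Putting these together, a pod template that combines several annotations might look like this (a sketch; container names are placeholders):

metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["api", "worker"]'
    architect.loopholelabs.io/scaledown-durations: '{"api":"30s", "worker":"60s"}'
    architect.loopholelabs.io/postmigration-autoscaleup-containers: '["api"]'
    architect.loopholelabs.io/disable-autoscaledown-containers: '["worker"]'
    architect.loopholelabs.io/scaleup-timeout-containers: '{"api":"60s"}'
spec:
  runtimeClassName: runc-architect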

Usage Examples

Example 1: Web API Service

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 10 # Can now overprovision without cost penalty
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
      annotations:
        architect.loopholelabs.io/managed-containers: '["api"]'
        architect.loopholelabs.io/scaledown-durations: '{"api":"30s"}'
    spec:
      runtimeClassName: runc-architect
      containers:
        - name: api
          image: mycompany/api:v2.1
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"

Example 2: Microservices with Sidecar

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 15
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
      annotations:
        # Only manage the main container, not the sidecar
        architect.loopholelabs.io/managed-containers: '["order-service"]'
        architect.loopholelabs.io/scaledown-durations: '{"order-service":"60s"}'
    spec:
      runtimeClassName: runc-architect
      containers:
        - name: order-service
          image: mycompany/order-service:v1.5
          ports:
            - containerPort: 8080
        - name: logging-agent
          image: fluentd:latest
          # This container is not managed by Architect
Example 3: Development Environment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dev-environment
  namespace: development
spec:
  replicas: 50 # One per developer; most sit idle
  selector:
    matchLabels:
      app: dev-environment
  template:
    metadata:
      labels:
        app: dev-environment
      annotations:
        architect.loopholelabs.io/managed-containers: '["dev-container"]'
        # Aggressive hibernation for dev environments
        architect.loopholelabs.io/scaledown-durations: '{"dev-container":"5s"}'
    spec:
      runtimeClassName: runc-architect
      containers:
        - name: dev-container
          image: mycompany/dev-env:latest
          resources:
            requests:
              memory: "4Gi"
              cpu: "2000m"

Monitoring and Observability

Pod Status Labels

Architect adds specific labels to track container hibernation state:

# Check hibernation status for a specific container
kubectl get pods -l status.architect.loopholelabs.io/<container-name>=SCALED_DOWN

# Example: Check if the 'api' container is hibernated
kubectl get pods -l status.architect.loopholelabs.io/api=SCALED_DOWN

# List all pods with any hibernated containers
kubectl get pods -o json | jq -r '.items[] | select(any((.metadata.labels // {}) | to_entries[]; (.key | startswith("status.architect.loopholelabs.io/")) and .value == "SCALED_DOWN")) | .metadata.name'

Resource Tracking Annotations

When a pod hibernates, Architect preserves the original resource requests in annotations:

# View original CPU requests for hibernated containers
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/cpu-requests}'
# Output: {"container-name":"250m"}

# View original memory requests
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/memory-requests}'
# Output: {"container-name":"6Gi"}

Resource Consumption

Monitor actual resource usage to see savings (requires Kubernetes 1.33 or higher):

# View resource consumption of pods
kubectl top pods

# Hibernated pods will show zero CPU and memory usage
# Compare with original requests stored in annotations to calculate savings

Logs

Architect components log important events:

# View architectd logs on a specific node
kubectl logs -n architect -l app=architectd --tail=100

# View admission controller logs
kubectl logs -n architect -l app=architect-manager --tail=100

# Filter logs for specific pod events
kubectl logs -n architect -l app=architectd | grep <pod-name>

Known Limitations

  1. GPU Workloads: GPU state preservation is under development

Testing Your Application

Before deploying to production, test your application's compatibility:

# 1. Deploy with Architect in staging
# 2. Generate typical load
# 3. Let it hibernate (check status label)
kubectl get pod <pod> -o jsonpath='{.metadata.labels.status\.architect\.loopholelabs\.io/<container>}'

# 4. Wake it with traffic
kubectl exec <pod> -- curl localhost:<port>/health

# 5. Verify functionality and state preservation
# 6. Check logs if there are errors
kubectl logs -n architect -l app=architectd | grep <pod>

Best Practices

1. Node Configuration

  • Label nodes appropriately: Only label nodes where you want Architect workloads to run
  • Avoid preemptable nodes for Architect components: The architect-manager and architect-control-plane should run on stable nodes
  • Separate control plane from workloads: Run Architect control components on different nodes than your workloads when possible
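
For example, using the labels from the Quick Start (assuming the critical-node label is what steers Architect's own components; node names are placeholders):

# Nodes that should run Architect-managed workloads
kubectl label nodes <workload-node> architect.loopholelabs.io/node=true

# Stable nodes for Architect's control components
kubectl label nodes <stable-node> architect.loopholelabs.io/critical-node=true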

2. Application Suitability

Well-suited applications:

  • Stateless web services and APIs
  • Microservices with intermittent traffic
  • Development and staging environments
  • Batch processing jobs with idle periods
  • Services with predictable traffic patterns

Applications requiring careful consideration:

  • GPU workloads requiring CUDA state preservation (under development)

3. Configuration Guidelines

  • Start with conservative timeouts: Begin with 30-60 second idle timeouts and decrease gradually
  • Test in staging first: Always validate hibernation behavior in non-production environments
  • Monitor wake times: Ensure your SLOs are met with the hibernation/wake cycle

4. Capacity Planning

With Architect, you can:

  • Overprovision without cost penalty: Run more replicas for better availability
  • Eliminate scaling buffers: No need for extra replicas to handle scale-up delays
  • Simplify HPA configuration: Focus on actual capacity needs, not scaling delays

Updating and Managing Workloads

Adding Architect to Existing Workloads

  1. Add the runtime class:
spec:
  runtimeClassName: runc-architect
  2. Add the managed containers annotation:
annotations:
  architect.loopholelabs.io/managed-containers: '["your-container"]'
  3. Apply the changes:
kubectl apply -f your-deployment.yaml
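
Alternatively, the same change can be made in place with kubectl patch (a sketch; the deployment and container names are placeholders, and note that patching the pod template triggers a rollout):

kubectl patch deployment <deployment-name> --type merge -p '{
  "spec": {
    "template": {
      "metadata": {
        "annotations": {
          "architect.loopholelabs.io/managed-containers": "[\"your-container\"]"
        }
      },
      "spec": {
        "runtimeClassName": "runc-architect"
      }
    }
  }
}'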

Removing Architect from Workloads

To disable Architect for a workload:

  1. Remove the container from the managed containers list:
annotations:
  architect.loopholelabs.io/managed-containers: "[]"
  2. Or remove the runtime class:
# Remove or comment out:
# runtimeClassName: runc-architect
  3. Apply changes (no need to delete the pod):
kubectl apply -f your-deployment.yaml

Troubleshooting

Pod Not Hibernating

Check idle timeout configuration:

# View configured timeout (default is 10s if not set)
kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/scaledown-durations}'

Verify container is managed:

kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/managed-containers}'

Check container status label:

# Check if container shows as scaled down
kubectl get pod <pod-name> -o jsonpath='{.metadata.labels.status\.architect\.loopholelabs\.io/<container-name>}'

Review architectd logs for hibernation events:

kubectl logs -n architect -l app=architectd | grep <pod-name>

Pod Not Waking

Test wake triggers:

# Wake via kubectl exec
kubectl exec -it <pod-name> -- /bin/sh -c "echo test"

# Wake via network traffic (if service exposed)
kubectl port-forward <pod-name> <port>:<port>
curl localhost:<port>

Check pod events:

kubectl describe pod <pod-name>

Verify architectd is running on the node:

# Find which node the pod is on
kubectl get pod <pod-name> -o wide

# Check architectd on that node
kubectl get pods -n architect -o wide | grep <node-name>

High Wake Times

If wake times exceed 50ms, check the following (example commands are shown after the list):

  • Check node CPU and memory availability
  • Verify no resource contention on the node
  • Check checkpoint size (larger applications take longer)
  • Review architectd logs for restore errors
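
A few standard commands that help with these checks:

# Node-level CPU and memory headroom
kubectl top nodes

# Allocation and pressure details for the node running the pod
kubectl describe node <node-name>

# Restore-related errors from architectd
kubectl logs -n architect -l app=architectd --tail=200 | grep -iE "restore|error"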

Checkpoint Failures

Common causes and solutions:

  1. Application incompatibility:

    • Applications using GPUs are not currently supported
  2. Disk space issues:

    # Check disk space on nodes
    kubectl get nodes -o custom-columns=NAME:.metadata.name,DISK:.status.allocatable.ephemeral-storage
  3. Permission issues:

    • Ensure the runtime class is properly set
    • Verify node labels are correct
  4. Review detailed logs:

    # Get detailed architectd logs
    kubectl logs -n architect -l app=architectd --tail=500 | grep -E "checkpoint|restore|error"

Customizing the Helm Chart

The Helm chart supports additional configuration options for customizing components. For example, you can choose to install a pre-release version:

--devel --version 0.0.0-pojntfx-arch-394-implement-p2p-evac-for-new-architect.1.9b433b9

Or add custom node selectors for components to further restrict pod placement:

--set 'architectdNodeSelector.custom-label=value' \
--set 'architectAdmissionControllerNodeSelector.zone=us-east-1a' \
--set 'architectControlPlaneNodeSelector.tier=critical'

Add tolerations for components to allow scheduling pods to tainted nodes:

--set 'architectdTolerations[0].key=dedicated' \
--set 'architectdTolerations[0].operator=Equal' \
--set 'architectdTolerations[0].value=architect' \
--set 'architectdTolerations[0].effect=NoSchedule'

And set resource requests and limits for the different components:

--set 'architectAdmissionControllerResources.requests.cpu=100m' \
--set 'architectAdmissionControllerResources.requests.memory=128Mi' \
--set 'architectAdmissionControllerResources.limits.cpu=500m' \
--set 'architectAdmissionControllerResources.limits.memory=512Mi' \
--set 'architectControlPlaneResources.requests.cpu=200m' \
--set 'architectControlPlaneResources.requests.memory=256Mi'
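
These flags are appended to the same helm command used during installation, for example (values are illustrative):

helm upgrade architect oci://ghcr.io/loopholelabs/architect-chart \
  --install --namespace architect --create-namespace \
  --set kubernetesDistro="eks" \
  --set machineToken="mymachinetoken" \
  --set clusterName="myclustername" \
  --set 'architectdNodeSelector.custom-label=value' \
  --set 'architectControlPlaneResources.requests.cpu=200m' \
  --wait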

FAQ

Q: How is this different from scale-to-zero solutions like KEDA or Knative?

A: Scale-to-zero solutions delete pods entirely, causing 30-60+ second cold starts when they're needed again. Architect keeps pods scheduled but hibernates them in place, enabling <50ms wake times. Your pods stay registered with services, keep their PVCs mounted, and maintain their network configuration.

Q: What triggers a pod to wake from hibernation?

A: Pods wake instantly (<50ms) when:

  • Network traffic arrives at the pod
  • You run kubectl exec commands on the container
  • (Coming soon) API calls to programmatically wake pods

The wake process is automatic and transparent - your application doesn't need any modifications.

Q: Can I use Architect with HPA (Horizontal Pod Autoscaler)?

A: Yes! Architect complements HPA perfectly. HPA handles adding/removing replicas based on metrics, while Architect ensures idle replicas don't consume resources. You can now set more aggressive HPA policies without cost concerns.
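
For reference, a minimal HPA for the api-service example above could look like this (a sketch using the standard autoscaling/v2 API; thresholds are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70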

Q: What applications are not compatible?

A: Currently, applications using GPUs are not compatible. We recommend thorough testing in staging environments.

Q: How much overhead does Architect add?

A: Architect adds minimal overhead - typically <1% CPU and <50MB memory per node for the architectd daemon. The checkpoint/restore process itself is highly optimized with near-zero impact on running workloads.

Q: Can I migrate hibernated pods between nodes?

A: Yes - pods that are deleted on one node will have their checkpoints moved to whichever node the replacement pod is scheduled to.

Q: What happens during Kubernetes upgrades?

A: Architect components should be upgraded first, followed by your workloads. Hibernated pods will be woken during node drains and can be safely rescheduled.

Q: Is there a limit to how many pods can be hibernated?

A: There's no hard limit. The practical limit depends on your node's disk space for storing checkpoints (typically 50-200MB per pod) and the architectd daemon's capacity.

Q: How do I know how much I'm saving?

A: Monitor the difference between provisioned resources and actual usage:

# Provisioned resources
kubectl get pods -o custom-columns=NAME:.metadata.name,CPU:.spec.containers[0].resources.requests.cpu,MEMORY:.spec.containers[0].resources.requests.memory

# Actual usage (hibernated pods show ~0)
kubectl top pods

A more concise breakdown will be available soon at https://console.preview.architect.io/

Q: What happens to in-flight requests?

A: Architect monitors network traffic and only hibernates pods that have been truly idle (no traffic) for the configured duration. If a request arrives while a pod is transitioning to hibernation or while it's hibernated, it's buffered and delivered once the pod wakes (typically within 50ms). No packets are dropped.

Q: Can I change the managed containers list without restarting pods?

A: While you can update the managed-containers annotation without restarting, it's not recommended. When you remove a container from the managed list, its checkpoint is deleted and it becomes unmanaged. For predictable behavior, use the Recreate deployment strategy or restart pods after changing the annotation.

Q: How much disk space do checkpoints require?

A: Checkpoint size varies by application but typically ranges from 50-200MB per pod. The size depends on the application's memory footprint and state. Monitor disk usage on nodes with:

kubectl exec -n architect <architectd-pod> -- du -sh /var/lib/architect/checkpoints/

Q: Does Architect work with StatefulSets?

A: Yes, Architect works with StatefulSets. Each pod maintains its own checkpoint and persistent volume claims remain mounted during hibernation. Use the same annotations and runtime class as with Deployments.
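
For reference, the relevant parts of a StatefulSet look the same as in the Deployment examples (a sketch; names and image are placeholders):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-database
spec:
  serviceName: my-database
  replicas: 3
  selector:
    matchLabels:
      app: my-database
  template:
    metadata:
      labels:
        app: my-database
      annotations:
        architect.loopholelabs.io/managed-containers: '["db"]'
        architect.loopholelabs.io/scaledown-durations: '{"db":"60s"}'
    spec:
      runtimeClassName: runc-architect
      containers:
        - name: db
          image: my-database:latest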

Q: What happens if the architectd daemon crashes?

A: If architectd crashes on a node, pods on that node continue running normally but won't hibernate or wake. The daemon automatically restarts via the DaemonSet controller. Existing checkpoints are preserved and operations resume once architectd is back online.

Q: Can I exclude certain pods from hibernation temporarily?

A: Yes, you can:

  1. Remove the container from the managed-containers annotation
  2. Set a very long timeout (e.g., "24h")
  3. Remove the runc-architect runtime class (requires pod restart)

Q: How do I calculate my actual cost savings?

A: Track these metrics:

# Provisioned resources per container
kubectl get pods -o json | jq -r '.items[] | .metadata.name as $pod | .spec.containers[] | "\($pod)/\(.name): cpu=\(.resources.requests.cpu // "0") memory=\(.resources.requests.memory // "0")"'

# CPU actually being used (hibernated pods report ~0; sums the millicore column)
kubectl top pods --no-headers | awk '{sum+=$2} END {print sum "m"}'

# Savings = (Provisioned - Actual) * Cloud provider rates

Q: Where are checkpoints cached?

A: Checkpoints are cached on each node that architectd runs on, at /root/.local/state/architect/.

Support and Resources

  • Documentation: This guide and architecture documentation
  • Support: Contact Loophole Labs support team
  • License: Contact admin@loopholelabs.io for enterprise installations

Conclusion

Architect for Kubernetes fundamentally changes the economics of running Kubernetes workloads. By eliminating idle resource consumption while maintaining instant availability, you can:

  • Overprovision for peak capacity without cost penalties
  • Reduce infrastructure spend by 30-80%
  • Maintain or improve application performance
  • Simplify capacity planning and autoscaling

Next Steps

  1. Start Small: Deploy Architect in a development or staging environment first
  2. Test Compatibility: Verify your applications checkpoint and restore correctly
  3. Monitor Savings: Track resource consumption before and after enabling Architect
  4. Optimize Timeouts: Fine-tune idle timeouts based on your traffic patterns

Getting Help

  • Review the architecture documentation for deep technical details
  • Contact support for assistance with specific use cases
  • Join our community discussions for tips and best practices

Start with a small subset of workloads, measure the benefits, and gradually expand your Architect deployment for maximum savings.