Annotations
Pod annotations control per-workload behavior. Set them on the pod template of a
workload whose pods set runtimeClassName: runc-architect. Each entry lists its
default and what it requires.
managed-containers
architect.loopholelabs.io/managed-containers: '["container-1", "container-2"]'
Which containers Architect manages. Unlisted containers run normally.
Default: none. · Requires: runtimeClassName: runc-architect.
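As a rough sketch of where these annotations live, the manifest below shows a Deployment whose pod template opts in to Architect and lists one managed container. The Deployment name, labels, container names, and images are hypothetical; only the annotation key and the runtimeClassName value come from this page.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        # JSON array, single-quoted so YAML reads it as one string.
        architect.loopholelabs.io/managed-containers: '["app"]'
    spec:
      runtimeClassName: runc-architect
      containers:
        - name: app              # listed above, so Architect manages it
          image: registry.example.com/app:1.0           # hypothetical image
        - name: log-shipper      # not listed, so it runs normally
          image: registry.example.com/log-shipper:1.0   # hypothetical image
```

The annotation values are JSON encoded inside a YAML string, so quoting the whole value avoids YAML interpreting the brackets and braces itself.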
scaledown-durations
architect.loopholelabs.io/scaledown-durations: '{"container-1":"30s", "container-2":"60s"}'
Idle time before a container hibernates.
Default: 60s. · Requires: managed-containers.
initial-scaledown-delays
architect.loopholelabs.io/initial-scaledown-delays: '{"container-1":"90s"}'
Grace period (a Go duration string) that suppresses hibernation for the
configured duration after the container's first scale-up. Useful for slow-starting workloads
(for example JVMs whose readiness probes take longer than scaledown-durations)
so they are not hibernated mid-startup. Normal activity-based scale-down resumes
after the window elapses. The window is not re-armed after a migration or
post-scale-down restart, since the workload is already past its slow startup by
then.
Default: 0 (disabled), values clamped to 24h. · Requires: managed-containers.
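For illustration, a slow-starting workload might combine a short idle timeout with a longer initial grace period. This is a minimal sketch of a pod template fragment; the container name, image, and the chosen durations are assumptions, not recommended values.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["jvm-app"]'
    # Hibernate after 30s without activity.
    architect.loopholelabs.io/scaledown-durations: '{"jvm-app":"30s"}'
    # Suppress hibernation for 2 minutes after the first scale-up,
    # so the JVM can finish starting before the idle timer applies.
    architect.loopholelabs.io/initial-scaledown-delays: '{"jvm-app":"2m"}'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: jvm-app                                   # hypothetical container
      image: registry.example.com/jvm-app:1.0         # hypothetical image
```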
network-monitor
architect.loopholelabs.io/network-monitor: '{"container-1":"packets", "container-2":"connections"}'
Enables network-based wake: a scaled-down container wakes when it receives
network traffic. An eBPF program in the pod's network namespace watches the
container's declared ports and triggers a scale-up. Without this annotation, the
only way to wake a scaled-down container is kubectl exec.
Modes:
packets: wake on any incoming TCP/UDP packet on a tracked port. Suits sporadic request/response workloads such as HTTP APIs and webhook receivers.
connections: TCP only. Wake on connection establishment and stay awake while any TCP connection is open. Suits long-lived connection patterns such as databases, message brokers, and gRPC servers. A client that holds a pooled connection open indefinitely keeps the container awake.
Activity is tracked per port. Architect monitors only the ports the container
declares in its ports array. Shadow ports injected by
health-check-proxy and shadow-ports
are added to that array so Kubernetes Services can target them, but Architect
ignores traffic on them when assessing activity. The traffic still reaches the
application; it just does not keep the container running.
Activity is also scoped per container. Sidecars sharing the pod's network
namespace (Istio sidecars, fluentd, and the like) do not keep the managed
container awake, and outbound traffic from an ephemeral source port does not
count. A workload that only does outbound traffic from ephemeral ports should
use disable-autoscaledown-containers.
Default: off. · Requires: managed-containers.
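Because only declared ports are monitored, the annotation is usually paired with an explicit ports array. The fragment below is a sketch under that reading; the container names, images, and port numbers are hypothetical.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["api", "broker"]'
    # One mode per container: packets for request/response traffic,
    # connections for long-lived TCP sessions.
    architect.loopholelabs.io/network-monitor: '{"api":"packets", "broker":"connections"}'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: api
      image: registry.example.com/api:1.0        # hypothetical image
      ports:
        - containerPort: 8080                    # declared, so traffic here can wake "api"
    - name: broker
      image: registry.example.com/broker:1.0     # hypothetical image
      ports:
        - containerPort: 5672                    # declared, so a new TCP connection wakes "broker"
```

A port the application listens on but does not declare in ports would not be tracked under the rules above.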
health-check-proxy
architect.loopholelabs.io/health-check-proxy: '{"mappings":[{"containerName":"app","appPort":8080,"shadowPort":9080}]}'
Lets kubelet liveness, readiness, and startup probes pass while the container is
scaled down, without waking it. Probes are pointed at the shadowPort; Architect
injects an architect-health-check-proxy sidecar that forwards probes to the
application while it runs and answers them itself while it is scaled down, so
kubelet keeps seeing a healthy response. Without this, every probe hits the
application port and counts as activity, so a probed container never scales down.
Mapping fields:
containerName (required): a container in managed-containers.
appPort (required, 1 to 65535): the application's real probe port.
shadowPort (required, 1 to 65535): the port to point probes at.
Duplicate shadowPort values across mappings are dropped with a warning. The
sidecar is not added (and a warning is logged) if managed-containers or
network-monitor is missing.
See Examples for a worked example, and Troubleshooting if probes still wake the container.
Default: none. · Requires: managed-containers, network-monitor.
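The following fragment sketches how the mapping and the probes might fit together, reusing the 8080 to 9080 mapping from the annotation above. The image and the /healthz probe path are assumptions for illustration.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["app"]'
    architect.loopholelabs.io/network-monitor: '{"app":"packets"}'
    architect.loopholelabs.io/health-check-proxy: '{"mappings":[{"containerName":"app","appPort":8080,"shadowPort":9080}]}'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      ports:
        - containerPort: 8080                # the real application port
      readinessProbe:
        httpGet:
          path: /healthz                     # hypothetical probe path
          port: 9080                         # shadowPort, not the application port
      livenessProbe:
        httpGet:
          path: /healthz
          port: 9080
```

Pointing a probe at 8080 instead would count as activity and keep the container from scaling down, as described above.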
shadow-ports
architect.loopholelabs.io/shadow-ports: '{"mappings":[{"containerName":"app","appPort":9090,"shadowPort":29090}]}'
Lets a scraper (Prometheus, an external health check, a debug tool) reach an
application port without counting as activity, so regular scrapes do not keep the
container awake. The scraper is pointed at the shadowPort; traffic still reaches
the application on the real port, and the application is unaware of the redirect.
Without this, a recurring scrape looks like continuous traffic and the container
never scales down.
Mapping fields:
containerName (required): a container in managed-containers.
appPort (required, 1 to 65535): the real port the application listens on.
shadowPort (required, 1 to 65535): the port to point the scraper at.
Duplicate shadowPort values are dropped with a warning. The shadow ports are
not added (and a warning is logged) if managed-containers or network-monitor
is missing. When the scraper cannot be moved to a different port (for example it
is hard-coded in Prometheus discovery), use
ignore-activity-ports instead.
See Examples for a worked example, and Troubleshooting if scrapes still wake the container.
Default: none. · Requires: managed-containers, network-monitor.
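One way to route a scraper to the shadow port is through a dedicated Service, since injected shadow ports can be targeted by Services. The sketch below reuses the 9090 to 29090 mapping from the annotation above; the Service name, labels, and image are hypothetical.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  labels:
    app: example-app
  annotations:
    architect.loopholelabs.io/managed-containers: '["app"]'
    architect.loopholelabs.io/network-monitor: '{"app":"packets"}'
    architect.loopholelabs.io/shadow-ports: '{"mappings":[{"containerName":"app","appPort":9090,"shadowPort":29090}]}'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      ports:
        - containerPort: 9090                # real metrics port
---
# Hypothetical Service that the scraper uses instead of the real port.
apiVersion: v1
kind: Service
metadata:
  name: example-app-metrics
spec:
  selector:
    app: example-app
  ports:
    - name: metrics
      port: 9090
      targetPort: 29090                      # shadowPort: reaches the app without counting as activity
```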
ignore-activity-ports
architect.loopholelabs.io/ignore-activity-ports: '{"container-1":[9091, 9100]}'
Marks specific ports on the container's existing port spec as conntrack-bypassed,
so traffic to them does not count as activity. Unlike shadow-ports
there is no DNAT and no new port is injected; the operator asserts that the listed
ports are already declared on the container and the application already listens on
them. Use this when a metrics scraper hits the real application port directly and
should not keep the workload awake.
Default: none. · Requires: managed-containers, network-monitor.
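A minimal sketch of this setup, assuming a container that serves application traffic on 8080 and metrics on 9091. The container name, image, and port numbers are hypothetical; the point is that the ignored port is already declared on the container.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["app"]'
    architect.loopholelabs.io/network-monitor: '{"app":"packets"}'
    # Traffic to 9091 reaches the application but does not count as activity.
    architect.loopholelabs.io/ignore-activity-ports: '{"app":[9091]}'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      ports:
        - containerPort: 8080                # application traffic, still tracked
        - containerPort: 9091                # metrics, listed in ignore-activity-ports
```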
postmigration-autoscaleup-containers
architect.loopholelabs.io/postmigration-autoscaleup-containers: '["container-1"]'
Containers that automatically scale up after migration. By default they stay hibernated to avoid a thundering herd.
Default: off (containers stay hibernated after migration). · Requires: managed-containers.
disable-autoscaledown-containers
architect.loopholelabs.io/disable-autoscaledown-containers: '["container-1"]'
Prevents automatic hibernation. Useful for background jobs that should migrate but not hibernate on idle.
Default: off (containers hibernate on idle). · Requires: managed-containers.
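As an illustration, a background worker that only makes outbound calls might be configured as follows. This is a sketch with a hypothetical container name and image; pairing it with postmigration-autoscaleup-containers is an assumption about what such a worker would want, not a requirement stated here.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["worker"]'
    # Never hibernate on idle; the worker has no inbound traffic to wake it.
    architect.loopholelabs.io/disable-autoscaledown-containers: '["worker"]'
    # Resume running right after a migration instead of staying hibernated.
    architect.loopholelabs.io/postmigration-autoscaleup-containers: '["worker"]'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # hypothetical image
```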
scaleup-timeout-containers
architect.loopholelabs.io/scaleup-timeout-containers: '{"container-1": "60s"}'
How long to wait for a checkpoint during startup.
Default: 30s.
migrate-emptydir-containers
architect.loopholelabs.io/migrate-emptydir-containers: '["container-1"]'
Preserves emptyDir volume data during migration. By default, emptyDir volumes are not migrated.
Default: off (emptyDir not migrated). · Requires: managed-containers.
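A sketch of a pod template that opts one container's emptyDir data in to migration. The volume name, mount path, container name, and image are hypothetical.

```yaml
# Fragment of a workload's spec.template; other fields omitted.
metadata:
  annotations:
    architect.loopholelabs.io/managed-containers: '["app"]'
    # Carry this container's emptyDir contents across a migration.
    architect.loopholelabs.io/migrate-emptydir-containers: '["app"]'
spec:
  runtimeClassName: runc-architect
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      volumeMounts:
        - name: scratch
          mountPath: /var/lib/app/scratch    # hypothetical path
  volumes:
    - name: scratch
      emptyDir: {}
```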
sparse-files-containers
architect.loopholelabs.io/sparse-files-containers: '{"container-1": ["/var/cache/app.db"]}'
Recreates the listed files as sparse files (same size and mode, contents zeroed) at the destination instead of copying their bytes through the upper-layer snapshot, and skips them on the source so the migration avoids the per-byte snapshot cost. Use for workloads that re-scan or rewrite the file post-restore (caches, generated artifacts, scratch space). Workloads that read the original contents after migration see zeros.
Default: none.
lazy-pages-migration-containers
Experimental. Only enable when advised by Loophole Labs.
architect.loopholelabs.io/lazy-pages-migration-containers: '["container-1"]'
Enables CRIU lazy-pages migration, fetching memory pages on demand from the source pod during restore instead of copying everything upfront. Helps with memory-heavy containers. Falls back to eager migration if lazy-pages migration fails.
Default: off.
lazy-pages-restore-timeout-containers
Experimental. Only enable when advised by Loophole Labs.
architect.loopholelabs.io/lazy-pages-restore-timeout-containers: '{"container-1":"30s"}'
Bounds how long a lazy-pages restore waits for memory pages from the source before falling back to a fresh start. Useful when the source page-server is unreachable but the underlying TCP connection appears healthy. Values are Go duration strings.
Default: 0 (disabled), clamped to 24h.
rewrite-listener-addresses-containers
architect.loopholelabs.io/rewrite-listener-addresses-containers: '["container-1"]'
Rewrites listener socket addresses in CRIU checkpoints during migration. When an
application binds to the pod IP (rather than 0.0.0.0), the listener address
becomes invalid on the destination pod. This rewrites those addresses to
INADDR_ANY (0.0.0.0) or in6addr_any (::) so the restore succeeds.
Default: off.
rewrite-established-addresses-containers
architect.loopholelabs.io/rewrite-established-addresses-containers: '["container-1"]'
Rewrites the source IP of established TCP connections in CRIU checkpoints during
migration. The source pod's IP no longer exists on the destination pod, which
causes CRIU's socket restore to fail. This rewrites the source address to the new
pod's IP (read from /etc/hosts). Supports IPv4 and IPv6.
Default: off.
start-from-persistent-checkpoint
# Same namespace (name only):
architect.loopholelabs.io/start-from-persistent-checkpoint: "persistent-checkpoint-name"
# Cross-namespace (namespace/name):
architect.loopholelabs.io/start-from-persistent-checkpoint: "namespace/persistent-checkpoint-name"
Restore from a PersistentCheckpoint CRD on startup. With a bare name the
PersistentCheckpoint is looked up in the pod's namespace; use namespace/name
to reference one in a different namespace. When set, this takes priority over
pod-template-hash-based Checkpoint CRDs: on any failure (not found, empty,
download error, registry storage) the pod starts fresh rather than falling back
to the migration path.
Default: none.
checkpoint-engine
Experimental. Only enable when advised by Loophole Labs.
architect.loopholelabs.io/checkpoint-engine: "cruise"
Selects the checkpoint/restore engine for the pod's managed containers. Set it to
cruise to route runc checkpoint/restore to the in-tree cruise engine instead of
CRIU. This is a pod-global setting; unmanaged containers in the pod are never
checkpointed.
Default: criu.