Glossary
Short definitions of the terms used throughout these docs.
Admission controller
The component that configures a pod to work with Architect. When you create a pod
that requests Architect's runtime class, Kubernetes hands it to the admission
controller first, which reads your annotations and adjusts the pod so its
containers can be hibernated and restored. It runs as a Deployment named
architect-admission-controller, in the architect namespace, on a
critical node.
architectd (daemon)
The part of Architect that runs on every worker node and does the actual hibernating, waking, and migrating of that node's containers. It runs as a DaemonSet, one copy per node; if it stops on a node, the containers there keep running but cannot hibernate or wake until it comes back.
Checkpoint
A point-in-time snapshot of a running container's memory and process state. A checkpoint can be written to disk as a checkpoint file, or exist only ephemerally in memory. Restoring one resumes the container exactly where it left off, with no cold start.
Checkpoint engine
The component that creates a checkpoint (and restores it again later). This is an internal detail: in normal use you don't need to choose or think about the engine. Architect uses CRIU today and is moving to its own engine, Cruise, over time.
Cold start
The delay when a container starts from scratch and has to re-initialize: load code, warm caches, and rebuild in-memory state. Restoring from a checkpoint skips this, because the container resumes with that work already done.
Control plane
Architect's central coordinator. It keeps checkpoint handoffs between nodes
consistent during hibernation and migration, so container state can move around
the cluster safely. It runs as a Deployment named architect-control-plane, in
the architect namespace, on a critical node.
Critical node
A node labeled architect.loopholelabs.io/critical-node=true, where Architect's
control plane and admission controller run. Choose long-lived, on-demand nodes
for this role, not spot or preemptible capacity and not nodes you routinely
drain or autoscale away, so that these components stay available.
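As a rough illustration, the label would appear on a Node object as sketched below. Only the label key and value come from the definition above; the node name is a hypothetical placeholder.

```yaml
# Sketch of a Node object carrying the critical-node label.
# The node name is a hypothetical placeholder.
apiVersion: v1
kind: Node
metadata:
  name: on-demand-node-1
  labels:
    # Label values are strings, so "true" is quoted.
    architect.loopholelabs.io/critical-node: "true"
```

In practice a label like this is usually applied with kubectl label node <node-name> architect.loopholelabs.io/critical-node=true rather than by editing the Node manifest.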
Health-check proxy
A helper that answers a container's liveness and readiness probes while that container is hibernated or migrating, so Kubernetes does not mark it unhealthy and the probes themselves do not keep waking it. The admission controller injects it as a sidecar into the managed pod, where it reads container state from the shim.
Hibernate (scale-down)
Checkpointing an idle managed container and dropping its
pod's CPU and memory requests to zero, while the pod stays scheduled and keeps its
IP, Services, and volumes. Architect hibernates a container after it has been idle
long enough. Its status label reads SCALED_DOWN while it is
hibernated.
Lazy-pages migration
A faster form of migration for containers with a large memory footprint: instead of copying all the memory up front, the destination node fetches each page (a small block of memory) from the source only as the container first touches it.
Managed container
A container that Architect hibernates and restores. You choose which containers are managed; any you leave out (for example a logging sidecar) run normally and are never hibernated.
Migration
Moving a managed container's in-memory state to its replacement on another node when Kubernetes replaces the pod, for example during a node drain, rolling update, or spot interruption. Architect checkpoints the container on the old node and restores it in the new pod, so it keeps its running state instead of starting cold. Only managed containers migrate this way.
PersistentCheckpoint
A checkpoint you create deliberately and keep, captured while the container keeps running. It saves a known-good or pre-warmed state (a golden image) that other pods can start from to skip a slow cold start. It is stored on the node by default, or in S3-compatible object storage when you configure it. Unlike the checkpoints Architect takes automatically when hibernating, a PersistentCheckpoint stays until you delete it.
Restore
Re-creating a container from a checkpoint so it resumes with its previous memory and running state instead of starting fresh. Both waking and migration end in a restore.
Router (and router-shim)
The components that route network traffic to managed pods and buffer it during a
migration. The router runs as a DaemonSet named architect-router, in the
architect namespace, with the router-shim as a sidecar in its pods. They are
installed only when the experimental traffic-buffering feature is enabled in the
Helm values.
Runtime class (runc-architect)
The Kubernetes RuntimeClass that opts a pod into Architect. Setting
runtimeClassName: runc-architect on a pod is what tells Architect to manage it.
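A minimal sketch of a pod that opts in this way is shown below. The pod name, container name, and image are hypothetical placeholders; only the runtimeClassName value comes from this glossary. The annotations that select which containers are managed are left out because their keys are not given here.

```yaml
# Minimal sketch of a pod that requests Architect's runtime class.
# Pod name, container name, and image are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  # This field is what opts the pod into Architect.
  runtimeClassName: runc-architect
  containers:
    - name: app
      image: registry.example.com/app:1.0
```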
Shadow port
An extra port Architect exposes, named shadow-<port>, so health probes or
metrics scrapers can reach a container without that traffic counting as activity
that would wake it. It is configured with the shadow-ports annotation.
Shim (shim-runc)
Architect's shim is a containerd shim: a small program that sits between containerd (the container runtime Kubernetes drives) and runc (the low-level tool that actually starts and stops containers). Architect's version adds checkpoint and restore at that layer, so a container can be hibernated and woken with no changes to your workload. It runs on each node and coordinates with architectd on that node.
Status label
The label status.architect.loopholelabs.io/<container> that Architect sets to
RUNNING or SCALED_DOWN to show whether each
managed container is currently awake or hibernated.
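For example, a pod whose managed container is hibernated might carry metadata like the fragment below. The container name "app" is a hypothetical placeholder standing in for <container>.

```yaml
# Fragment of a pod's metadata while its managed container is hibernated.
# The container name "app" is a hypothetical placeholder.
metadata:
  labels:
    status.architect.loopholelabs.io/app: SCALED_DOWN
```

Once the container wakes, the same label would read RUNNING.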
Wake (scale-up)
Restoring a hibernated container and reinstating its pod's CPU and memory
requests. A wake is triggered by a kubectl exec, or by incoming network traffic
if network-based wake is enabled. Afterward, the status label returns to
RUNNING.