6. CRIU - Checkpointing and Restoring


With some help from a program called CRIU, Podman can checkpoint and restore containers on the same host. This can be useful for workloads that have a long startup period or need a long time to warm up caches. For example, large memcached servers, databases, or even Java workloads can take several minutes or even hours to reach maximum throughput. This is often referred to as cache warming.

If this doesn't quite make sense, let's talk about it in the context of container creation and deletion. Podman allows you to break the creation and deletion of containers down into very granular steps. Here's what the life cycle of a container looks like from start to finish:

1. podman pull - Pull the container image
2. podman create - Add tracking meta-data to /var/lib/containers or .local/share/containers
3. podman mount - Create a copy-on-write layer and mount the container image with a read/write layer above it
4. podman init - Create a config.json file
5. podman start - Run the workload by handing the config.json and root file system to runc
6. Workload runs either as a batch process or as a daemon
7. podman kill - Kill the process or processes in the container
8. podman rm - Unmount and delete the copy-on-write layer
9. podman rmi - Remove the image from /var/lib/containers or .local/share/containers

To understand CRIU, you need to understand the podman kill step (the seventh item above). When this step is executed, Podman sends a kill signal to the processes in the container. CRIU allows us to break this down even further, like this:

1. podman pull - Pull the container image
2. podman create - Add tracking meta-data to /var/lib/containers or .local/share/containers
3. podman mount - Create a copy-on-write layer and mount the container image with a read/write layer above it
4. podman init - Create a config.json file
5. podman start - Run the workload by handing the config.json and root file system to runc
6. Workload runs either as a batch process or as a daemon
7. podman checkpoint - Dump the contents of memory to disk and kill the processes
8. Workload process is no longer running; memory contents are saved on disk
9. podman restore - Restore the memory contents to new processes
10. Workload runs either as a batch process or as a daemon
11. podman kill - Kill the process or processes in the container
12. podman rm - Unmount and delete the copy-on-write layer
13. podman rmi - Remove the image from /var/lib/containers or .local/share/containers
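
The extended life cycle above can be sketched as a small shell script. This is only an illustration under stated assumptions: run_step, lifecycle, and DRY_RUN are hypothetical names (not part of Podman), and sleep 600 is a placeholder workload. By default the script only prints each command. Running the commands for real will likely need root, since checkpointing and podman mount are commonly restricted for rootless containers.

```shell
#!/bin/sh
# Sketch of the checkpoint/restore life cycle for one container.
# run_step, lifecycle, and DRY_RUN are hypothetical names, not part of Podman.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = also execute them

run_step() {
    # Always show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

lifecycle() {
    name="$1"
    image="$2"
    run_step podman pull "$image"
    # "sleep 600" is a placeholder workload for illustration only.
    run_step podman create --name "$name" "$image" sleep 600
    run_step podman mount "$name"
    run_step podman init "$name"
    run_step podman start "$name"
    # Dump memory to disk and stop the processes ...
    run_step podman container checkpoint "$name"
    # ... then bring them back from the saved memory image.
    run_step podman container restore "$name"
    run_step podman kill "$name"
    run_step podman rm "$name"
    run_step podman rmi "$image"
}

lifecycle looper ubi8
```

In everyday use, podman run or podman start performs the mount and init steps implicitly; they are spelled out here only to mirror the list.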

So, in a nutshell, CRIU gives you more flexibility with containerized processes. Let's see it in action. First, start a simple container which generates incrementing numbers so that we can verify memory contents are really restored:

Input

podman run -d --name looper ubi8 /bin/sh -c \
         'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

Error

# Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/001-rhel-shortnames.conf)
# Trying to pull registry.access.redhat.com/ubi8:latest...
# Getting image source signatures
# Checking if image destination supports signatures
# Copying blob 70de3d8fc2c6 done  
# Copying config 62ac1f7ef5 done  
# Writing manifest to image destination
# Storing signatures
# Error: runc: container_linux.go:349: starting container process caused "error adding seccomp filter rule for syscall bdflush: permission denied": OCI permission denied

Input (Correct)

podman run -d --name looper --privileged ubi8 /bin/sh -c \
         'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

Result

# Error: OCI runtime error: runc: container_linux.go:349: starting container process caused "unknown capability \"CAP_BPF\""

Now, verify that numbers are being generated. Run this a few times to see the numbers incrementing:

Input

podman logs -l

Result
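
Given the loop in the run command, a healthy container should log one number per line, each exactly one greater than the last. Rather than eyeballing the log, a minimal sketch of an automated check follows. is_incrementing is a hypothetical helper, not a Podman feature, and it assumes the log contains nothing but the counter values.

```shell
#!/bin/sh
# is_incrementing is a hypothetical helper, not part of Podman.
# It reads numbers on stdin, one per line, and succeeds only if there
# are at least two lines and each number is the previous one plus 1.
is_incrementing() {
    awk 'NR > 1 && $1 != prev + 1 { bad = 1 }
         { prev = $1 }
         END { exit (bad || NR < 2) }'
}

# Example with canned input instead of a live container:
if printf '0\n1\n2\n3\n' | is_incrementing; then
    echo "counter is advancing"
fi
```

Against a live container this could be used as podman logs looper | is_incrementing. Because a freshly started process would begin again at 0, the same check run after a restore should also help distinguish a restored process from a restarted one, assuming the log is preserved across the checkpoint.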

Now, let's dump the contents of memory to disk, and kill the process:

Input

podman container checkpoint -l

Result

# Error: "created" is not running, cannot checkpoint: container state improper
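
The error above appears because the container never reached the running state, so there are no processes for CRIU to dump. A minimal sketch of guarding the checkpoint with a state check follows. can_checkpoint and checkpoint_if_running are hypothetical helper names; the sketch assumes the container is named looper and that podman inspect reports its state under .State.Status.

```shell
#!/bin/sh
# can_checkpoint and checkpoint_if_running are hypothetical helpers,
# not part of Podman.

# Succeeds only when the given state string is "running".
can_checkpoint() {
    [ "$1" = "running" ]
}

# Checkpoint the named container only if it is currently running.
checkpoint_if_running() {
    name="$1"
    state="$(podman inspect --format '{{.State.Status}}' "$name")"
    if can_checkpoint "$state"; then
        podman container checkpoint "$name"
    else
        echo "container $name is \"$state\", not running; skipping checkpoint" >&2
        return 1
    fi
}

# Usage (requires Podman and an existing container):
#   checkpoint_if_running looper
```

Keeping the state comparison in its own function makes the decision easy to exercise without a container present.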

Verify that it's not running. Notice that the container is in the exited state. This means the copy-on-write layer for the container has not been deleted. Since we used the checkpoint sub-command, the contents of memory are also saved on disk:

Input

podman ps -a

Result

# CONTAINER ID  IMAGE                                     COMMAND               CREATED             STATUS             PORTS       NAMES
# a83dd382c16d  registry.fedoraproject.org/fedora:latest  bash                  27 minutes ago      Up 27 minutes ago              meta-data-container
# 592a894441ef  registry.access.redhat.com/ubi8:latest    /bin/sh -c i=0; w...  About a minute ago  Created                        looper

Verify that numbers are not being generated. Run this a few times to verify:

Input

podman logs -l

Result

Restore the container:

Input

podman container restore -l

Result

Verify the contents of memory and disk are being used and the numbers are incrementing again:

Input

podman logs -l

Result

We're all done, so clean up. This will kill the process, delete the contents of the copy-on-write layer, and remove all of the meta-data for all containers:

Input

podman kill -a
podman rm -a

Result

Conclusions

Checkpointing and restoring containers is easy with CRIU and Podman. As part of the container-tools application streams, specific versions of Podman and CRIU are tested and verified to work together (not all versions of Podman and CRIU are guaranteed to work together). Now,