Add "cross-emulation" support #63
Draft
albertofaria wants to merge 8 commits into main from cross-emulation
Commits
7a649c3 Tolerate images with entrypoint /sbin/init and similar (albertofaria)
4b98cd9 Add support for running bootc bootable containers (albertofaria)
9facde0 Cache VM images generated from bootc container images (albertofaria)
ebb134f Improve engine detection logic (albertofaria)
1c63aee Extend bootc container support to Docker (albertofaria)
5c399d6 Add --bootc-disk-size option (albertofaria)
8944e2d tests/env.sh: Expose TEST_ID variable to tests (albertofaria)
dc4616c create: Auto-detect image architecture (albertofaria)
@@ -0,0 +1,88 @@
{
  "ociVersion": "1.0.0",
  "process": {
    "terminal": true,
    "user": { "uid": 0, "gid": 0 },
    "args": ["/output/entrypoint.sh", "<IMAGE_NAME>"],
    "env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "TERM=xterm"
    ],
    "cwd": "/",
    "capabilities": {
      "bounding": [],
      "effective": [],
      "inheritable": [],
      "permitted": [],
      "ambient": []
    },
    "rlimits": [
      {
        "type": "RLIMIT_NOFILE",
        "hard": 262144,
        "soft": 262144
      }
    ],
    "noNewPrivileges": true
  },
  "root": {
    "path": "<ORIGINAL_ROOT>",
    "readonly": false
  },
  "hostname": "bootc-install",
  "mounts": [
    {
      "type": "bind",
      "source": "<PRIV_DIR>/root/crun-vm/bootc",
      "destination": "/output",
      "options": ["bind", "rprivate", "rw"]
    },
    {
      "destination": "/proc",
      "type": "proc",
      "source": "proc"
    },
    {
      "destination": "/dev/pts",
      "type": "devpts",
      "source": "devpts",
      "options": [
        "nosuid",
        "noexec",
        "newinstance",
        "ptmxmode=0666",
        "mode=0620",
        "gid=5"
      ]
    }
  ],
  "linux": {
    "namespaces": [
      { "type": "pid" },
      { "type": "network" },
      { "type": "ipc" },
      { "type": "uts" },
      { "type": "cgroup" },
      { "type": "mount" }
    ],
    "maskedPaths": [
      "/proc/acpi",
      "/proc/asound",
      "/proc/kcore",
      "/proc/keys",
      "/proc/latency_stats",
      "/proc/timer_list",
      "/proc/timer_stats",
      "/proc/sched_debug",
      "/sys/firmware",
      "/proc/scsi"
    ],
    "readonlyPaths": [
      "/proc/bus",
      "/proc/fs",
      "/proc/irq",
      "/proc/sys",
      "/proc/sysrq-trigger"
    ]
  }
}
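The `<IMAGE_NAME>`, `<ORIGINAL_ROOT>`, and `<PRIV_DIR>` tokens in the config above are placeholders that a script later in this diff fills in with `sed "s|...|...|"`. The sketch below is not part of the PR; it shows one way such a substitution could be hardened against values that contain `|`, `&`, or a backslash. The function names `escape_sed_replacement` and `fill_placeholder` are hypothetical, and the sketch assumes values without newlines and does not JSON-escape them.

```shell
# Escape characters that are special in the replacement part of a sed
# `s|...|...|` command: backslash, ampersand, and the `|` delimiter itself.
escape_sed_replacement() {
    printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Replace every occurrence of a literal placeholder (e.g. <PRIV_DIR>) in a file.
# Assumes the placeholder contains no regex metacharacters and no `|`.
fill_placeholder() {
    file=$1
    placeholder=$2
    value=$( escape_sed_replacement "$3" )

    # Write to a temporary file first, then copy back so that the original
    # file keeps its permissions.
    tmp=$( mktemp )
    if sed "s|$placeholder|$value|g" "$file" >"$tmp"; then
        cat "$tmp" >"$file"
    fi
    rm -f "$tmp"
}
```

A call such as `fill_placeholder config.json "<PRIV_DIR>" "$priv_dir"` would then behave like the PR's `__sed` helper for ordinary paths while tolerating the special characters listed above.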
@@ -0,0 +1,51 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0-or-later

set -e

image_name=$1

# monkey-patch loopdev partition detection, given we're not running systemd
# (bootc runs `udevadm settle` as a way to wait until loopdev partitions are
# detected; we hijack that call and use partx to set up the partition devices)

original_udevadm=$( which udevadm )

mkdir -p /output/bin

cat >/output/bin/udevadm <<EOF
#!/bin/sh
${original_udevadm@Q} "\$@" && partx --add /dev/loop0
EOF

chmod +x /output/bin/udevadm

# default to an xfs root file system if there is no bootc config (some images
# don't currently provide any, for instance quay.io/fedora/fedora-bootc:40)

if ! find /usr/lib/bootc/install -mindepth 1 -maxdepth 1 | read; then
    # /usr/lib/bootc/install is empty

    cat >/usr/lib/bootc/install/00-crun-vm.toml <<EOF
[install.filesystem.root]
type = "xfs"
EOF

fi

# build disk image using bootc-install

PATH=/output/bin:$PATH bootc install to-disk \
    --source-imgref docker-archive:/output/image.docker-archive \
    --target-imgref "$image_name" \
    --skip-fetch-check \
    --generic-image \
    --via-loopback \
    --karg console=tty0 \
    --karg console=ttyS0 \
    --karg selinux=0 \
    /output/image.raw

# communicate success by creating a file, since krun always exits successfully

touch /output/bootc-install-success
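The script above treats the bootc install config directory as empty when `find ... | read` reads nothing. The following sketch, which is not part of the PR, isolates that check so it can be tried on any directory. The function names `dir_is_empty` and `write_default_root_fs` are hypothetical. Unlike the script, the sketch gives `read` a variable name, which shells stricter than bash require, and creates the directory if it is missing; both are assumptions about what a more defensive variant might do.

```shell
# Succeeds when the directory has no entries, or does not exist at all.
# -mindepth/-maxdepth are find extensions available in GNU and BSD find.
dir_is_empty() {
    ! find "$1" -mindepth 1 -maxdepth 1 2>/dev/null | read -r _
}

# Writes a default root file system config only when the directory holds none.
write_default_root_fs() {
    config_dir=$1
    if dir_is_empty "$config_dir"; then
        mkdir -p "$config_dir"
        printf '[install.filesystem.root]\ntype = "xfs"\n' \
            >"$config_dir/00-crun-vm.toml"
    fi
}
```

Calling `write_default_root_fs /usr/lib/bootc/install` would mirror the behavior in the script; an image that already ships its own config file is left untouched.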
@@ -0,0 +1,127 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0-or-later

set -o errexit -o pipefail -o nounset

engine=$1
container_id=$2
original_root=$3
priv_dir=$4
disk_size=$5

__step() {
    printf "\033[36m%s\033[0m\n" "$*"
}

bootc_dir=$priv_dir/root/crun-vm/bootc

mkfifo "$bootc_dir/progress"
exec > "$bootc_dir/progress" 2>&1

# this blocks here until the named pipe above is opened by entrypoint.sh

# get info about the container *image*

image_info=$(
    "$engine" container inspect \
        --format '{{.Config.Image}}'$'\t''{{.Image}}' \
        "$container_id"
    )

image_name=$( cut -f1 <<< "$image_info" )
# image_name=${image_name#sha256:}

image_id=$( cut -f2 <<< "$image_info" )

# check if VM image is cached

container_name=crun-vm-$container_id

cache_image_label=containers.crun-vm.from=$image_id
cache_image_id=$( "$engine" images --filter "label=$cache_image_label" --format '{{.Id}}' )

if [[ -n "$cache_image_id" ]]; then

    # retrieve VM image from cached containerdisk

    __step "Retrieving cached VM image..."

    trap '"$engine" rm --force "$container_name" >/dev/null 2>&1 || true' EXIT

    "$engine" create --quiet --name "$container_name" "$cache_image_id" >/dev/null
    "$engine" export "$container_name" | tar -C "$bootc_dir" -x image.qcow2
    "$engine" rm "$container_name" >/dev/null 2>&1

    trap '' EXIT

else

    __step "Converting $image_name into a VM image..."

    # save container *image* as an archive

    echo -n 'Preparing container image...'

    "$engine" save --output "$bootc_dir/image.docker-archive" "$image_id" 2>&1 \
        | sed -u 's/.*/./' \
        | stdbuf -o0 tr -d '\n'

    echo

    # adjust krun config

    __sed() {
        sed -i "s|$1|$2|" "$bootc_dir/config.json"
    }

    __sed "<IMAGE_NAME>" "$image_name"
    __sed "<ORIGINAL_ROOT>" "$original_root"
    __sed "<PRIV_DIR>" "$priv_dir"

    # run bootc-install under krun

    if [[ -z "$disk_size" ]]; then
        container_image_size=$(
            "$engine" image inspect --format '{{.VirtualSize}}' "$image_id"
            )

        # use double the container image size to allow for in-place updates
        disk_size=$(( container_image_size * 2 ))

        # round up to 1 MiB
        alignment=$(( 2**20 ))
        disk_size=$(( (disk_size + alignment - 1) / alignment * alignment ))
    fi

    truncate --size "$disk_size" "$bootc_dir/image.raw"

    trap 'krun delete --force "$container_name" >/dev/null 2>&1 || true' EXIT
    krun run --config "$bootc_dir/config.json" "$container_name" </dev/ptmx
    trap '' EXIT

    [[ -e "$bootc_dir/bootc-install-success" ]]

    # convert image to qcow2 to get a lower file length

    qemu-img convert -f raw -O qcow2 "$bootc_dir/image.raw" "$bootc_dir/image.qcow2"
    rm "$bootc_dir/image.raw"

    # cache VM image file as containerdisk

    __step "Caching VM image as a containerdisk..."

    id=$(
        "$engine" build --quiet --file - --label "$cache_image_label" "$bootc_dir" <<-'EOF'
FROM scratch
COPY image.qcow2 /
ENTRYPOINT ["no-entrypoint"]
EOF
        )

    echo "Stored as untagged container image with ID $id"

fi

__step "Booting VM..."

touch "$bootc_dir/success"
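When no disk size is given, the script above defaults to twice the container image size, rounded up to a multiple of 1 MiB. The sketch below, which is not part of the PR, pulls that arithmetic into a standalone function so the rounding can be checked by hand. The name `default_disk_size` is hypothetical, and the constant 1048576 is written out instead of `2**20` so that the sketch also runs in shells without the `**` operator.

```shell
# Doubles a size in bytes and rounds the result up to a multiple of 1 MiB.
default_disk_size() {
    alignment=1048576  # 1 MiB, i.e. 2**20 bytes
    doubled=$(( $1 * 2 ))
    # Adding (alignment - 1) before the integer division rounds up.
    echo $(( (doubled + alignment - 1) / alignment * alignment ))
}

default_disk_size 1500000000  # prints 3001024512, which is 2862 MiB
```

For a 1,500,000,000-byte image, doubling gives 3,000,000,000 bytes, which lies between 2861 MiB and 2862 MiB, so the result is 2862 MiB. A size that is already aligned after doubling, such as 524288 bytes, stays at exactly 1 MiB.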
File renamed without changes.
File renamed without changes.
@@ -11,6 +11,7 @@ prepare:
- cargo
- coreutils
- crun
- crun-krun
- docker
- genisoimage
- grep
What's the bootable containers blocker at present for --emulated? Just curious...
I wonder whether we could apply the --emulated flag automatically when we detect that we are not on the same CPU architecture. I think podman's qemu-user-static functionality does this, and it makes sense.
There's another use case where --emulated could potentially be useful: when EL2/KVM (or /dev/kvm) is not available. But it's probably not worth applying --emulated automatically in that case, because it's so much slower than KVM or than running containers with plain old crun.
We currently use libkrun to run a micro VM that generates the VM disk image from the bootable container. Since libkrun relies on KVM, this only works if the bootable container's arch is the same as the host's. We could potentially use some qemu-based micro VM alternative instead of libkrun to lift this limitation.
Applying --emulated automatically could make sense. It would make --emulated's meaning more complicated and less intuitive, though. Right now the user knows that --emulated = use emulation, no --emulated = use KVM, without having to think about what the host and VM arches are.
Yes, when KVM is not available I think it's best to fail, alerting the user to that fact instead of silently using slow emulation.
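The policy described in this reply can be sketched as a small decision function: an explicit --emulated always means emulation, and without it the only option is KVM, with a hard failure when KVM cannot be used. This is a hypothetical illustration in shell, not crun-vm's implementation. The names `normalize_arch` and `choose_mode`, the yes/no arguments, and the error messages are all assumptions, as is the choice to fail on an architecture mismatch.

```shell
# Map common aliases to one spelling so that "amd64" and "x86_64" compare equal.
normalize_arch() {
    case $1 in
        amd64|x86_64) echo x86_64 ;;
        arm64|aarch64) echo aarch64 ;;
        *) echo "$1" ;;
    esac
}

# choose_mode EMULATED HOST_ARCH IMAGE_ARCH KVM_AVAILABLE
# Prints "emulation" or "kvm"; fails with a message instead of silently
# falling back to emulation.
choose_mode() {
    emulated=$1
    host_arch=$( normalize_arch "$2" )
    image_arch=$( normalize_arch "$3" )
    kvm_available=$4

    if [ "$emulated" = yes ]; then
        echo emulation
    elif [ "$host_arch" != "$image_arch" ]; then
        echo "error: image is $image_arch but host is $host_arch; pass --emulated" >&2
        return 1
    elif [ "$kvm_available" != yes ]; then
        echo "error: KVM is not available; pass --emulated to use slow emulation" >&2
        return 1
    else
        echo kvm
    fi
}
```

A caller could supply the host values with `uname -m` and a test such as `[ -w /dev/kvm ]`. The automatic behavior suggested in the review would instead replace the second branch with `echo emulation`, which is exactly the implicit switch the reply argues against.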