pkg/skopeo: Extend Image struct and add cross-architecture operations#1784
pkg/skopeo: Extend Image struct and add cross-architecture operations#1784DaliborKr wants to merge 8 commits intocontainers:mainfrom
Conversation
The existing RunContextWithExitCode() wraps all errors from exec.Command in generic "failed to invoke" messages, which prevents callers from distinguishing between actual error types. Add RunContextWithExitCode2() and RunWithExitCode2() that return errors with their original types intact, including for ExitError. This allows callers to use errors.Is() and errors.As() to handle specific failure modes. For example, detecting a missing skopeo binary (exec.ErrNotFound) or an ENOEXEC error from inside non native containers, when an emulator is not set correctly. These new functions are meant to replace its original versions in the future. containers#1780 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
In /src/cmd/create.go, the same pattern of spinner creation and nil-safe stopping is repeated. Extract this into startSpinner() and stopSpinner() helper functions so that future callers can use spinners without duplicating the code. Replace the existing inline spinner code in createContainer() and pullImage() with calls to these new helpers. containers#1781 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
…atching Add IsSupportedDistroImage(), which iterates over all supported distros and checks if the image basename matches any of them. This will be used by the architecture resolution code to decide whether to parse architecture suffixes from image tags, as this should be done only for natively supported images [1]. [1] Toolbx supported distributions: https://containertoolbx.org/distros/ containers#1781 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
Introduce the architecture package that represents the core of the Toolbx cross-architecture support, which is based on user-mode emulation using QEMU and binfmt_misc. The Architecture struct collects all per-architecture data (ELF magic/mask, OCI and binfmt naming, aliases, binfmt registration parameters) into a single map. Architectures present in the supportedArchitectures map represent the set of supported architectures within Toolbx. Define architecture ID constants NotSpecified, Aarch64, Ppc64le, and X86_64, along with their supportedArchitectures entries. Add core query functions: - ParseArgArchValue() for resolving user-supplied architecture strings - GetArchNameBinfmt() and GetArchNameOCI() for name lookups (one architecture can have multiple valid names [1]) - HasContainerNativeArch() for comparing against the host - ImageReferenceGetArchFromTag() for extracting architecture from image tag suffixes like "42-aarch64" for architecture detection Expose the HostArchID package variable set by cmd/root.go at startup, and the Config struct for preserving the architecture ID and the QEMU emulator path, through the call chain. [1] https://itsfoss.com/arm-aarch64-x86_64/ containers#1782 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
Cross-architecture containers need QEMU binfmt_misc handlers registered within the container so that non-native architecture binaries can be executed via the host's kernel. Add the Registration struct that models a binfmt_misc registration entry, including name, magic type, offset, ELF magic/mask bytes, interpreter path, and flags. Add functions: - MountBinfmtMisc() to mount the sanboxed binfmt_misc filesystem inside a container, which enables setting the C flag in binfmt_misc registration without affecting the host system. The C flag presents a threat of privilege escalation when registered on the host, that why we want to have it isolated [1] - getDefaultRegistration() to fill a Registration struct containing all necessary binfmt_misc information taken from the architecture.supportedArchitectures data - RegisterBinfmtMisc() to write the registration string to /proc/sys/fs/binfmt_misc/register, which makes the non-native binary perception active - bytesToEscapedString() helper that formats byte slices into the \xHH-escaped string format required by the binfmt_misc register interface [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21ca59b365c0 containers#1782 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
Before creating or initializing a cross-architecture container, the system must be checked for the required QEMU emulator and binfmt_misc registration. This prevents users from creating or running non-native containers when their host system doesn't meet the requirements, and provides users with an informative error message referring to the problem. Add IsArchSupportedOnCreation(), which searches for a statically linked QEMU binary on the host using exec.LookPath() and verifies that a matching binfmt_misc registration exists. It returns the path to the QEMU binary for use during container creation, which is meant to be passed to the init-container and registered through sandboxed binfmt_misc within the container. Add IsArchSupportedOnInitialization() which performs similar checks from inside the container, looking at the interpreter path passed from the host and falling back to standard host-mounted locations under /run/host/usr/bin/. Add isStaticallyLinkedELF() helper that uses debug/elf to verify a binary is statically linked. Only a statically linked QEMU interpreter can be used, because a dynamically linked one would cause the kernel to attempt to resolve its host-native shared libraries (such as libc.so) within the container, resulting in an immediate crash. Add validateBinfmtRegistration(), which checks for the presence of qemu-<arch> entries in binfmt_misc (or qemu-<arch>-static, since it can differ based on the system). containers#1783 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
Add Architecture and NameFull fields to the Skopeo Image struct so that callers can inspect the architecture of a remote image. Move the image size computation from the /cmd layer into GetSize() and GetSizeHuman() methods on Image, since the skopeo package owns the layer data. Add VerifyArchitectureMatch() method to Image that validates the image's architecture field against an expected architecture ID. The purpose of this function is to check whether the image architecture matches the demanded architecture before it is pulled. Specifically, this verification applies to the images that support only a single architecture (they are not part of a multi-platform manifest list), because the skopeo inspect proceeds successfully even when the value of a flag --override-arch does not match the actual image architecture (for a multi-architecture image the skopeo inspect with not-matching --override-arch would fail). Like this, the user can be prevented from incompatible images. containers#1784 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
Change Inspect() to accept archID and authfile parameters. When the requested architecture differs from the host's, --override-arch is passed to skopeo, which then inspects the correct manifest in a multi-arch image (if it exists for the given architecture, otherwise the inspection fails). It also uses RunContextWithExitCode2() so callers can detect a missing skopeo binary via errors.Is(err, exec.ErrNotFound), which is only a soft dependency of the Toolbx package, as it is not required for running native containers. Add CopyOverrideArch(), which uses 'skopeo copy --override-arch' to pull a specific architecture variant of a multi-arch image into Podman's local container storage. This is used instead of 'podman pull' because Podman does not support pulling a foreign architecture image into a locally addressable name. The way in which the cross-arch extension chooses the name for non-native images (and also containers) is described in the discussion at [1] [1] containers/podman#27780 (comment) containers#1784 Signed-off-by: Dalibor Kricka <dalidalk@seznam.cz>
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive support for multi-architecture containers by adding a new architecture package for architecture detection and binfmt_misc registration. It refactors spinner logic, enhances skopeo integration to support architecture overrides and image size calculations, and adds utility functions for distro image validation. Feedback includes addressing a potential bug where unspecified architectures could pass invalid flags to skopeo, improving the safety of architecture name strings, refactoring duplicated shell execution logic to avoid non-descriptive naming, and adopting more idiomatic Go error returns for numeric values.
| if !architecture.HasContainerNativeArch(archID) { | ||
| archName := architecture.GetArchNameOCI(archID) | ||
| args = append(args, []string{"--override-arch", archName}...) | ||
| } |
There was a problem hiding this comment.
If archID is architecture.NotSpecified, HasContainerNativeArch will return false (assuming HostArchID is valid). This leads to GetArchNameOCI(NotSpecified) being called, which currently returns "arch_not_specified". Passing --override-arch arch_not_specified to skopeo will cause the command to fail. You should explicitly check that archID is not NotSpecified before adding the flag.
| if !architecture.HasContainerNativeArch(archID) { | |
| archName := architecture.GetArchNameOCI(archID) | |
| args = append(args, []string{"--override-arch", archName}...) | |
| } | |
| if archID != architecture.NotSpecified && !architecture.HasContainerNativeArch(archID) { | |
| archName := architecture.GetArchNameOCI(archID) | |
| args = append(args, []string{"--override-arch", archName}...) | |
| } |
| func GetArchNameOCI(arch int) string { | ||
| if arch == NotSpecified { | ||
| logrus.Warnf("Getting arch name for not specified architecture") | ||
| return "arch_not_specified" | ||
| } | ||
| if archObj, exists := supportedArchitectures[arch]; exists { | ||
| return archObj.NameOCI | ||
| } | ||
| return "" | ||
| } |
There was a problem hiding this comment.
Returning the string "arch_not_specified" for an OCI architecture name is problematic because it can be passed directly to external tools like skopeo, which will fail to recognize it. It would be safer to return an empty string or handle the NotSpecified case explicitly in callers.
func GetArchNameOCI(arch int) string {
if arch == NotSpecified {
return ""
}
if archObj, exists := supportedArchitectures[arch]; exists {
return archObj.NameOCI
}
return ""
}| func RunContextWithExitCode2(ctx context.Context, | ||
| name string, | ||
| stdin io.Reader, | ||
| stdout, stderr io.Writer, | ||
| arg ...string) (int, error) { | ||
|
|
||
| logLevel := logrus.GetLevel() | ||
| if stderr == nil && logLevel >= logrus.DebugLevel { | ||
| stderr = os.Stderr | ||
| } | ||
|
|
||
| cmd := exec.CommandContext(ctx, name, arg...) | ||
| cmd.Stdin = stdin | ||
| cmd.Stdout = stdout | ||
| cmd.Stderr = stderr | ||
|
|
||
| if err := cmd.Run(); err != nil { | ||
| exitCode := 1 | ||
|
|
||
| if ctxErr := ctx.Err(); ctxErr != nil { | ||
| return 1, ctxErr | ||
| } | ||
|
|
||
| var exitErr *exec.ExitError | ||
| if errors.As(err, &exitErr) { | ||
| exitCode = exitErr.ExitCode() | ||
| return exitCode, err | ||
| } | ||
|
|
||
| return exitCode, err | ||
| } | ||
|
|
||
| return 0, nil | ||
| } |
There was a problem hiding this comment.
The naming RunContextWithExitCode2 is non-descriptive and the implementation largely duplicates RunContextWithExitCode. The primary difference is returning the error on non-zero exit codes and omitting the exec.ErrNotFound special handling. It would be better to refactor the original function to support these needs (e.g., via a configuration struct or a more descriptive name like RunContextWithExitError) to maintain code quality and avoid duplication.
| var imageSizeFloat float64 | ||
|
|
||
| if image.LayersData == nil { | ||
| return -1, errors.New("'skopeo inspect' did not have LayersData") |
There was a problem hiding this comment.
In Go, it is more idiomatic to return a zero value (0) along with an error rather than -1 for numeric results, unless -1 has a specific sentinel meaning that the caller is expected to handle.
| return -1, errors.New("'skopeo inspect' did not have LayersData") | |
| return 0, errors.New("'skopeo inspect' did not have LayersData") |
|
Build succeeded. ✔️ unit-test SUCCESS in 2m 11s |
This is PR 5/10 in a series adding cross-architecture container support using QEMU and binfmt_misc.
Depends on: #1783 (PR 4)
Please review #1783 first. The new commits in this PR are:
Summary
The existing skopeo package only supports inspecting images for the host's native architecture. Cross-architecture container creation requires inspecting and pulling specific architecture variants from multi-arch OCI images. Also, it is necessary to adjust the pulled image name, so it doesn't overwrite the name of the same image based on a different architecture, which is already downloaded (described in greater detail in [1]).
Add:
Architecture metadata to the
Imagestruct withVerifyArchitectureMatch()for validating single-architecture images before pulling (needed becauseskopeo inspect --override-archsilently succeeds even when the flag doesn't match a single-arch image's actual architecture)GetSize()andGetSizeHuman()methods onImage, moving the image size computation out of the cmd layer into skopeo where the layer data livesArchitecture-aware
Inspect()with--override-archand--authfilesupport, usingRunContextWithExitCode2()to detect a missing skopeo binary (skopeo is a soft dependency, not required for native containers)CopyOverrideArch()for pulling a specific architecture variant viaskopeo copyinto Podman's local storage — used instead ofpodman pullbecause Podman does not support pulling foreign-architecture images into a locally addressable name (see [2])[1] containers/podman#27780
[2] containers/podman#27780 (comment))