Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Refactor init into test-distro #1160

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dave-tucker
Copy link
Member

@dave-tucker dave-tucker commented Feb 2, 2025

The init module contains a small init system for running our integration tests against a kernel. While we don't need a full-blown linux distro, we do need some utilities.

Once such utility is modprobe which allows us to load kernel modules. Rather than create a new module for this utility, I've instead refactored init into test-distro which is a module that contains multiple binaries.

The xtask code has been adjusted to ensure these binaries are inserted into the correct places in our cpio archive, as well as bringing in the kernel modules.


This change is Reviewable

Copy link

netlify bot commented Feb 2, 2025

Deploy Preview for aya-rs-docs ready!

Name Link
🔨 Latest commit b95318a
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/67b8c139bf4f6d00073bccef
😎 Deploy Preview https://deploy-preview-1160--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the test A PR that improves test cases or CI label Feb 2, 2025
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat!

Reviewed 7 of 9 files at r1, all commit messages.
Reviewable status: 7 of 9 files reviewed, 12 unresolved discussions (waiting on @dave-tucker)


test-distro/src/modprobe.rs line 64 at r1 (raw file):

fn main() {
    let args = Args::parse();

nit: let Args { quiet, name } = Parser::parse();


test-distro/src/modprobe.rs line 68 at r1 (raw file):

    let module = &args.name;

    // Modules are in `/lib/modules/kernel` or `/lib/modules/$(uname -r)/kernel`

wdym "or"?


test-distro/src/modprobe.rs line 99 at r1 (raw file):

        if m.is_some() {
            module_path = Some(m.unwrap());
        }

if let? plz no unwrap

Code quote:

        if m.is_some() {
            module_path = Some(m.unwrap());
        }

test-distro/src/modprobe.rs line 106 at r1 (raw file):

    }

    let module_path = module_path.unwrap();

ditto


test-distro/src/modprobe.rs line 109 at r1 (raw file):

    output!(args.quiet, "Loading module: {}", module_path.display());
    let mut f = std::fs::File::open(module_path.clone()).expect("Failed to open module file");

surely this can be &module_path


test/integration-test/src/tests/modprobe.rs line 7 at r1 (raw file):

use crate::utils::NetNsGuard;

// FIXME: Delete this test before merging.

Why delete it? Having a test might be useful.


test-distro/src/init.rs line 131 at r1 (raw file):

            let status = std::process::Command::new(&path)
                .args(&args)
                .env("PATH", "/sbin:/usr/sbin:/bin:/usr/bin")

🤔 do we have a shell?


.github/workflows/ci.yml line 307 at r1 (raw file):

          set -euxo pipefail
          find test/.tmp -name '*.deb' -print0 | xargs -t -0 -I {} \
            sh -c "dpkg --fsys-tarfile {} | tar -C test/.tmp --wildcards --extract '*vmlinuz*' '**/modules/*' --file -"

can you write some words about what's in modules and why we need it?


xtask/src/run.rs line 302 at r1 (raw file):

                // Our mininal test linux distro contains:
                // - 'init' which runs all binaries in /bin

nit: end list items with a period


xtask/src/run.rs line 319 at r1 (raw file):

                    .find(|(name, _)| name == "modprobe")
                    .map(|(_, path)| path)
                    .ok_or_else(|| anyhow!("modprobe not found"))?;

can we make this a bit stronger? we expect exactly these two binaries.

Code quote:

                let init = distro_binaries
                    .iter()
                    .find(|(name, _)| name == "init")
                    .map(|(_, path)| path)
                    .ok_or_else(|| anyhow!("init not found"))?;

                let modprobe = distro_binaries
                    .iter()
                    .find(|(name, _)| name == "modprobe")
                    .map(|(_, path)| path)
                    .ok_or_else(|| anyhow!("modprobe not found"))?;

xtask/src/run.rs line 364 at r1 (raw file):

                // file /bin/bar path-to-bar  755 0 0

                let mut input = vec![];

why do all this buffering?


xtask/src/run.rs line 556 at r1 (raw file):

}

fn cpioify_dir(dir: &Path, base: &str, input: &mut Vec<u8>) -> Result<()> {

i think this name could be a bit more descriptive. or it can be a closure which would make it a bit easier to read because all the cpio syntax would be in one place

@dave-tucker
Copy link
Member Author

Thanks for the review @tamird. I haven't forgotten to address your comments - I'll do that later this week - but I
pushed another revision since this is working well enough for me locally.

I solved the issues I was having re: module autoloading via the usermode-helper.
This was due to /dev not being correctly set up.
Another issue that surfaced was that my naive attempt at handling module aliases was incorrect.

To solve this we needed yet another binary .depmod is used to create a modules.alias file via decompressing the kmods and reading data from the .modinfo ELF section. This is slow - it adds about 10 seconds to the VM boot as it's run as part of init.

The code is, at best, alpha quality and I know for certain that I can improve it before I'll mark this as ready for review.
There is also the option to run this outside of the VM, since it can be re-used between test runs.

@dave-tucker dave-tucker force-pushed the modprobe branch 2 times, most recently from b313822 to 41d191b Compare February 5, 2025 14:21
@dave-tucker dave-tucker marked this pull request as ready for review February 5, 2025 14:21
Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 13 files reviewed, 10 unresolved discussions (waiting on @tamird)


.github/workflows/ci.yml line 307 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can you write some words about what's in modules and why we need it?

done


test-distro/src/modprobe.rs line 64 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: let Args { quiet, name } = Parser::parse();

done


test-distro/src/modprobe.rs line 68 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

wdym "or"?

clarified this in the comment of the resolve_modules_dir fn


test-distro/src/modprobe.rs line 99 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

if let? plz no unwrap

done


test-distro/src/modprobe.rs line 106 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto

done


test-distro/src/modprobe.rs line 109 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

surely this can be &module_path

done


xtask/src/run.rs line 302 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: end list items with a period

done


xtask/src/run.rs line 319 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we make this a bit stronger? we expect exactly these two binaries.

since we now install init to /init and everything else into /sbin/{file} this is now outdated


xtask/src/run.rs line 364 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why do all this buffering?

I've replaced the vec![] with String::new()- so one type of buffer for another.
The other difference this time is that we only call write_all once (with the whole buffer) which should perform better than calling it multiple times (as it was before my PR) or using writeln as I was in r1.

I don't see a way of doing constructing the string that's getting written to the stdin of gen_init_cpio without using a buffer, but I'm open to suggestions.


xtask/src/run.rs line 556 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i think this name could be a bit more descriptive. or it can be a closure which would make it a bit easier to read because all the cpio syntax would be in one place

I've made this parse_dir - which arguably isn't much more descriptive - and moved it inline.
It can't be a closure because it's recursive.

It may be simpler to replace with WalkDir::new() (which recursively walks dirs where read_dir() does not) but given how fragile this is I decided to leave it as-is for now.


test/integration-test/src/tests/modprobe.rs line 7 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Why delete it? Having a test might be useful.

I've moved the test to smoke.rs and added a description that explains what is being tested.


test-distro/src/init.rs line 131 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

🤔 do we have a shell?

done

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 9 files at r2, 6 of 9 files at r3, all commit messages.
Reviewable status: 7 of 13 files reviewed, 14 unresolved discussions (waiting on @dave-tucker)


.github/workflows/ci.yml line 307 at r1 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

done

any reason this isn't **/boot/*?


xtask/src/run.rs line 364 at r1 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I've replaced the vec![] with String::new()- so one type of buffer for another.
The other difference this time is that we only call write_all once (with the whole buffer) which should perform better than calling it multiple times (as it was before my PR) or using writeln as I was in r1.

I don't see a way of doing constructing the string that's getting written to the stdin of gen_init_cpio without using a buffer, but I'm open to suggestions.

but we're writing to a stdin pipe, which is already buffered. I don't think the buffer here is worth it


test-distro/src/lib.rs line 5 at r3 (raw file):

use nix::sys::utsname::uname;

// Kernel modules are in `/lib/modules`.

doc comment for pub function


test-distro/src/lib.rs line 9 at r3 (raw file):

// or in subdirectory named after the kernel release.
pub fn resolve_modules_dir() -> anyhow::Result<PathBuf> {
    let mut modules_dir = PathBuf::from("/lib/modules");

avoid mut? doesn't seem hard


test-distro/src/lib.rs line 12 at r3 (raw file):

    if !modules_dir.join("kernel").exists() {
        let utsname = uname().expect("Failed to get uname");

return an error?


test-distro/src/lib.rs line 14 at r3 (raw file):

        let utsname = uname().expect("Failed to get uname");
        let release = utsname.release();
        if modules_dir.join(release).exists() {

anyhow::ensure! - and avoid calling join twice


.github/workflows/ci.yml line 309 at r3 (raw file):

          # The wildcard '**/modules/*' extracts kernel modules.
          # Modules are required since not all parts of the kernel we want to
          # test are built-in. See: `grep "=m" boot/config-6.10.9-cloud-amd64`.

I don't understand this fragment.

Code quote:

See: `grep "=m" boot/config-6.10.9-cloud-amd64`

xtask/src/run.rs line 309 at r3 (raw file):

                    .file_name()
                    .and_then(|name| name.to_str())
                    .ok_or_else(|| anyhow!("kernel_image file_name failed"))?;

emit the value of kernel_image


xtask/src/run.rs line 313 at r3 (raw file):

                let modules_dir = kernel_image
                    .parent()
                    .expect("unable to get parent directory from kernel_image")

nit: splitting kernel_image into file_name and everything can be done just once:

        let mut comps = kernel_image.components();
        let file_name = comps.next_back();
        let parent = comps.as_path();

that will allow you to drop this expect. though I wonder if we're doing too much magic here. can we just pass the modules as a separate flag?


test-distro/src/init.rs line 68 at r3 (raw file):

                | nix::mount::MsFlags::MS_NOEXEC
                | nix::mount::MsFlags::MS_RELATIME,
            data: Some("size=10m,nr_inodes=248418,mode=755"),

can you explain some of these choices?

@dave-tucker dave-tucker force-pushed the modprobe branch 2 times, most recently from ae46a58 to c389190 Compare February 5, 2025 18:40
Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 14 files reviewed, 14 unresolved discussions (waiting on @tamird)


.github/workflows/ci.yml line 307 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

any reason this isn't **/boot/*?

Fixed


.github/workflows/ci.yml line 309 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I don't understand this fragment.

Removed


test-distro/src/lib.rs line 5 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

doc comment for pub function

Done


test-distro/src/lib.rs line 9 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

avoid mut? doesn't seem hard

Also done


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

return an error?

Done


test-distro/src/lib.rs line 14 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

anyhow::ensure! - and avoid calling join twice

Done


xtask/src/run.rs line 364 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

but we're writing to a stdin pipe, which is already buffered. I don't think the buffer here is worth it

good point. fixed.


xtask/src/run.rs line 309 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

emit the value of kernel_image

done


xtask/src/run.rs line 313 at r3 (raw file):

though I wonder if we're doing too much magic here. can we just pass the modules as a separate flag?

I tend to agree. We're encoding the assumption that the modules directory is in /path/to/vmlinuz../lib/modules or path/to/vmlinux/../usr/lib/modules.

We could shift this to say, you have to provide --modules-dir as part of the args to cargo xtask integration test. That makes the process of running the tests a little more arduous - which is why I went in the opposite direction.

Moreover, we basically have to re-implement this assumption in bash to run the integration tests in CI... but perhaps that's for the best. wdyt?


test-distro/src/init.rs line 68 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can you explain some of these choices?

Added a link to where I borrowed this from

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 5 files at r6, 1 of 3 files at r7.
Reviewable status: 9 of 14 files reviewed, 21 unresolved discussions (waiting on @dave-tucker)


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Done

nit: context("uname()") - this phrasing is just hiding the facts


xtask/src/run.rs line 313 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

though I wonder if we're doing too much magic here. can we just pass the modules as a separate flag?

I tend to agree. We're encoding the assumption that the modules directory is in /path/to/vmlinuz../lib/modules or path/to/vmlinux/../usr/lib/modules.

We could shift this to say, you have to provide --modules-dir as part of the args to cargo xtask integration test. That makes the process of running the tests a little more arduous - which is why I went in the opposite direction.

Moreover, we basically have to re-implement this assumption in bash to run the integration tests in CI... but perhaps that's for the best. wdyt?

Yeah, we should add a flag. Implementing that assumption in bash seems preferable since it's closer to the truth. You can imagine us wanting to run non-debian kernels in CI at some point; such kernels may be packaged differently and require different logic. The CI scripts are the right place for that.


test-distro/src/init.rs line 68 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Added a link to where I borrowed this from

Gross, this dates all the way back to linuxkit/linuxkit@d29be01 where it was added without detailed explanation.

Would you mind using a permalink please?


.github/workflows/ci.yml line 249 at r7 (raw file):

          echo $(brew --prefix gnu-tar)/libexec/gnubin >> $GITHUB_PATH
          echo $(brew --prefix llvm)/bin >> $GITHUB_PATH
          brew install filosottile/musl-cross/musl-cross

needs a comment - and the commit message should explain why it is newly necessary


xtask/src/run.rs line 44 at r6 (raw file):

        ///
        /// wget --accept-regex '.*/linux-image-[0-9\.-]+-cloud-.*-unsigned*' \
        ///   --recursive http://ftp.us.debian.org/debian/pool/main/l/linux/

does this command work for you after this change?


xtask/src/run.rs line 363 at r6 (raw file):

                // file /bin/bar path-to-bar  755 0 0

                writeln!(stdin, "dir /bin 755 0 0").expect("write");

time to introduce a tiny macro instead of repeating .expect everywhere?


xtask/src/run.rs line 368 at r6 (raw file):

                writeln!(stdin, "dir /lib/modules 755 0 0").expect("write");

                build(Some(&target), |cmd| {

put this back where it was? we want to do this fallible stuff sooner than later


xtask/src/run.rs line 375 at r6 (raw file):

                .for_each(|(name, path)| {
                    if name == "init" {
                        writeln!(stdin, "file /init {} 755 0 0", &path.to_string_lossy())

we can't be doing lossy conversions - like, ever


test-distro/src/lib.rs line 11 at r6 (raw file):

pub fn resolve_modules_dir() -> anyhow::Result<PathBuf> {
    let modules_dir = PathBuf::from("/lib/modules");
    if modules_dir.exists() && modules_dir.is_dir() {

both exists and is_dir call fs::metadata. can we do it in one call?


test-distro/src/lib.rs line 11 at r6 (raw file):

pub fn resolve_modules_dir() -> anyhow::Result<PathBuf> {
    let modules_dir = PathBuf::from("/lib/modules");
    if modules_dir.exists() && modules_dir.is_dir() {

didn't you want to use the kernel subdirectory?


test-distro/src/lib.rs line 19 at r6 (raw file):

    let modules_dir = modules_dir.join(release);
    anyhow::ensure!(
        modules_dir.join(release).exists(),

don't you want to ensure it's a directory?


test-distro/src/lib.rs line 20 at r6 (raw file):

    anyhow::ensure!(
        modules_dir.join(release).exists(),
        "No kernel modules found for release: {}",

why not say "{} doesn't exist", modules_dir.display()?


test-distro/src/depmod.rs line 27 at r6 (raw file):

fn main() -> anyhow::Result<()> {
    let Args { base_dir } = Args::parse();

nit: Parser::parse avoids naming Args again


test-distro/src/depmod.rs line 40 at r6 (raw file):

        .truncate(true)
        .open(modules_dir.join("modules.alias"))
        .context("Failed to open modules.alias file")?;

we should spit out the whole path when we fail like this


test-distro/src/depmod.rs line 54 at r6 (raw file):

                    .file_stem()
                    .expect("a file with no file stem?")
                    .to_string_lossy()

no lossy conversions!


test-distro/src/depmod.rs line 56 at r6 (raw file):

                    .to_string_lossy()
                    .replace(".ko", "");
                let mut f = File::open(path).context("Failed to open module file")?;

needs to say the path


test-distro/src/depmod.rs line 64 at r6 (raw file):

                    read_aliases_from_module(&decompressed, &module_name, &mut aliases)
                } else {
                    let mut buf = Vec::with_capacity(stat.len() as usize);

it doesn't seem necessary to do this either - i believe the object crate supports streaming parsing

https://docs.rs/object/latest/object/read/enum.File.html#method.parse


test-distro/src/depmod.rs line 75 at r6 (raw file):

        }
    }
    let mut f = BufWriter::new(&output);

it's a buffered writer anyway, why buffer in a string?

@dave-tucker dave-tucker force-pushed the modprobe branch 4 times, most recently from 8cf0394 to 47c5f3f Compare February 6, 2025 09:48
Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 16 files reviewed, 19 unresolved discussions (waiting on @tamird)


.github/workflows/ci.yml line 249 at r7 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

needs a comment - and the commit message should explain why it is newly necessary

ack. please bear with me while i get the tests passing in CI.
once that's the case I'll make sure that code comments and commit messages explain any new requirements


test-distro/src/depmod.rs line 27 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: Parser::parse avoids naming Args again

done


test-distro/src/depmod.rs line 40 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

we should spit out the whole path when we fail like this

done


test-distro/src/depmod.rs line 54 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

no lossy conversions!

done


test-distro/src/depmod.rs line 56 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

needs to say the path

done


test-distro/src/depmod.rs line 64 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

it doesn't seem necessary to do this either - i believe the object crate supports streaming parsing

https://docs.rs/object/latest/object/read/enum.File.html#method.parse

I'm not sure it does. ReadRef is only implemented for &[u8]. The File in the signature of parse is not std::File, but object::File which is a very different type.


test-distro/src/depmod.rs line 75 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

it's a buffered writer anyway, why buffer in a string?

done


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: context("uname()") - this phrasing is just hiding the facts

done


test-distro/src/lib.rs line 11 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

didn't you want to use the kernel subdirectory?

I did, but that was before I needed also /lib/modules/$maybe_uname_r/modules.alias too.
*/kernel/* is part of the glob pattern used for finding the modules inside this directory.


test-distro/src/lib.rs line 11 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

both exists and is_dir call fs::metadata. can we do it in one call?

done


test-distro/src/lib.rs line 19 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

don't you want to ensure it's a directory?

done


test-distro/src/lib.rs line 20 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why not say "{} doesn't exist", modules_dir.display()?

done


xtask/src/run.rs line 313 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Yeah, we should add a flag. Implementing that assumption in bash seems preferable since it's closer to the truth. You can imagine us wanting to run non-debian kernels in CI at some point; such kernels may be packaged differently and require different logic. The CI scripts are the right place for that.

done


xtask/src/run.rs line 44 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

does this command work for you after this change?

Yes. I ran it in an ubuntu container to verify.
The previous URL (ftp://...) just hung indefinitely.


xtask/src/run.rs line 363 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

time to introduce a tiny macro instead of repeating .expect everywhere?

done


xtask/src/run.rs line 368 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

put this back where it was? we want to do this fallible stuff sooner than later

done


xtask/src/run.rs line 375 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

we can't be doing lossy conversions - like, ever

done


test-distro/src/init.rs line 68 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Gross, this dates all the way back to linuxkit/linuxkit@d29be01 where it was added without detailed explanation.

Would you mind using a permalink please?

Would you believe it still works if you use None 😆
Removed the comment and updated the code.

@dave-tucker dave-tucker force-pushed the modprobe branch 4 times, most recently from c7a9477 to cee7747 Compare February 6, 2025 11:24
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 11 files at r9, all commit messages.
Reviewable status: 13 of 17 files reviewed, 31 unresolved discussions (waiting on @dave-tucker)


.github/workflows/ci.yml line 249 at r7 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

ack. please bear with me while i get the tests passing in CI.
once that's the case I'll make sure that code comments and commit messages explain any new requirements

I think I still don't understand. We just need a compiler, right? rust-lld is already a cross linker, and rust already distributes musl with the relevant target. I believe clang is natively a cross compiler, so what's missing?


test-distro/src/depmod.rs line 54 at r6 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

done

not done


test-distro/src/depmod.rs line 56 at r6 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

done

please always .display rather than using debug


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

done

not done


xtask/src/run.rs line 44 at r6 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Yes. I ran it in an ubuntu container to verify.
The previous URL (ftp://...) just hung indefinitely.

Huh, didn't work for me yesterday but does now. OK.


test-distro/src/init.rs line 68 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Would you believe it still works if you use None 😆
Removed the comment and updated the code.

We still want the comment for these flags, yeah? Unless empty works there too?


xtask/src/run.rs line 55 at r9 (raw file):

        /// `**/modules/*` is used to extract the kernel modules.
        /// Modules are required since not all parts of the kernel we want to
        /// test are built-in. See `grep "=m" boot/config-6.10.9-cloud-amd64`.

I still don't understand this grep


xtask/src/run.rs line 57 at r9 (raw file):

        /// test are built-in. See `grep "=m" boot/config-6.10.9-cloud-amd64`.
        #[clap(short, long, required = true)]
        image: Vec<PathBuf>,

how does this work when you pass multiple images? what do you pass for modules?


xtask/src/run.rs line 218 at r9 (raw file):

            anyhow::ensure!(
                image.len() == modules.len(),

this answers my question from above - can we take images and modules in pairs?


xtask/src/run.rs line 283 at r9 (raw file):

            let mut errors = Vec::new();

nit: spurious newline?


xtask/src/run.rs line 325 at r9 (raw file):

                    cmd.args(["--package", "test-distro", "--profile", "release"])
                })
                .context("building test-distro program failed")?

time to extract "test-distro" and .with_context(|| format!(...)) it in? i think "program" no longer makes sense. probably package now.


xtask/src/run.rs line 373 at r9 (raw file):

                        writeln_stdin!(
                            "file /init {} 755 0 0",
                            &path.to_str().expect("path to str")

hoist out of the if


xtask/src/run.rs line 373 at r9 (raw file):

                        writeln_stdin!(
                            "file /init {} 755 0 0",
                            &path.to_str().expect("path to str")

all these fallible to_str calls can be avoided if you restore OsStrExt and use to_bytes(). you might need to split the writes into separate calls. maybe a closure for dir and one for file are the way to go so you can keep it terse


xtask/src/run.rs line 389 at r9 (raw file):

                // `/init` is slow. It's faster to prepare it here, and it can
                // also be reused later.
                if !(modules_dir.join("modules.alias").exists()) {

do we need to bother with this check? cache invalidation is hard, does this do it properly?


xtask/src/run.rs line 400 at r9 (raw file):

                            "--",
                            "-b",
                            modules_dir.to_str().expect("path to str"),

you don't need this. just do it in a separate .arg call, then you don't need to_str


xtask/src/run.rs line 498 at r9 (raw file):

                // Heed the advice and boot with noapic. We don't know why this happens.
                kernel_args.push(" noapic");
                kernel_args.push(" init=/sbin/init");

how come this was newly needed?


test-distro/src/depmod.rs line 108 at r9 (raw file):

                let end = start + s.size() as usize;
                let sym_data = &data[start..end];
                if let Ok(cstr) = std::ffi::CStr::from_bytes_with_nul(sym_data) {

can we please avoid if let Ok? that's throwing away an error in a way i do not undertand.


test-distro/src/depmod.rs line 109 at r9 (raw file):

                let sym_data = &data[start..end];
                if let Ok(cstr) = std::ffi::CStr::from_bytes_with_nul(sym_data) {
                    let sym_str = cstr.to_string_lossy();

lossy


test-distro/src/lib.rs line 13 at r9 (raw file):

    let stat = modules_dir
        .metadata()
        .with_context(|| format!("{modules_dir:?} doesn't exist"))?;

use .display() rather than the debug impl please


.github/workflows/ci.yml line 315 at r9 (raw file):

        run: |
          set -euxo pipefail
          # Remove old images and modules.

can it be a separate step?

can we put the extracted artifacts outside the cache directory?


test-distro/src/modprobe.rs line 21 at r9 (raw file):

        if !$quiet {
            eprintln!($($arg)*);
            std::process::exit(1);

i'm not a fan of this. can we just pass errors around? you could have a run method at the top that returns an error, and main can swallow the error if need be.


test-distro/src/modprobe.rs line 51 at r9 (raw file):

    let Args { quiet, name } = Parser::parse();

    let mut module = name.to_string();

isn't it already a string?


test-distro/src/modprobe.rs line 63 at r9 (raw file):

    module = resolve_alias(quiet, &modules_dir, &module);

    let pattern = format!("{}/kernel/**/{}.ko*", modules_dir.to_string_lossy(), module);

lossy


test-distro/src/modprobe.rs line 124 at r9 (raw file):

        Ok(alias_file) => alias_file,
        Err(e) => {
            exit_with_error!(quiet, "Failed to open modules.alias file: {}", e);

emit the path


test-distro/src/modprobe.rs line 130 at r9 (raw file):

    for line in alias_file.lines() {
        let line = line.expect("read line");

can we avoid panics?


test-distro/src/modprobe.rs line 133 at r9 (raw file):

        if line.starts_with("alias ") {
            let mut parts = line.split_whitespace();
            let _ = parts.next(); // skip "alias "

check that it says "alias"?


.github/scripts/find_kernels.sh line 1 at r9 (raw file):

#!/usr/bin/env bash

would you consider (having an AI undertake) rewriting this in python? bash is so fickle.

I'm happy to do it if you're not comfortable with python (or with type annotations in python)

@dave-tucker dave-tucker force-pushed the modprobe branch 4 times, most recently from f590222 to 6cd67c0 Compare February 21, 2025 16:08
Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 18 files reviewed, 30 unresolved discussions (waiting on @tamird and @vadorovsky)


.github/workflows/ci.yml line 249 at r7 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think I still don't understand. We just need a compiler, right? rust-lld is already a cross linker, and rust already distributes musl with the relevant target. I believe clang is natively a cross compiler, so what's missing?

I'll not rehash the whole of #908 here, but it's worth checking @vadorovsky's comments in that PR as to why this change is needed.

TL;DR: this PR introduces a dependency on liblzma-sys, which uses a build script to compile libzma (a C library). What I observed was that when using linker=rust-lld build scripts don't pick up the correct compiler (since I assume variables like CC aren't set properly). Using a C compiler driver as linker solves this problem - as well as the problem reported about not being able to find system-wide libraries.

I'm still debugging this on macOS, but I'm fairly certain that using rust-lld doesn't work there either for the reason stated above.


.github/workflows/ci.yml line 315 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can it be a separate step?

can we put the extracted artifacts outside the cache directory?

what do you want to achieve by splitting it?

should the artifacts not be cached between runs?


test-distro/src/depmod.rs line 54 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

not done

sorry. it should now be done. the conversion isn't lossy


test-distro/src/depmod.rs line 56 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

please always .display rather than using debug

Done


test-distro/src/depmod.rs line 108 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we please avoid if let Ok? that's throwing away an error in a way i do not undertand.

Done


test-distro/src/depmod.rs line 109 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

lossy

Done


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

not done

How exactly would you like this resolved? What's missing from uname().context("failed to get kernel release")?;


test-distro/src/lib.rs line 13 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

use .display() rather than the debug impl please

Done


test-distro/src/modprobe.rs line 21 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i'm not a fan of this. can we just pass errors around? you could have a run method at the top that returns an error, and main can swallow the error if need be.

Done


test-distro/src/modprobe.rs line 51 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

isn't it already a string?

Done


test-distro/src/modprobe.rs line 63 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

lossy

Done


test-distro/src/modprobe.rs line 124 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

emit the path

Done


test-distro/src/modprobe.rs line 130 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we avoid panics?

Done


test-distro/src/modprobe.rs line 133 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

check that it says "alias"?

Done


xtask/src/run.rs line 55 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I still don't understand this grep

Done


xtask/src/run.rs line 218 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this answers my question from above - can we take images and modules in pairs?

Its possible, yes. But the bigger question is how exactly would you like this to work?

  • Take a Vec<> of positional arguments and split them in to pairs? That's not going to be backwards compatible with now the tests were run previously. Maybe that's not a concern?
  • Take a Vec<> of positional arguments as before, but have them joined via separator? e.g "/path/to/vmlinuz-foo:/path/to/modules/foo"
  • Something else?

xtask/src/run.rs line 283 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: spurious newline?

Done


xtask/src/run.rs line 325 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

time to extract "test-distro" and .with_context(|| format!(...)) it in? i think "program" no longer makes sense. probably package now.

I'm not sure I follow this comment. Error message changed to "package".
But we're only building the "test-distro" package in this step so injecting it with with_context doesn't seem to gain us anything.


xtask/src/run.rs line 373 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

all these fallible to_str calls can be avoided if you restore OsStrExt and use to_bytes(). you might need to split the writes into separate calls. maybe a closure for dir and one for file are the way to go so you can keep it terse

Done. Just added a couple more macros to handle the write file and write dir cases.


xtask/src/run.rs line 373 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

hoist out of the if

Done


xtask/src/run.rs line 389 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

do we need to bother with this check? cache invalidation is hard, does this do it properly?

Done


xtask/src/run.rs line 400 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

you don't need this. just do it in a separate .arg call, then you don't need to_str

Done


xtask/src/run.rs line 498 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

how come this was newly needed?

Oops. Reverted.


.github/scripts/find_kernels.sh line 1 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

would you consider (having an AI undertake) rewriting this in python? bash is so fickle.

I'm happy to do it if you're not comfortable with python (or with type annotations in python)

Done


test-distro/src/init.rs line 68 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

We still want the comment for these flags, yeah? Unless empty works there too?

Done. Empty worked there too

@dave-tucker dave-tucker force-pushed the modprobe branch 2 times, most recently from 7c14d58 to 2de85e1 Compare February 21, 2025 17:18
@tamird tamird requested a review from vadorovsky February 21, 2025 17:21
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 10 files at r10, 1 of 1 files at r11, all commit messages.
Reviewable status: 16 of 18 files reviewed, 28 unresolved discussions (waiting on @dave-tucker and @vadorovsky)


.github/workflows/ci.yml line 315 at r9 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

what do you want to achieve by splitting it?

should the artifacts not be cached between runs?

The name should be indicative of what the step is doing.


test-distro/src/depmod.rs line 64 at r6 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I'm not sure it does. ReadRef is only implemented for &[u8]. The File in the signature of parse is not std::File, but object::File which is a very different type.

Ah you're right.


test-distro/src/lib.rs line 12 at r3 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

How exactly would you like this resolved? What's missing from uname().context("failed to get kernel release")?;

It's just obscuring what happened which is that uname failed; as in the grandparent comment: context("uname()") would be the right thing in my opinion


xtask/src/run.rs line 218 at r9 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Its possible, yes. But the bigger question is how exactly would you like this to work?

  • Take a Vec<> of positional arguments and split them in to pairs? That's not going to be backwards compatible with now the tests were run previously. Maybe that's not a concern?
  • Take a Vec<> of positional arguments as before, but have them joined via separator? e.g "/path/to/vmlinuz-foo:/path/to/modules/foo"
  • Something else?

I think breaking backwards compatibility on the test porcelain is fine. Separating them with a colon is also a nice idea.


xtask/src/run.rs line 325 at r9 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I'm not sure I follow this comment. Error message changed to "package".
But we're only building the "test-distro" package in this step so injecting it with with_context doesn't seem to gain us anything.

I was trying to avoid more churn on future renames. This is fine too.


test-distro/src/lib.rs line 13 at r10 (raw file):

    let stat = modules_dir
        .metadata()
        .with_context(|| format!("{} doesn't exist", modules_dir.display()))?;

"doesn't exist" isn't necessary the reason metadata failed.


test-distro/src/lib.rs line 23 at r10 (raw file):

    let stat = modules_dir
        .metadata()
        .with_context(|| format!("{} doesn't exist", modules_dir.display()))?;

ditto


.github/scripts/find_kernels.py line 13 at r10 (raw file):

def find_modules_directory(directory: str, kernel: str) -> str:
    matches = glob.glob(f"{directory}/**/modules/{kernel}", recursive=True)
    if len(matches) == 0:

can we assert it's exactly 1?


.github/scripts/find_kernels.py line 15 at r10 (raw file):

    if len(matches) == 0:
        print(f"ERROR! No modules directory found for kernel {kernel}")
        sys.exit(1)

maybe just raise exception instead of printing and existing?


.github/scripts/find_kernels.py line 26 at r10 (raw file):

        module_dir = find_modules_directory('test/.tmp', image_name)
        modules.append(module_dir[0])
        

weird whitespace going on


.github/scripts/find_kernels.py line 27 at r10 (raw file):

        modules.append(module_dir[0])
        
    if len(images) != len(modules):

this is statically not possible, maybe just print in the loop?


test-distro/src/modprobe.rs line 39 at r10 (raw file):

}

fn main() {

why not return a Result here?


test-distro/src/modprobe.rs line 66 at r10 (raw file):

    let module_path = glob(&pattern)
        .with_context(|| format!("Failed to glob: {}", pattern))?
        .filter_map(Result::ok)

find_map


test-distro/src/modprobe.rs line 97 at r10 (raw file):

    };

    if contents[0..4] != [0x7f, 0x45, 0x4c, 0x46] {

contents.starts_with


test-distro/src/modprobe.rs line 101 at r10 (raw file):

    }

    let res = init_module(&contents, &CString::new("").unwrap());

c""?


xtask/src/run.rs line 6 at r10 (raw file):

    fs::{copy, create_dir_all, OpenOptions},
    io::{BufRead as _, BufReader, Write as _},
    os::unix::prelude::OsStrExt as _,

std::os::unix::ffi::OsStrExt as _ is how it's imported elsewhere

but why move it at all?


xtask/src/run.rs line 54 at r10 (raw file):

        ///
        /// `**/boot/*` is used to extract the kernel image and config.
        /// `**/modules/*` is used to extract the kernel modules.

blank line after this


xtask/src/run.rs line 326 at r10 (raw file):

                })
                .context("building test-distro package failed")?
                .into_iter()

these two lines don't do anything, it's already a vector. i don't think you need the type ascription either


xtask/src/run.rs line 352 at r10 (raw file):

                let mut stdin = stdin.take().unwrap();

                macro_rules! writeln_stdin {

nit: i'd put these macros below the comment that explains the format.

i think the macros here don't get you anything either, they could just be closures. writeln_stdin is never called with more than one argument. it's overly general.

write_stdin should be two separate functions so it's clear one is for directory and the other for files.


xtask/src/run.rs line 448 at r10 (raw file):

                                    modules_dir.display()
                                )
                            })?,

this is identical in both arms. extract? you can defer checking the result.

Code quote:

                            path.strip_prefix(modules_dir).with_context(|| {
                                format!(
                                    "strip prefix {} failed for {}",
                                    path.display(),
                                    modules_dir.display()
                                )
                            })?,

test-distro/src/depmod.rs line 32 at r10 (raw file):

        base_dir
    } else {
        resolve_modules_dir().context("Failed to resolve modules dir")?

inconsistent capitalization: sometimes Failed sometimes failed


test-distro/src/depmod.rs line 57 at r10 (raw file):

                    .into_string()
                    .map_err(|_| anyhow::anyhow!("failed to convert to string"))?
                    .replace(".ko", "");

can you help me understand what this is trying to do? would it be simpler as:

let path = entry.path();

let module_name = path.to_str().ok_or_else(|| anyhow!("{} not valid unicode")?;
let (module_name, compressed) = if let Some(module_name) = module_name.strip_suffix(".xz") {
  (module_name, true)
} else {
  (module_name, false)
};
if let Some(module_name) = module_name.strip_suffix(".ko") {
  ...
  if compressed {
  }
}

Code quote:

                let module_name = path
                    .file_stem()
                    .ok_or(anyhow::anyhow!("failed to get file stem"))?
                    .to_os_string()
                    .into_string()
                    .map_err(|_| anyhow::anyhow!("failed to convert to string"))?
                    .replace(".ko", "");

test-distro/src/depmod.rs line 65 at r10 (raw file):

                if extension == "xz" {
                    let mut decoder = XzDecoder::new(f);
                    let mut decompressed = Vec::with_capacity(stat.len() as usize * 2);

does read_to_end not already preallocate efficiently?


test-distro/src/depmod.rs line 69 at r10 (raw file):

                    read_aliases_from_module(&decompressed, &module_name, &mut output)
                } else {
                    let mut buf = Vec::with_capacity(stat.len() as usize);

this arm could mmap the file and avoid reading at all


test-distro/src/depmod.rs line 89 at r10 (raw file):

    output: &mut BufWriter<&File>,
) -> Result<(), anyhow::Error> {
    let obj = object::read::File::parse(contents).context("not an object file")?;

why over-prescribe the reason? "failed to parse" seems more honest.


test-distro/src/depmod.rs line 94 at r10 (raw file):

        .sections()
        .filter_map(|s| {
            if let Ok(name) = s.name() {

if let ok here


test-distro/src/depmod.rs line 103 at r10 (raw file):

            None
        })
        .next()

find_map?


test-distro/src/depmod.rs line 106 at r10 (raw file):

        .context("no .modinfo section")?;

    obj.symbols()

could this be a loop rather than try_for_each?


test-distro/src/depmod.rs line 116 at r10 (raw file):

                    .context("failed to convert to cstr")?;
                let sym_str = cstr.to_str().context("failed to convert to str")?;
                let alias = sym_str.replace("alias=", "");

does this appear just anywhere, or is it at the front? if the latter then this can be strip_prefix?

The init module contains a small init system for running our integration
tests against a kernel. While we don't need a full-blown linux distro,
we do need some utilities.

Once such utility is `modprobe` which allows us to load kernel modules.
Rather than create a new module for this utility, I've instead
refactored `init` into `test-distro` which is a module that contains
multiple binaries.

The xtask code has been adjusted to ensure these binaries are inserted
into the correct places in our cpio archive, as well as bringing in the
kernel modules.

Signed-off-by: Dave Tucker <[email protected]>
@dave-tucker
Copy link
Member Author

dave-tucker commented Feb 21, 2025

@tamird thanks for the review! and so for bearing with me for quite a few revisions of this PR.
CI should hopefully]be green on that last run and the revision I've just pushed was just to rebase on the latest main.

Today I tested locally macOS (aarch64) and verified that the integration tests ran fine.

I plan to address your remaining comments early next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants