ci: move rust tests out of vms onto dedicated server #1384

JakeHillion · 2025-02-19T17:41:17Z

Move Rust tests out of VMs as they don't need to run with a specific kernel, and should run on any dev's machine. Nix'ify the build process and move them to the large machine.

Specific changes:

Move rust-tests to the big self-hosted machine when in the sched-ext org.
Nix'ify the dependencies. This uses a fixed version of cargo, rustc and clippy rather than the moving target of rust-toolchain.toml, but it is still stable.
Remove the dependency on build-kernel so the tests start sooner.
Add a new package.metadata.scx section to describe whether to run clippy, moving this out of the GitHub actions spec and into each package's spec.
Run for the entire workspace instead of a set of named packages. This means we can require rust-tests in the CI instead of requiring each individual package's matrix entry, reducing errors. It also significantly reduces duplicated work.

This runs much faster in repo and avoids repeated work. It takes longer out of repo on the smaller GitHub runners, but still takes far less machine time which seems like a fair compromise.

If we do require a specific kernel for cargo tests in the future, we should either implement this in the test or add a Cargo package.metadata field to opt into that behaviour where necessary, given how much slower tests in a VM run.

Test plan:

This PR's CI.
Ran in my fork with the usual GitHub Actions runners. It works.

Move Rust tests out of VMs as they don't need to run with a specific kernel, and should run on any dev's machine. Nix'ify the build process and move them to the large machine. Specific changes: - Move `rust-tests` to the big self-hosted machine when in the `sched-ext` org. - Nix'ify the dependencies. This uses a fixed version of `cargo`, `rustc` and `clippy` rather than the moving target of `rust-toolchain.toml`, but it is still `stable`. - Remove the dependency on `build-kernel` so the tests start sooner. - Add a new `package.metadata.scx` section to describe whether to run clippy, moving this out of the GitHub actions spec and into each package's spec. - Run for the entire workspace instead of a set of named packages. This means we can require `rust-tests` in the CI instead of requiring each individual package's matrix entry, reducing errors. It also significantly reduces duplicated work. This runs much faster in repo and avoids repeated work. It takes longer out of repo on the smaller GitHub runners, but still takes far less machine time which seems like a fair compromise. If we do require a specific kernel for cargo tests in the future, we should either implement this in the test or add a Cargo `package.metadata` field to opt into that behaviour where necessary, given how much slower tests in a VM run. Test plan: - This PR's CI. - Ran in my fork with the usual GitHub Actions runners. It works.

likewhatevs · 2025-02-19T18:34:13Z

This move is in the wrong direction IMO.

It presumes the following:

Perfect backwards compatibility (tests will never need a VM to be ran). This sets us up for a self-fulling prophecy (tests w/o such compatibility will never be wrote because it makes the bar for writing them dealing w/ nix and re-adding a virtualized test env). This is, IMO, bad.

It removes the following:

Actual reproducibility (like, the kind of reproducibility that actually matters, which nix does not actually provide) -- the ability for someone to copy-paste a couple lines of shell script and run exactly what is ran on CI for the purposes of getting kernel regressions fixed.

These aren't problems for some folks (who know how to like, work with this stuff from scratch, kinda), but I guess my point is that all this complex nix magic is setting the starting point for folks who want to contribute/get things done to "here's some code figure it out" from "yeah just run what CI runs/it's all standard/basic shell commands on ubuntu etc.".

htejun · 2025-02-19T19:56:00Z

We had kernel API breakages in the past will have them in the future too. They shouldn't be frequent but they will be there. Also, being able to target specific kernel versions with CI is great - we also want to use CI to verify kernel side after all. We can't really do perf testing in VMs but keeping correctness testing in VMs makes sense to me.

JakeHillion · 2025-02-20T13:26:33Z

I disagree here but I think the easiest way to show this works is to stack some PRs on top of it demonstrating it. Will follow up with a more concrete stack showing how this can work without compromises (leaving this as a draft for now).

likewhatevs requested a review from htejun February 19, 2025 18:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: move rust tests out of vms onto dedicated server #1384

ci: move rust tests out of vms onto dedicated server #1384

JakeHillion commented Feb 19, 2025

likewhatevs commented Feb 19, 2025

htejun commented Feb 19, 2025

JakeHillion commented Feb 20, 2025 •

edited

Loading

ci: move rust tests out of vms onto dedicated server #1384

Are you sure you want to change the base?

ci: move rust tests out of vms onto dedicated server #1384

Conversation

JakeHillion commented Feb 19, 2025

likewhatevs commented Feb 19, 2025

htejun commented Feb 19, 2025

JakeHillion commented Feb 20, 2025 • edited Loading

JakeHillion commented Feb 20, 2025 •

edited

Loading