CI: switch to GHA for arm #4844
So
Not sure where to start debugging it. It appears that criu CI on ubuntu-24.04-arm (GHA) has been running just fine since April (checkpoint-restore/criu#2566). Cc: @adrianreber @rst0git
Correction: it was running fine, but no longer is. Opened checkpoint-restore/criu#2704
@rst0git can you help us here? Here's the situation with ubuntu-24.04-arm and criu:
Here you can find all the c/r logs for a test which timed out: test (ubuntu-24.04-arm, 1.24.x), raw logs
Here is a job with criu-dev which doesn't fail: https://github.com/opencontainers/runc/actions/runs/16841823814/job/47714378171?pr=4844
I see there were two patches to
Can you take a look?
373b12e to c6b3d23
LGTM, thanks!
I've updated the OBS and Launchpad packages to 4.1.1 but have not been able to replicate the error yet.
We merged these patches because the 4.1 release fails to compile when building the deb packages. These patches were already included: https://github.com/rst0git/criu-deb-packages/commits/open-build-service/
If I remember correctly, we had to rename
# (need to compile criu) and don't add much value/coverage.
- criu: criu-dev
go-version: 1.23.x
os: ubuntu-24.04
Why was this exclusion added?
Because we want ubuntu-24.04 to be run with criu from the package, but we want ubuntu-24.04-arm to be run with criu-dev (as criu package is not yet working, as described in the commit message).
Yes, I know, it is kind of complicated, I'd rather have criu package fixed.
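A minimal sketch of how such exclusions might look in a GHA matrix, assuming the matrix lists `criu` as an empty string (criu from the package) plus `criu-dev` (built from source). The `go-version` list and the empty-string entry are illustrative assumptions, not copied from the actual workflow:

```yaml
strategy:
  fail-fast: false
  matrix:
    os: [ubuntu-24.04, ubuntu-24.04-arm]
    go-version: [1.23.x, 1.24.x]
    # Assumption: "" means "criu from the package",
    # criu-dev means "criu compiled from source on the CI machine".
    criu: ["", "criu-dev"]
    exclude:
      # x86: run only with the packaged criu.
      - criu: criu-dev
        os: ubuntu-24.04
      # arm: run only with criu-dev while the packaged criu times out there.
      - criu: ""
        os: ubuntu-24.04-arm
```

Each `exclude` entry removes every generated combination that matches all of the keys it lists, so the two entries together leave one criu flavor per OS.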
os: actuated-arm64-6cpu-8gb
- criu: criu-dev
os: actuated-arm64-6cpu-8gb
os: ubuntu-24.04
And this one.
Same reason as above
sudo apt -y install criu
- name: install CRIU (criu ${{ matrix.criu }})
- name: install CRIU (${{ matrix.criu }})
Can we remove this and use 'else' here?
GHA workflow does not have `else` for `if`. Or do you mean something else? If yes, please show the code.
Yes, I think we can use `else` in bash, or else we will see a log like this in CI:
Install CRIU ()
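One way to read this suggestion is a single step that branches in bash, so the step name never renders with empty parentheses. A rough sketch, assuming `matrix.criu` is empty when the packaged criu is wanted; `CRIU_REF` and `./script/build-criu.sh` are hypothetical names used only for illustration and are not part of the actual workflow:

```yaml
- name: install CRIU
  env:
    # Hypothetical variable: passes the matrix value into the script.
    CRIU_REF: ${{ matrix.criu }}
  run: |
    if [ -z "$CRIU_REF" ]; then
      # Empty matrix value: install criu from the package.
      sudo apt -y install criu
    else
      # Non-empty value (e.g. criu-dev): build that ref from source.
      # build-criu.sh is a hypothetical helper, shown only as a placeholder.
      ./script/build-criu.sh "$CRIU_REF"
    fi
```

The trade-off is that the job log no longer shows which criu flavor was installed in the step name, so the script itself would need to print it if that matters.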
Since GHA now provides ARM, we can switch away from actuated.

Many thanks to @alexellis (@self-actuated) for being the sponsor of this project.

Signed-off-by: Kir Kolyshkin <[email protected]>
aaand it times out just like before, using criu_4.1.1-1_arm64.deb
OK, I'm finalizing this PR to use criu built from sources on gha arm, and opening an issue with criu.
Currently, criu package from opensuse build farm times out on GHA arm, so let's only use criu-dev (i.e. compiled from source on CI machine).

Once this is fixed, this patch can be reverted.

Related to criu issue 2709.

Signed-off-by: Kir Kolyshkin <[email protected]>
Yes, I was expecting this. The changes between 4.1 and 4.1.1 are unrelated to this problem. The newer release contains a patch only for checkpoint-restore/criu#2694. I would need to git bisect the criu-dev branch to find the commit fixing this.
The issue is, when I use the same criu version but compile it from source right there in the CI job, it works. So my gut feeling is that this is caused by something in your (i.e. opensuse) build environment -- older compiler, some compiler flags, etc. Here's how we compile criu in CI (in this case, using ubuntu-24.04-arm): runc/.github/workflows/test.yml Lines 125 to 131 in 1398ba7
@kolyshkin I'm still investigating this. It seems to happen with both OBS and Launchpad packages.
Thank you, but we're not blocked here (we can use criu-dev for now). The main reason for this PR is to switch away from actuated CI as we've been using it for free all this time.
For the reference, criu timeout on gha arm is now tracked by checkpoint-restore/criu#2709
@lifubang PTAL
This needs to be backported to the
Since GHA now provides ARM, we can switch away from actuated.
Many thanks to @alexellis (@self-actuated) for supporting this project.
PS Currently, criu installed from opensuse build farm repo doesn't work
on GHA arm. While we investigate it, let's disable this combination.
Tracked in checkpoint-restore/criu#2709.