-
Notifications
You must be signed in to change notification settings - Fork 233
Parallelize memfile and rootfs processing during pause #1646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| &RootfsDiffCreator{ | ||
| rootfs: s.rootfs, | ||
| closeHook: s.Close, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid closing sandbox before memory export completes
This rootfs diff runs in parallel with the memfile diff, but RootfsDiffCreator ultimately calls rootfs.ExportDiff, and the NBD implementation spawns a goroutine that invokes closeSandbox immediately (see packages/orchestrator/internal/sandbox/rootfs/nbd.go where closeSandbox(ctx) is called before diff export completes). Because pauseProcessMemory uses s.process.ExportMemory, calling s.Close early can stop the FC process while memory export is still running, leading to intermittent failed to export memory errors or incomplete memfile diffs when the rootfs path finishes first. This only happens with rootfs providers that close the sandbox during export, but those are in use here, so the parallelization introduces a real race.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will check this
| originalRootfs.Header(), | ||
| &RootfsDiffCreator{ | ||
| rootfs: s.rootfs, | ||
| closeHook: s.Close, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is running the cleanup in parallel while taking the memory alright?
| cleanup.AddNoContext(egCtx, rootfsDiff.Close) | ||
|
|
||
| return nil | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition: sandbox cleanup while memory export runs
The parallelization introduces a race condition. The rootfs processing goroutine passes s.Close as the closeHook to ExportDiff, which spawns a goroutine calling s.Close. This runs s.cleanup.Run() including sbx.Stop, which kills the firecracker process via s.process.Stop(). Meanwhile, the memory processing goroutine is concurrently calling s.process.ExportMemory() to read memory from the same process. Previously these ran sequentially, so memory export completed before s.Close was triggered. This is exactly the concern raised in the PR discussion by @dobrac.
Note
Optimizes snapshot post-processing by executing memory and rootfs diff generation in parallel, reducing pause time.
errgroup.WithContextto runpauseProcessMemoryandpauseProcessRootfsconcurrently and wait viaeg.Wait()buildID; useegCtxfor cancellationgolang.org/x/sync/errgroupand minor refactor around build ID parsing and resource handlingWritten by Cursor Bugbot for commit b03a332. This will update automatically on new commits. Configure here.