-
Notifications
You must be signed in to change notification settings - Fork 176
fix(libsinsp): Force CLONE_FILES for threads with missing flag from driver #2715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: fremmi The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2715 +/- ##
=======================================
Coverage 76.90% 76.90%
=======================================
Files 296 296
Lines 30875 30879 +4
Branches 4693 4694 +1
=======================================
+ Hits 23745 23749 +4
Misses 7130 7130
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…river Add definitive thread detection in parse_clone_exit_child() using the Linux kernel's absolute thread definition: pid != tid. When a thread is detected but missing the CLONE_FILES flag, force the flag to ensure proper FD table sharing between threads. Signed-off-by: Francesco Emmi <francesco.as@gmail.com>
73ad4f0 to
dcdb0f3
Compare
|
Hi @fremmi, is this a fix to some issue or why is this change actually needed? |
|
Yes, @terror96 this is a fix for an issue I encountered with file descriptors. The problem happens when PPM_CL_CLONE_THREAD is not set on thread creation events arriving from the driver. When that flag is missing, PPM_CL_CLONE_FILES doesn't get set either at libs/userspace/libsinsp/parsers.cpp Line 1181 in a8fb3d2
This causes threads to use the wrong FD table at libs/userspace/libsinsp/threadinfo.h Line 443 in a8fb3d2
The fix uses pid != tid to detect threads in userspace (which is always correct in Linux) and forces PPM_CL_CLONE_FILES regardless of what the driver provided. This ensures threads share the correct FD table. I don't exclude there may be other consequences I'm missing. I'm also investigating why events sometimes arrive from the driver without the CLONE_THREAD flag in the first place. |
Co-authored-by: Tero Kauppinen <tero.kauppinen@est.tech> Signed-off-by: Francesco Emmi <francesco.as@gmail.com>
Great, thanks for the clarification. Could this also be related to the problem that was posted on Slack regarding a flood of: I don't know this part of the code, but I'll re-trigger the CI tests. |
I think there are good chances the two things could be related |
|
ping @irozzo-1A you were at least reacting on the message in Slack. |
|
Converting temporarily into draft. I'd like to check if there are better solutions addressed in the kernel space |
|
/milestone TBD |
|
@irozzo-1A: The provided milestone is not valid for this repository. Milestones in this repository: [ Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@fremmi If I recall correctly from our offline discussion, the problem here is that the flags we get here from the clone exit child are not reliable. At the moment, we are not checking if I agree it would be better to address the issue on the kernel side, even if it represents more work. We could either:
AFAIU, the tp_btf/sched_process_fork tracepoint was introduced to overcome limitations with Feel free to add color if I missed some important detail. |
What type of PR is this?
/kind bug
Any specific area of the project related to this PR?
/area libsinsp
Does this PR require a change in the driver versions?
What this PR does / why we need it:
Add definitive thread detection in parse_clone_exit_child() using the Linux kernel's absolute thread definition: pid != tid. When a thread is detected but missing the CLONE_FILES flag, force the flag to ensure proper FD table sharing between threads.
Which issue(s) this PR fixes:
Fixes #2761
Special notes for your reviewer:
Does this PR introduce a user-facing change?: