@misanjumn misanjumn commented Sep 29, 2025

Fix multiprocessing compatibility for Python 3.14

Avoid passing the bound VirtTest.runTest method to multiprocessing.Process: on Python 3.14 the Process target is pickled when the child starts, and a VirtTest instance holds locks and queues, which cannot be pickled.
Use a small wrapper function that creates the VirtTest instance inside the child process. This preserves behavior and keeps everything handed to the child picklable.

Signed-off-by: Misbah Anjum N <misanjum@linux.ibm.com>

Summary by CodeRabbit

  • Bug Fixes

    • Corrected and clarified the cancellation message shown when attempting to run VT tests in parallel: “parallel run is not allowed for vt tests.”
  • Refactor

    • Introduced a lightweight wrapper to launch VT tests in a separate process, improving isolation and reliability for multiprocessing scenarios without changing commands or configuration.

coderabbitai bot commented Sep 29, 2025

📝 Walkthrough

Adds a top-level wrapper function run_vt_test(queue, runnable) to execute a VT test in a subprocess, updates VTTestRunner.run to use that wrapper as the multiprocessing.Process target, and fixes a parallel-run cancellation message string; queue/polling/process flow is unchanged.

Changes

| Cohort / File(s) | Summary |
|---|---|
| VT test runner subprocess wrapper: avocado_vt/plugins/vt_runner.py | Added run_vt_test(queue, runnable) wrapper that calls the runnable's runTest method. Modified VTTestRunner.run to start a multiprocessing.Process with (queue, self.runnable) as args and the new wrapper as target. Fixed concatenated cancellation message to "parallel run is not allowed for vt tests" and added a comment explaining the wrapper usage. |
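
The wrapper pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: FakeVirtTest, start_vt_process, and the dict-shaped runnable and message are hypothetical stand-ins, and only the run_vt_test(queue, runnable) signature is taken from the summary.

```python
import multiprocessing
import threading


class FakeVirtTest:
    """Hypothetical stand-in for VirtTest: it owns a lock, so instances cannot be pickled."""

    def __init__(self, out_queue, runnable):
        self.queue = out_queue
        self.runnable = runnable
        self._lock = threading.Lock()  # the kind of member that breaks pickling

    def runTest(self):
        # Assumed message shape, for illustration only.
        self.queue.put({"status": "finished", "result": "pass", "uri": self.runnable["uri"]})


def run_vt_test(out_queue, runnable):
    """Top-level Process target: build the test object inside the child, then run it."""
    FakeVirtTest(out_queue, runnable).runTest()


def start_vt_process(runnable, start_method="spawn"):
    """Start the wrapper in a child; only the queue and the runnable cross the process boundary."""
    ctx = multiprocessing.get_context(start_method)
    out_queue = ctx.Queue()
    process = ctx.Process(target=run_vt_test, args=(out_queue, runnable))
    process.start()
    return process, out_queue
```

Because run_vt_test is a module-level function, it is pickled by name rather than by value, so the lock-holding object only ever exists in the child. When trying start_vt_process from a script with the spawn start method, call it under an `if __name__ == "__main__":` guard.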

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title “Fix multiprocessing compatibility for Python 3.14” concisely and accurately captures the primary change introduced by the pull request, which is to adjust the multiprocessing invocation so that it works on Python 3.14. It is clear, specific, and focused on the main objective without extraneous detail. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c879ec9 and 81963c3.

📒 Files selected for processing (1)
  • avocado_vt/plugins/vt_runner.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks

@misanjumn (Author)

Reference: https://docs.python.org/3.14/whatsnew/3.14.html

On Python 3.14 the Process object, including its target, is pickled when the child starts (the default start method on Linux is no longer fork), so a bound method whose self holds a Queue or Lock fails to pickle.
The wrapper function (run_vt_test) receives only picklable arguments (queue, runnable).
The VirtTest instance is now created inside the child process, so Python never needs to pickle it.
This approach works on both Python 3.13 and 3.14 (see the runs below) and does not depend on the start method.
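
A small sketch of why the bound method fails while plain arguments do not, using pickle directly instead of a real child process. Holder, top_level_target, and can_pickle are hypothetical names made up for this illustration.

```python
import pickle
import queue


class Holder:
    """Hypothetical stand-in for an object that owns a queue.Queue."""

    def __init__(self):
        # queue.Queue keeps a thread lock internally, which pickle rejects.
        self.events = queue.Queue()

    def run(self):
        return "ran"


def top_level_target(params):
    """Module-level function: pickle stores only its module and name."""
    return Holder().run()


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


# Pickling a bound method also pickles the instance it is bound to,
# so the lock inside the queue makes the whole thing fail.
bound_ok = can_pickle(Holder().run)

# A plain dict of parameters carries no locks and pickles fine.
params_ok = can_pickle({"vt_type": "qemu"})
```

Here bound_ok ends up False and params_ok ends up True, which mirrors the difference between passing VirtTest.runTest and passing a top-level function with plain arguments.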

Before Patch

JOB ID     : 3e3abf0d9d14a431dd6a54b41b98258c44f101ad
JOB LOG    : /home/misanjumn/results/job-2025-09-29T07.42-3e3abf0/job.log
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native: STARTED
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native:  ERROR (0.00 s)
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk: STARTED
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk:  ERROR (0.00 s)
RESULTS    : PASS 0 | ERROR 2 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /home/misanjumn/results/job-2025-09-29T07.42-3e3abf0/results.html
JOB TIME   : 5.50 s

[stderr] Traceback (most recent call last):
[stderr]   File "/usr/local/lib/python3.14/site-packages/avocado_vt/plugins/vt_runner.py", line 163, in run
[stderr]     process.start()
[stderr]     ~~~~~~~~~~~~~^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/process.py", line 121, in start
[stderr]     self._popen = self._Popen(self)
[stderr]                   ~~~~~~~~~~~^^^^^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/context.py", line 224, in _Popen
[stderr]     return _default_context.get_context().Process._Popen(process_obj)
[stderr]            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/context.py", line 288, in _Popen
[stderr]     return Popen(process_obj)
[stderr]   File "/usr/lib64/python3.14/multiprocessing/popen_spawn_posix.py", line 32, in __init__
[stderr]     super().__init__(process_obj)
[stderr]     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/popen_fork.py", line 20, in __init__
[stderr]     self._launch(process_obj)
[stderr]     ~~~~~~~~~~~~^^^^^^^^^^^^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/popen_spawn_posix.py", line 47, in _launch
[stderr]     reduction.dump(process_obj, fp)
[stderr]     ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
[stderr]   File "/usr/lib64/python3.14/multiprocessing/reduction.py", line 60, in dump
[stderr]     ForkingPickler(file, protocol).dump(obj)
[stderr]     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
[stderr] TypeError: cannot pickle '_thread.lock' object
[stderr] when serializing dict item 'mutex'
[stderr] when serializing queue.Queue state
[stderr] when serializing queue.Queue object
[stderr] when serializing dict item 'error_events'
[stderr] when serializing virttest.error_event.EventBus state
[stderr] when serializing virttest.error_event.EventBus object
[stderr] when serializing dict item 'background_errors'
[stderr] when serializing avocado_vt.plugins.vt_runner.VirtTest state
[stderr] when serializing avocado_vt.plugins.vt_runner.VirtTest object
[stderr] when serializing tuple item 0
[stderr] when serializing method reconstructor arguments
[stderr] when serializing method object
[stderr] when serializing dict item '_target'
[stderr] when serializing multiprocessing.context.Process state
[stderr] when serializing multiprocessing.context.Process object
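
The "when serializing ..." notes in the traceback above trace the path from the Process object down to the lock. On interpreters that do not print such notes, a similar path can be recovered with a small helper. The sketch below is an assumption-laden illustration: find_unpicklable, StandInEventBus, and StandInVirtTest are hypothetical, and only the attribute names background_errors and error_events are borrowed from the traceback.

```python
import pickle
import queue


def find_unpicklable(obj, path="obj"):
    """Return the attribute path of the first nested member that cannot be pickled, or None."""
    try:
        pickle.dumps(obj)
        return None  # the whole object pickles, nothing to report
    except Exception:  # deliberately broad: any failure counts as "cannot pickle"
        pass
    state = getattr(obj, "__dict__", None)
    if isinstance(state, dict):
        # Descend into attributes to find which member is the actual culprit.
        for name, value in state.items():
            found = find_unpicklable(value, f"{path}.{name}")
            if found is not None:
                return found
    # No attribute explains the failure, so this object itself is the culprit.
    return path


class StandInEventBus:
    """Hypothetical stand-in for an event bus that stores errors in a queue."""

    def __init__(self):
        self.error_events = queue.Queue()


class StandInVirtTest:
    """Hypothetical stand-in for a test object that owns an event bus."""

    def __init__(self):
        self.params = {"vt_type": "qemu"}
        self.background_errors = StandInEventBus()


culprit = find_unpicklable(StandInVirtTest())
```

With these stand-ins, culprit names the queue's internal mutex reached through background_errors and error_events, the same chain the traceback reports.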

After patch - Python 3.14

JOB ID     : 1eac3ae921ae6f91400ec39d1988a068f8bedb93
JOB LOG    : /home/misanjumn/results/job-2025-09-29T08.10-1eac3ae/job.log
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native: STARTED
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native:  PASS (40.64 s)
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk: STARTED
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk:  PASS (4.03 s)
RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /home/misanjumn/results/job-2025-09-29T08.10-1eac3ae/results.html
JOB TIME   : 53.83 s

After patch - Python 3.13

JOB ID     : 7fa3ea049a6f3f4cc975fe9a4e79713e1a28e9b3
JOB LOG    : /home/kvmci/results/job-2025-09-29T08.23-7fa3ea0/job.log
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native: STARTED
 (1/2) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native:  PASS (66.00 s)
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk: STARTED
 (2/2) io-github-autotest-libvirt.remove_guest.without_disk:  PASS (3.85 s)
RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /home/kvmci/results/job-2025-09-29T08.23-7fa3ea0/results.html
JOB TIME   : 78.65 s

Avoid passing the bound VirtTest.runTest method to
multiprocessing.Process: on Python 3.14 the Process target is pickled
when the child starts, and a VirtTest instance holds locks and queues,
which cannot be pickled. Use a small wrapper function that creates the
VirtTest instance inside the child process. This preserves behavior and
keeps everything handed to the child picklable.

Signed-off-by: Misbah Anjum N <misanjumn@ibm.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
avocado_vt/plugins/vt_runner.py (2)

117-123: Improve failure signaling and satisfy Ruff BLE001.

Catching a broad Exception here is pragmatic; add an inline Ruff suppression and enrich the FinishedMessage for easier triage.

Apply:

-    except Exception:  # pylint: disable=broad-exception-caught
-        queue.put(messages.StderrMessage.get(traceback.format_exc()))
-        queue.put(messages.FinishedMessage.get("error"))
+    except Exception:  # pylint: disable=broad-exception-caught  # noqa: BLE001
+        tb = traceback.format_exc()
+        queue.put(messages.StderrMessage.get(tb))
+        queue.put(
+            messages.FinishedMessage.get(
+                "error",
+                fail_reason="VirtTest initialization failed",
+                class_name="VirtTest",
+                fail_class="Exception",
+                traceback=tb,
+            )
+        )
         return

This keeps behavior while improving diagnostics and clearing the static‑analysis warning. [Based on static analysis hints]
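
The idea behind this suggestion, reduced to a self-contained sketch: catch the failure, forward the traceback, then send a terminal message that carries the reason. The dict-shaped messages and the names report_failure and guarded_init are assumptions for illustration; they are not the avocado messages API used in the diff above.

```python
import queue
import traceback


def report_failure(out_queue, fail_reason):
    """Call from inside an `except` block: send the traceback, then a terminal message."""
    tb = traceback.format_exc()
    out_queue.put({"status": "running", "type": "stderr", "log": tb})
    out_queue.put(
        {
            "status": "finished",
            "result": "error",
            "fail_reason": fail_reason,
            "traceback": tb,
        }
    )


def guarded_init(out_queue, factory):
    """Build an object via factory(); on any failure, report it and return None."""
    try:
        return factory()
    except Exception:  # noqa: BLE001 - broad on purpose, the child must always report
        report_failure(out_queue, "initialization failed")
        return None
```

Sending the traceback both as a log message and inside the terminal message is a design choice for easier triage: a consumer that only looks at the final message still sees why the run ended.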


174-183: Avoid hangs/zombies: check child liveness and join after finish.

If the child exits unexpectedly before emitting a FinishedMessage, the loop can spin forever; also, the process is never joined. Add a dead‑child check and always reap the process after breaking.

-                while True:
+                while True:
                     time.sleep(RUNNER_RUN_CHECK_INTERVAL)
-                    if queue.empty():
-                        yield messages.RunningMessage.get()
+                    if queue.empty():
+                        if not process.is_alive():
+                            yield messages.StderrMessage.get(
+                                "avocado-vt: child process exited without sending a FinishedMessage"
+                            )
+                            yield messages.FinishedMessage.get("error")
+                            break
+                        yield messages.RunningMessage.get()
                     else:
                         message = queue.get()
                         yield message
                         if message.get("status") == "finished":
                             break
+                # Reap the child process to avoid zombies
+                process.join(timeout=5)
+                if process.is_alive():
+                    process.terminate()
+                    process.join(timeout=1)

This keeps the polling model but prevents indefinite waits and cleans up the worker.

Run a quick kill test (send SIGKILL to the child) to confirm the runner now exits with an error instead of hanging.
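
A standalone sketch of the same polling idea, written against any object that offers is_alive(), join(), and terminate() (such as multiprocessing.Process) and any queue whose get(timeout=...) raises queue.Empty. The function name poll_child and the dict-shaped messages are assumptions, not the runner's actual code.

```python
import queue


def poll_child(out_queue, process, interval=0.01):
    """Yield messages until 'finished'; synthesize an error if the child dies silently."""
    try:
        while True:
            try:
                message = out_queue.get(timeout=interval)
            except queue.Empty:
                if process.is_alive():
                    yield {"status": "running"}
                    continue
                # The child is gone: give a late message one last chance to arrive.
                try:
                    message = out_queue.get(timeout=interval)
                except queue.Empty:
                    yield {
                        "status": "finished",
                        "result": "error",
                        "fail_reason": "child exited without a finished message",
                    }
                    return
            yield message
            if message.get("status") == "finished":
                return
    finally:
        # Reap the child so it does not linger as a zombie.
        process.join(timeout=5)
        if process.is_alive():
            process.terminate()
            process.join(timeout=1)
```

The second get() after the liveness check covers the case where the child puts its final message and exits between the first get() and the is_alive() call.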

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 81963c3 and 36487f6.

📒 Files selected for processing (1)
  • avocado_vt/plugins/vt_runner.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.13.1)
avocado_vt/plugins/vt_runner.py

119-119: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: fedora_40 AVOCADO_SRC:avocado-framework<104.0 SETUP:-m pip install . VT_TYPE:qemu
  • GitHub Check: Static checks
🔇 Additional comments (2)
avocado_vt/plugins/vt_runner.py (2)

164-165: Cancel message fix looks good.

String now reads correctly and matches expected nrunner semantics.


168-173: Correct: use a top‑level wrapper as Process target (Py 3.14 compatible).

Switching to run_vt_test avoids pickling a bound method whose instance holds non‑picklable members. This addresses the failure seen on Python 3.14.

Please confirm the two previously failing tests still pass under Python 3.13 and 3.14 after this change.

Contributor

@harihare harihare left a comment

LGTM
