
Conversation

@RomanPszonka
Contributor

PR Summary: Scenario Lifecycle & UI Enhancements

This PR implements a robust "Stop Scenario" feature, introduces a "Waiting" state for improved execution feedback, and strengthens the underlying async task management.

Key Features

  • Stop Scenario Functionality: Added a "Stop" button in the Web UI that gracefully cancels both main and background execution tasks on the backend (a minimal endpoint sketch follows this list).
  • "Waiting" UI State: Introduced a new WAITING status (amber hourglass icon) for queued steps, providing better visual clarity during sequential execution.
  • Python Scenario Discovery: Implemented automatic discovery of Python-based scenarios via pkgutil, allowing them to be listed and run seamlessly alongside YAML definitions.
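
For orientation, a minimal sketch of how such a stop endpoint could be wired up, assuming the FastAPI server implied by the SSE streaming code in main.py; the module-level runner import and its is_running helper are hypothetical, while the runner's stop_scenario method is listed in the changed files:

from fastapi import FastAPI, HTTPException

from openutm_verification.server.runner import runner  # hypothetical module-level instance

app = FastAPI()

@app.post("/stop-scenario")
async def stop_scenario_endpoint():
    """Cancel the running scenario and any background tasks it spawned."""
    if not runner.is_running():  # hypothetical helper
        raise HTTPException(status_code=409, detail="No scenario is currently running")
    await runner.stop_scenario()  # cancels main and background asyncio tasks
    return {"status": "stopping"}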

Improvements & Bug Fixes

  • Async Task Management: Refactored SessionManager to properly handle asyncio.CancelledError, ensuring clean exits without misreporting "Unexpected Errors" when a user stops a run (a sketch of the pattern follows this list).
  • SSE Connection Management: Fixed a client-side resource leak by explicitly closing EventSource connections in the useScenarioRunner hook.
  • Resource Cleanup: Added a cleanup_flight_declarations step to scenarios to remove "zombie" declarations from previous failed runs.
  • Temporal Synchronization: Synchronized flight declaration start times with telemetry generation using a shared reference_time.
  • Dev-Server Stability: Updated Vite proxy settings to include all async and event-based API endpoints.
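
In outline, the cancellation pattern looks like this; a minimal sketch, not the PR's actual SessionManager code (run_scenario and stop are illustrative names):

import asyncio
import logging

logger = logging.getLogger(__name__)

async def run_scenario(scenario) -> None:
    """Execute a scenario, treating cancellation as a clean stop, not an error."""
    try:
        await scenario.execute()
    except asyncio.CancelledError:
        # User-initiated stop: report it as such, then re-raise so the
        # asyncio machinery still sees the task as cancelled.
        logger.info("Scenario run cancelled by user.")
        raise
    except Exception:
        # Only genuine failures are reported as unexpected errors.
        logger.exception("Unexpected error during scenario run.")
        raise

async def stop(task: asyncio.Task) -> None:
    """Request cancellation and wait for the task to acknowledge it."""
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected outcome of a cancelled task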


Copilot AI left a comment


Pull request overview

This PR implements a comprehensive "Stop Scenario" feature, introduces a "Waiting" state for improved execution feedback, and adds Python-based scenario discovery. The changes span both frontend and backend with significant enhancements to async task management and lifecycle control.

Changes:

  • Added Stop Scenario functionality with backend endpoint and frontend UI button that cancels running scenarios and background tasks
  • Introduced WAITING status to provide better visual feedback for queued steps in the UI
  • Implemented automatic discovery of Python-based scenarios using pkgutil, allowing them to run alongside YAML definitions
  • Enhanced async task cancellation handling with proper CancelledError propagation throughout the stack
  • Synchronized flight declaration and telemetry start times using a shared reference_time parameter (a sketch follows this list)
  • Added cleanup_flight_declarations method to remove zombie declarations from previous failed runs
  • Fixed EventSource resource leaks by explicitly closing connections in React hooks
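
In outline, the synchronization captures one timestamp and feeds it to both generation paths. A minimal, self-contained sketch with stand-in generators (only generate_telemetry's reference_time parameter is confirmed by the diff; the declaration-side signature is an assumption):

from datetime import datetime, timezone

def generate_flight_declaration(fd_path: str, start_time: datetime) -> dict:
    """Stand-in for the real generator: stamps the declaration with start_time."""
    return {"path": fd_path, "start_time": start_time.isoformat()}

def generate_telemetry(traj_path: str, reference_time: str) -> dict:
    """Stand-in mirroring common.py's generate_telemetry(reference_time=...) shape."""
    return {"path": traj_path, "reference_time": reference_time}

# Capture a single timestamp and feed it to both generators so the declared
# flight window and the generated telemetry line up.
reference_time = datetime.now(timezone.utc)
declaration = generate_flight_declaration("fd_path", start_time=reference_time)
telemetry = generate_telemetry("traj_path", reference_time=reference_time.isoformat())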

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 27 comments.

Summary per file:
web-editor/vite.config.ts: Added proxy routes for new async endpoints
web-editor/src/utils/layoutConfig.ts: Reduced node spacing for more compact layouts
web-editor/src/types/scenario.ts: Added 'waiting' status type
web-editor/src/styles/Node.module.css: Added styling for waiting status nodes
web-editor/src/hooks/useScenarioRunner.ts: Implemented stopScenario function and EventSource cleanup
web-editor/src/components/ScenarioEditor/__tests__/Header.test.tsx: Updated test props to include onStop
web-editor/src/components/ScenarioEditor/Header.tsx: Added Stop button UI component
web-editor/src/components/ScenarioEditor/CustomNode.tsx: Added Hourglass icon for waiting status
web-editor/src/components/ScenarioEditor.tsx: Integrated stop scenario functionality
tests/test_client_steps.py: Updated test to verify the reference_time parameter
src/openutm_verification/server/runner.py: Implemented stop_scenario method with task cancellation and CancelledError handling
src/openutm_verification/server/main.py: Added /stop-scenario endpoint and improved SSE stream completion logic
src/openutm_verification/scenarios/test_traffic_and_telemetry.py: New Python scenario demonstrating concurrent traffic and telemetry submission
src/openutm_verification/scenarios/common.py: Added reference_time parameter to generate_telemetry
src/openutm_verification/core/reporting/reporting_models.py: Added WAITING status to enum
src/openutm_verification/core/execution/scenario_runner.py: Added update_result method and CancelledError handling
src/openutm_verification/core/execution/execution.py: Implemented Python scenario discovery using pkgutil
src/openutm_verification/core/execution/conditions.py: Modified success() to include RUNNING status
src/openutm_verification/core/clients/flight_blender/flight_blender_client.py: Multiple enhancements: cleanup method, cancellation handling, time synchronization, context manager improvements
scenarios/F1_happy_path.yaml: Added cleanup step at scenario start
pytest.ini: Added testpaths configuration



except Exception as e:
    logger.error(f"Error during flight declaration cleanup: {e}")
    return {"cleaned": False, "error": str(e)}

Copilot AI Jan 22, 2026


The cleanup method returns a dict indicating success/failure. However, as a @scenario_step decorated method, it will be wrapped to return a StepResult. The return dict becomes the result field of the StepResult. The status will be PASS unless an exception is raised, which means even if cleaned: False is returned, the step will show as PASS. Consider raising an exception if cleanup fails critically, or document that non-fatal cleanup failures are acceptable and won't fail the step.

Suggested change

- return {"cleaned": False, "error": str(e)}
+ raise FlightBlenderError(f"Error during flight declaration cleanup: {e}") from e

  mock_gen_fd.assert_called_with("fd_path")
- mock_gen_tel.assert_called_with("traj_path")
  mock_context.set_flight_declaration_data.assert_called_with({"fd": "data"})
+ mock_gen_tel.assert_called_with("traj_path", reference_time=ANY)

Copilot AI Jan 22, 2026


Using ANY for the reference_time parameter is appropriate here since the exact value depends on when the test runs. However, consider adding an assertion to verify that the reference_time argument is actually passed and is a valid ISO format datetime string, rather than just checking that something was passed. This would catch potential issues with the time synchronization logic.
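
A sketch of such an assertion using unittest.mock's call_args; it assumes reference_time is passed as an ISO-8601 string keyword argument, which the diff does not confirm:

from datetime import datetime

def assert_reference_time_is_iso(mock_gen_tel) -> None:
    """Verify generate_telemetry received a parseable ISO-8601 reference_time."""
    _, kwargs = mock_gen_tel.call_args
    assert "reference_time" in kwargs, "reference_time was not passed"
    # fromisoformat raises ValueError for anything that is not valid ISO format.
    datetime.fromisoformat(kwargs["reference_time"])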

Comment on lines +35 to +44
def _import_python_scenarios():
    """Import all python scenarios to populate the registry."""
    path = list(openutm_verification.scenarios.__path__)
    prefix = openutm_verification.scenarios.__name__ + "."

    for _, name, _ in pkgutil.iter_modules(path, prefix):
        try:
            importlib.import_module(name)
        except Exception as e:
            logger.warning(f"Failed to import scenario module {name}: {e}")

Copilot AI Jan 22, 2026


The _import_python_scenarios() function implements automatic discovery of Python scenario modules, which is a significant new feature. However, there's no test coverage for this functionality. Consider adding tests that verify: 1) Python scenarios are discovered and registered correctly, 2) Import errors are handled gracefully, 3) The registry is populated with the correct scenario metadata.
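
A possible starting point for the error-handling case, assuming _import_python_scenarios is importable from the execution module and logs through the standard logging module:

from unittest.mock import patch

from openutm_verification.core.execution.execution import _import_python_scenarios

def test_import_errors_are_tolerated(caplog):
    """A module that fails to import should be logged and skipped, not fatal."""
    with patch("importlib.import_module", side_effect=ImportError("boom")):
        _import_python_scenarios()  # must not raise despite the failing import
    assert "Failed to import scenario module" in caplog.text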

Comment on lines +993 to +995
result = await self.setup_flight_declaration(self.flight_declaration_path, self.trajectory_path)
if result.status == Status.FAIL:
    raise FlightBlenderError(f"Setup Flight Declaration failed: {result.error_message}")

Copilot AI Jan 22, 2026


The usage on line 993 expects setup_flight_declaration to return a result object with a status attribute, but the method signature declares a return type of None (line 946, outside the diff). This works at runtime because the @scenario_step decorator wraps the method to return a StepResult, but the type annotations are inconsistent. Consider updating the return type annotations of methods decorated with @scenario_step to reflect that they return StepResult[Any], or update the decorator to preserve proper typing.
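
One way to preserve typing, sketched with stand-in StepResult and Status definitions (the project's real ones live in reporting_models.py); the decorator body here only illustrates the annotation pattern:

from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from enum import Enum
from functools import wraps
from typing import Generic, ParamSpec, TypeVar

T = TypeVar("T")
P = ParamSpec("P")

class Status(Enum):  # stand-in for the project's Status enum
    PASS = "pass"
    FAIL = "fail"

@dataclass
class StepResult(Generic[T]):  # stand-in for the project's StepResult
    status: Status
    result: T | None = None
    error_message: str | None = None

def scenario_step(func: Callable[P, Awaitable[T]]) -> Callable[P, Awaitable[StepResult[T]]]:
    """The annotations tell type checkers that decorated methods return StepResult."""
    @wraps(func)
    async def wrapper(*args: P.args, **kwargs: P.kwargs) -> StepResult[T]:
        try:
            return StepResult(status=Status.PASS, result=await func(*args, **kwargs))
        except Exception as e:
            return StepResult(status=Status.FAIL, error_message=str(e))
    return wrapper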

Comment on lines +525 to +529
# If dependency already completed and recorded, continue ONLY if not RUNNING
if self.session_context and self.session_context.state and dep_id in self.session_context.state.step_results:
    # Check status
    if self.session_context.state.step_results[dep_id].status != Status.RUNNING:
        continue

Copilot AI Jan 22, 2026


The logic here seems inverted or incomplete. The code checks if a dependency is in step_results and if its status is NOT RUNNING, it continues (skips waiting). However, this means if a dependency exists with status RUNNING, it falls through to check if the task exists in session_tasks. But if a result already exists with status RUNNING, shouldn't we still wait for the task to complete? The logic should probably be: "If dependency is in step_results AND its status is a terminal state (PASS/FAIL/SKIP), continue. Otherwise, wait for the task." The current logic might cause race conditions where steps proceed while their dependencies are still running.
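
A sketch of the terminal-state check described above, assuming the Status enum in reporting_models.py exposes PASS, FAIL, and SKIP members and that step_results maps dep_id to objects with a status attribute:

from openutm_verification.core.reporting.reporting_models import Status

TERMINAL_STATES = {Status.PASS, Status.FAIL, Status.SKIP}

def dependency_is_done(step_results: dict, dep_id: str) -> bool:
    """True only when the dependency has a recorded result in a terminal state."""
    result = step_results.get(dep_id)
    return result is not None and result.status in TERMINAL_STATES

# In the wait loop: proceed past dependencies for which dependency_is_done()
# returns True; otherwise await the dependency's task from session_tasks.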

Comment on lines +43 to +44
except Exception as e:
    logger.warning(f"Failed to import scenario module {name}: {e}")

Copilot AI Jan 22, 2026


The error handling here logs a warning but continues importing other modules. While this is generally good for resilience, consider whether there are cases where a failed import should halt execution (e.g., if a critical scenario module has syntax errors). It would also be helpful to distinguish expected import issues (e.g., missing optional dependencies) from unexpected ones (e.g., syntax errors in scenario code).

Suggested change

- except Exception as e:
-     logger.warning(f"Failed to import scenario module {name}: {e}")
+ except ModuleNotFoundError as e:
+     # Likely an optional dependency or scenario package that is not installed.
+     logger.warning(f"Optional dependency missing for scenario module {name}: {e}")
+ except Exception as e:
+     # Unexpected error (e.g., syntax error or runtime error in scenario code) - fail fast.
+     logger.error(f"Unexpected error while importing scenario module {name}; aborting verification run: {e}")
+     raise

yield f"data: {result.model_dump_json()}\n\n"

if status_payload.get("status") != "running":
if status_payload.get("status") != "running" and not runner.has_pending_tasks():

Copilot AI Jan 22, 2026


The condition now checks both that the status is not "running" AND that there are no pending tasks before sending the completion event. This is a good improvement for handling background tasks. However, consider whether there is a race condition: what if has_pending_tasks() becomes false between the check and the event being sent, but a new background task is immediately started? Also, document the expected behavior: should the stream remain open if there are pending tasks even after the main scenario completes?

Suggested change

- if status_payload.get("status") != "running" and not runner.has_pending_tasks():
+ # Only send the final "done" event once the scenario is no longer running
+ # AND there are no pending background tasks. The stream is expected to
+ # remain open while background work is still in progress, even after
+ # the main scenario has completed. We re-check for pending tasks
+ # immediately before sending "done" to reduce race conditions.
+ if status_payload.get("status") != "running":
+     if runner.has_pending_tasks():
+         await asyncio.sleep(0.3)
+         continue

RomanPszonka marked this pull request as ready for review on January 27, 2026 09:18
RomanPszonka merged commit 37bffae into main on Jan 27, 2026
2 checks passed