
Conversation

@mawad-amd (Member)

No description provided.

@mawad-amd mawad-amd requested a review from Copilot November 3, 2025 06:49

Copilot AI left a comment


Pull Request Overview

This PR refactors the optimization tracking system and improves error handling in the IntelliPerf GPU kernel optimization tool. The changes centralize best-iteration tracking logic into the OptimizationTracker class and enhance process monitoring/timeout handling.

  • Extracted duplicate kernel file finding logic into a shared find_kernel_file() method
  • Refactored OptimizationTracker to automatically calculate improvement metrics and determine success
  • Improved timeout handling with dynamic timeouts based on baseline performance and better error messages for crashes vs timeouts

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `formula_base.py` | Added `find_kernel_file()` helper; enhanced `OptimizationTracker` with auto-calculation of improvement metrics; added compiler error filtering; improved Accordo timeout/crash detection |
| `memory_access.py` | Removed duplicate tracking code; delegated to centralized `OptimizationTracker` methods; fixed metric names (latency → lat); updated failure metrics to use baseline values |
| `bank_conflict.py` | Aligned with the refactored tracker; changed `maximize=False` for minimize-type metrics; removed manual best-tracking code |
| `atomic_contention.py` | Similar refactoring to `memory_access` and `bank_conflict`; added terminal access documentation to the signature |
| `swizzling.py` | Used the shared `find_kernel_file()` helper; improved error message handling for iterations |
| `communicate.py` | Enhanced IPC timeout logic with dynamic timeouts; added process crash detection; reduced debug log spam |
| `pyproject.toml` | Lowered the Python requirement from 3.10 to 3.9 |
| `.gitignore` | Added `.rocprofv3/` to ignored files |


Comment on lines +142 to +145
```python
if not self.maximize:
    improvement = before / after if after != 0 else 1.0
else:
    improvement = after / before if after != 0 else 1.0
```
Copilot AI Nov 3, 2025

The improvement calculation logic is identical for both branches. When maximize=False (minimizing the raw metric), improvement should be before / after (e.g., 10 conflicts → 5 conflicts gives 10/5 = 2x improvement); when maximize=True (maximizing the raw metric), it should be after / before (e.g., 50% → 90% coalescing gives 90/50 = 1.8x improvement). However, both branches compute the same formula, making the conditional redundant. The logic appears correct, but the code structure suggests the branches were intended to differ.

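To make the intended asymmetry concrete, here is a minimal sketch of the ratio the reviewer describes, with the two branches actually differing. The helper name `improvement_ratio` and its zero guards are assumptions for illustration, not IntelliPerf's actual code:

```python
def improvement_ratio(before: float, after: float, maximize: bool) -> float:
    """Return a ratio where > 1.0 always means the metric improved.

    Hypothetical helper: for a minimized metric (e.g. bank conflicts) the
    ratio is before/after; for a maximized metric (e.g. coalescing %) it is
    after/before. Zero values fall back to 1.0 ("no measurable change").
    """
    if before == 0 or after == 0:
        return 1.0
    return after / before if maximize else before / after

# 10 conflicts -> 5 conflicts (minimize): 2.0x improvement
print(improvement_ratio(10, 5, maximize=False))  # -> 2.0
# 50% -> 90% coalescing (maximize): 1.8x improvement
print(improvement_ratio(50, 90, maximize=True))  # -> 1.8
```

With this normalization, every consumer of the ratio can simply prefer larger values, regardless of whether the underlying metric is minimized or maximized.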
Comment on lines +205 to +206
```python
# With proper improvement calculation, we always want higher improvement values
if new_val > cur_val:
```
Copilot AI Nov 3, 2025

The comment 'we always want higher improvement values' contradicts the maximize parameter in the tracker initialization. The code now ignores the maximize flag when selecting the best step, always preferring higher values. This is inconsistent with the tracker's design where maximize=False indicates we're minimizing the raw metric (though improvement ratio should still be maximized). Consider clarifying whether maximize applies to the raw metric or the improvement ratio.

Suggested change
```diff
-# With proper improvement calculation, we always want higher improvement values
-if new_val > cur_val:
+# Select best step based on maximize flag: prefer higher values if maximizing, lower if minimizing
+if (self.maximize and new_val > cur_val) or (not self.maximize and new_val < cur_val):
```

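A sketch of best-step selection once ratios are normalized so that higher is always better: prefer the larger improvement, breaking ties on speedup (the commit messages later in this thread mention such a tie-breaker). The function name and dict keys here are hypothetical, not the tracker's actual API:

```python
def pick_best(steps):
    """Pick the step with the highest improvement ratio; break ties on speedup.

    Illustrative sketch: assumes improvement ratios are already normalized so
    that higher is always better, whether the raw metric is minimized or
    maximized.
    """
    best = None
    for step in steps:
        if best is None:
            best = step
        elif step["improvement"] > best["improvement"]:
            best = step
        elif (step["improvement"] == best["improvement"]
              and step.get("speedup", 0.0) > best.get("speedup", 0.0)):
            best = step
    return best

steps = [
    {"improvement": 1.5, "speedup": 1.1},
    {"improvement": 2.0, "speedup": 1.0},
    {"improvement": 2.0, "speedup": 1.3},  # wins the tie on speedup
]
print(pick_best(steps))  # -> {'improvement': 2.0, 'speedup': 1.3}
```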
```diff
     max_iterations=self.num_attempts,
     primary_metric="coal_improvement",
-    maximize=True,
+    maximize=True,  # True = maximize raw metric (coalescing %)
```
Copilot AI Nov 3, 2025

The inline comment states 'True = maximize raw metric (coalescing %)', but based on the add_step logic in formula_base.py, the maximize parameter doesn't actually affect the improvement calculation since both branches compute the same formula. This comment may mislead readers about the parameter's effect.

Suggested change
```diff
-maximize=True,  # True = maximize raw metric (coalescing %)
+maximize=True,  # Currently has no effect; improvement is always calculated as optimized_coal / unoptimized_coal
```

```diff
     max_iterations=self.num_attempts,
     primary_metric="conflict_improvement",
-    maximize=True,
+    maximize=False,  # False = minimize raw metric (conflicts)
```
Copilot AI Nov 3, 2025

Similar to memory_access.py, this comment suggests maximize=False has an effect on improvement calculation, but the actual code in formula_base.py line 142-145 shows both branches calculate the same formula. The comment should clarify that maximize is not currently used in improvement calculations.

Suggested change
```diff
-maximize=False,  # False = minimize raw metric (conflicts)
+maximize=False,  # Note: 'maximize' is not currently used in improvement calculations; improvement is always calculated as unoptimized_conflicts / optimized_conflicts.
```

```diff
     max_iterations=self.num_attempts,
     primary_metric="latency_improvement",
-    maximize=True,
+    maximize=False,  # False = minimize raw metric (latency)
```
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as in bank_conflict.py and memory_access.py - the maximize parameter doesn't affect the improvement calculation as both branches in formula_base.py compute identical formulas. The comment should be updated to reflect the actual behavior.

Comment on lines +56 to +62
```python
thread = threading.Thread(target=target, daemon=True)
thread.start()
thread.join(timeout=timeout_seconds)

if thread.is_alive():
    # Thread is still running - timeout occurred
    raise TimeoutError(f"Operation timed out after {timeout_seconds} seconds")
```
Copilot AI Nov 3, 2025

Using daemon threads with timeout creates a resource leak - if the timeout occurs, the daemon thread continues running in the background and cannot be stopped (the function it's executing may hold resources). Consider using a non-daemon thread with proper cleanup, or document that this is acceptable for the IPC operations being performed.

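One way to avoid the leak the reviewer describes is to run the work in a child process rather than a daemon thread, since a timed-out process can actually be killed. This is a sketch under the assumption that the operation can be expressed as a subprocess invocation; `run_with_timeout` and the crash/timeout distinction shown are illustrative, not the tool's actual API:

```python
import subprocess

def run_with_timeout(cmd, timeout_seconds):
    """Run cmd in a child process with a hard timeout.

    Unlike a daemon thread, a timed-out subprocess is killed by
    subprocess.run(), so no background work keeps holding resources.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        raise TimeoutError(f"Operation timed out after {timeout_seconds} seconds")
    if proc.returncode != 0:
        # On POSIX a negative return code means the child was killed by a
        # signal, which lets callers distinguish a crash from a timeout.
        raise RuntimeError(f"Process exited with code {proc.returncode}")
    return proc.stdout
```

When the work must stay in-process, `concurrent.futures.ThreadPoolExecutor` with `future.result(timeout=...)` centralizes the same trade-off, but the worker thread still cannot be cancelled, so documenting the leak (as the comment suggests) remains necessary.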
```python
time_after_ms = None
if metrics and "optimized_coal" in metrics:
    # Only use the measured value if it actually improved (maximize=True, so higher is better)
    if metrics["optimized_coal"] > self.baseline_coalesced_pct:
```
Copilot AI Nov 3, 2025

The condition checks if optimized_coal > baseline_coalesced_pct, but baseline_coalesced_pct is the same as unoptimized_coal (both represent the initial measurement). The comment on line 733 says 'Only use the measured value if it actually improved (maximize=True, so higher is better)', but the comparison should be against metrics['unoptimized_coal'] from the best step, not the baseline, to properly validate improvement in the best optimization attempt.

Suggested change
```diff
-if metrics["optimized_coal"] > self.baseline_coalesced_pct:
+if metrics["optimized_coal"] > metrics.get("unoptimized_coal", self.baseline_coalesced_pct):
```

Comment on lines +828 to +829
```python
if metrics["optimized_conflicts"] < self.baseline_bank_conflicts:
    metric_after = metrics["optimized_conflicts"]
```
Copilot AI Nov 3, 2025

Similar issue to memory_access.py - this compares against baseline_bank_conflicts instead of the unoptimized value from the best step. If there were multiple iterations, the best step's unoptimized_conflicts may differ from the initial baseline, making this comparison incorrect for determining whether the best step actually improved.

```python
time_after_ms = None
if metrics and "optimized_lat" in metrics:
    # Only use the measured value if it actually improved
    if metrics["optimized_lat"] < self.baseline_atomic_latency:
```
Copilot AI Nov 3, 2025

Same issue as memory_access.py and bank_conflict.py - comparing against baseline instead of the best step's unoptimized value. Should compare metrics['optimized_lat'] < metrics.get('unoptimized_lat', self.baseline_atomic_latency) to properly validate improvement in the best step.

Suggested change
```diff
-if metrics["optimized_lat"] < self.baseline_atomic_latency:
+if metrics["optimized_lat"] < metrics.get("unoptimized_lat", self.baseline_atomic_latency):
```

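The fix suggested across these three comments can be captured in one hedged sketch: validate the best step against its own "unoptimized" measurement when present, falling back to the run-level baseline only when the step did not record one. The helper name and dict keys are hypothetical, not IntelliPerf's actual code:

```python
def step_improved(metrics, after_key, before_key, baseline, minimize=True):
    """Check whether the best step improved relative to its own measurement.

    Illustrative sketch: uses the step's recorded 'unoptimized' value when
    available and falls back to the run-level baseline only when the step
    did not record one.
    """
    before = metrics.get(before_key, baseline)
    after = metrics[after_key]
    return after < before if minimize else after > before

# The best step recorded its own unoptimized latency (100), so 80 counts as improved
print(step_improved({"optimized_lat": 80, "unoptimized_lat": 100},
                    "optimized_lat", "unoptimized_lat", baseline=90))  # -> True
# No per-step value recorded: falls back to the baseline of 90, so 95 is not improved
print(step_improved({"optimized_lat": 95},
                    "optimized_lat", "unoptimized_lat", baseline=90))  # -> False
```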
@mawad-amd mawad-amd merged commit f98f6a1 into main Nov 3, 2025
7 of 8 checks passed
@mawad-amd mawad-amd deleted the muhaawad/unify-counters-report branch November 3, 2025 07:00
mawad-amd added a commit that referenced this pull request Nov 3, 2025
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
mawad-amd added a commit that referenced this pull request Nov 3, 2025
- Add OptimizationTracker helper methods: is_successful(), get_best_code(), get_best_report(), get_best_metrics()
- Add auto-determination of success based on improvement and speedup
- Add _filter_compiler_errors() static method for cleaner compiler output
- Add process crash detection in Accordo IPC with process_pid
- Add enhanced timeout error messages with baseline context
- Maintain all Accordo API improvements from feature branch
mawad-amd added a commit that referenced this pull request Nov 3, 2025
**New Accordo API (clean abstraction):**
- New Accordo module structure with public API (__init__, config, result, snapshot, exceptions, validator)
- Internal implementation (_internal/codegen, hip_interop, ipc/communication)
- Snapshot-based caching: capture reference once, compare multiple optimized versions
- AccordoValidator with proper timeout handling and error detection
- KernelArg dataclass for cleaner argument specification

**Formula changes (preserves all main improvements):**
- Updated correctness_validation_pass() to use new Accordo API with caching
- Added write_and_log_optimized_code() to centralize iteration logging
- All formulas use write_and_log_optimized_code() for immediate LLM output capture
- find_kernel_file() already in main - formulas use it correctly

**Infrastructure:**
- Logger.save_iteration_code() for centralized iteration file management
- __main__.py passes trace_path to logger
- All examples/*.hip updated with hip_try macro for error handling
- .gitignore: added .rocprofv3/

**Preserved from main ("Unify counters report" #155):**
- OptimizationTracker helper methods (is_successful, get_best_code, etc.)
- Auto-success determination logic
- Proper tie-breaker with speedup
- _filter_compiler_errors() method
- Correct failure metrics (speedup 0.0 for build, 1.0 for correctness)
- Correct performance_validation flow (add_step first, query values, build report)
mawad-amd added a commit that referenced this pull request Nov 8, 2025
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
mawad-amd added a commit that referenced this pull request Nov 8, 2025
**New Accordo API (clean abstraction):**
- New Accordo module structure with public API (__init__, config, result, snapshot, exceptions, validator)
- Internal implementation (_internal/codegen, hip_interop, ipc/communication)
- Snapshot-based caching: capture reference once, compare multiple optimized versions
- AccordoValidator with proper timeout handling and error detection
- KernelArg dataclass for cleaner argument specification

**Formula changes (preserves all main improvements):**
- Updated correctness_validation_pass() to use new Accordo API with caching
- Added write_and_log_optimized_code() to centralize iteration logging
- All formulas use write_and_log_optimized_code() for immediate LLM output capture
- find_kernel_file() already in main - formulas use it correctly

**Infrastructure:**
- Logger.save_iteration_code() for centralized iteration file management
- __main__.py passes trace_path to logger
- All examples/*.hip updated with hip_try macro for error handling
- .gitignore: added .rocprofv3/

**Preserved from main ("Unify counters report" #155):**
- OptimizationTracker helper methods (is_successful, get_best_code, etc.)
- Auto-success determination logic
- Proper tie-breaker with speedup
- _filter_compiler_errors() method
- Correct failure metrics (speedup 0.0 for build, 1.0 for correctness)
- Correct performance_validation flow (add_step first, query values, build report)
mawad-amd added a commit that referenced this pull request Nov 8, 2025
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>