Unify counters report #155
Pull Request Overview
This PR refactors the optimization tracking system and improves error handling in the IntelliPerf GPU kernel optimization tool. The changes centralize best-iteration tracking logic into the OptimizationTracker class and enhance process monitoring/timeout handling.
- Extracted duplicate kernel file finding logic into a shared find_kernel_file() method
- Refactored OptimizationTracker to automatically calculate improvement metrics and determine success
- Improved timeout handling with dynamic timeouts based on baseline performance and better error messages for crashes vs timeouts
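For orientation, here is a minimal sketch of the centralized pattern described above. The constructor keywords (`max_iterations`, `primary_metric`, `maximize`) and the helper names appear in the diff and commit messages below, but the class body is an illustrative reconstruction, not the actual formula_base.py code.

```python
class OptimizationTracker:
    """Illustrative sketch of centralized best-iteration tracking."""

    def __init__(self, max_iterations, primary_metric, maximize):
        self.max_iterations = max_iterations
        self.primary_metric = primary_metric
        self.maximize = maximize  # direction of the *raw* metric
        self.best = None          # (improvement, metrics) of the best step so far

    def add_step(self, before, after, metrics):
        # Improvement is a ratio that exceeds 1.0 whenever the step helped,
        # regardless of whether the raw metric should grow or shrink.
        if self.maximize:
            improvement = after / before if before != 0 else 1.0
        else:
            improvement = before / after if after != 0 else 1.0
        if self.best is None or improvement > self.best[0]:
            self.best = (improvement, metrics)
        return improvement

    def is_successful(self):
        return self.best is not None and self.best[0] > 1.0

    def get_best_metrics(self):
        return self.best[1] if self.best else None
```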
Reviewed Changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 9 comments.
Summary per file:
| File | Description |
|---|---|
| formula_base.py | Added find_kernel_file() helper, enhanced OptimizationTracker with auto-calculation of improvement metrics, added compiler error filtering, improved Accordo timeout/crash detection |
| memory_access.py | Removed duplicate tracking code, delegated to centralized OptimizationTracker methods, fixed metric names (latency→lat), updated failure metrics to use baseline values |
| bank_conflict.py | Aligned with refactored tracker, changed maximize=False for minimize-type metrics, removed manual best-tracking code |
| atomic_contention.py | Similar refactoring as memory_access and bank_conflict, added terminal access documentation to signature |
| swizzling.py | Used shared find_kernel_file() helper, improved error message handling for iterations |
| communicate.py | Enhanced IPC timeout logic with dynamic timeouts (see the sketch after this table), added process crash detection, reduced debug log spam |
| pyproject.toml | Lowered Python requirement from 3.10 to 3.9 |
| .gitignore | Added .rocprofv3/ to ignored files |
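The dynamic-timeout behavior described for communicate.py can be sketched as follows. The function name and scaling constants are assumptions for illustration; the PR only describes the behavior, not the exact formula.

```python
def dynamic_timeout(baseline_ms, factor=10.0, floor_s=30.0, ceiling_s=600.0):
    """Scale the IPC timeout with baseline kernel runtime (illustrative sketch).

    A kernel with a longer baseline run naturally needs a longer validation
    window; the floor and ceiling keep pathological baselines from producing
    useless timeouts.
    """
    return min(ceiling_s, max(floor_s, (baseline_ms / 1000.0) * factor))
```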
```python
if not self.maximize:
    improvement = before / after if after != 0 else 1.0
else:
    improvement = after / before if after != 0 else 1.0
```
**Copilot AI** · Nov 3, 2025:
The improvement calculation logic is identical for both branches. When maximize=False (minimize raw metric), improvement should be before / after (e.g., 10 conflicts → 5 conflicts = 10/5 = 2x improvement). When maximize=True (maximize raw metric), improvement should be after / before (e.g., 50% → 90% coalescing = 90/50 = 1.8x improvement). However, both branches calculate the same formula, making the conditional redundant. The intended results appear correct, but the code structure suggests the branches should differ.
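As a quick sanity check of the two ratios the comment describes, here are its own example numbers in plain Python (illustrative only):

```python
# Minimize case (bank conflicts): 10 -> 5 conflicts
before, after = 10.0, 5.0
print(before / after)  # 2.0 -> 2x improvement

# Maximize case (memory coalescing): 50% -> 90%
before, after = 50.0, 90.0
print(after / before)  # 1.8 -> 1.8x improvement
```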
```python
# With proper improvement calculation, we always want higher improvement values
if new_val > cur_val:
```
**Copilot AI** · Nov 3, 2025:
The comment 'we always want higher improvement values' contradicts the maximize parameter in the tracker initialization. The code now ignores the maximize flag when selecting the best step, always preferring higher values. This is inconsistent with the tracker's design where maximize=False indicates we're minimizing the raw metric (though improvement ratio should still be maximized). Consider clarifying whether maximize applies to the raw metric or the improvement ratio.
Suggested change:
```diff
- # With proper improvement calculation, we always want higher improvement values
- if new_val > cur_val:
+ # Select best step based on maximize flag: prefer higher values if maximizing, lower if minimizing
+ if (self.maximize and new_val > cur_val) or (not self.maximize and new_val < cur_val):
```
```diff
  max_iterations=self.num_attempts,
  primary_metric="coal_improvement",
- maximize=True,
+ maximize=True,  # True = maximize raw metric (coalescing %)
```
**Copilot AI** · Nov 3, 2025:
The inline comment states 'True = maximize raw metric (coalescing %)', but based on the add_step logic in formula_base.py, the maximize parameter doesn't actually affect the improvement calculation since both branches compute the same formula. This comment may mislead readers about the parameter's effect.
Suggested change:
```diff
- maximize=True,  # True = maximize raw metric (coalescing %)
+ maximize=True,  # Currently has no effect; improvement is always calculated as optimized_coal / unoptimized_coal
```
```diff
  max_iterations=self.num_attempts,
  primary_metric="conflict_improvement",
- maximize=True,
+ maximize=False,  # False = minimize raw metric (conflicts)
```
**Copilot AI** · Nov 3, 2025:
Similar to memory_access.py, this comment suggests maximize=False has an effect on the improvement calculation, but the code in formula_base.py (lines 142-145) shows that both branches calculate the same formula. The comment should clarify that maximize is not currently used in improvement calculations.
Suggested change:
```diff
- maximize=False,  # False = minimize raw metric (conflicts)
+ maximize=False,  # Note: 'maximize' is not currently used in improvement calculations; improvement is always calculated as unoptimized_conflicts / optimized_conflicts.
```
```diff
  max_iterations=self.num_attempts,
  primary_metric="latency_improvement",
- maximize=True,
+ maximize=False,  # False = minimize raw metric (latency)
```
**Copilot AI** · Nov 3, 2025:
Same issue as in bank_conflict.py and memory_access.py - the maximize parameter doesn't affect the improvement calculation as both branches in formula_base.py compute identical formulas. The comment should be updated to reflect the actual behavior.
```python
thread = threading.Thread(target=target, daemon=True)
thread.start()
thread.join(timeout=timeout_seconds)

if thread.is_alive():
    # Thread is still running - timeout occurred
    raise TimeoutError(f"Operation timed out after {timeout_seconds} seconds")
```
**Copilot AI** · Nov 3, 2025:
Using daemon threads with timeout creates a resource leak - if the timeout occurs, the daemon thread continues running in the background and cannot be stopped (the function it's executing may hold resources). Consider using a non-daemon thread with proper cleanup, or document that this is acceptable for the IPC operations being performed.
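One hedged alternative is to move the blocking call into a child process, which, unlike a daemon thread, can actually be terminated on timeout. This is a sketch only: `multiprocessing` requires the target to be picklable, and none of the names below come from the IntelliPerf codebase.

```python
import multiprocessing

def run_with_timeout(target, timeout_seconds, *args):
    """Run `target` in a child process; terminate it on timeout.

    A child process can be killed, so a timed-out operation does not keep
    holding resources the way a lingering daemon thread would. Any state
    the target mutates lives in the child, not the parent.
    """
    proc = multiprocessing.Process(target=target, args=args)
    proc.start()
    proc.join(timeout=timeout_seconds)
    if proc.is_alive():
        proc.terminate()  # reclaim the stuck worker
        proc.join()
        raise TimeoutError(f"Operation timed out after {timeout_seconds} seconds")
    if proc.exitcode != 0:
        # A non-zero exit here usually means the worker crashed rather than timed out
        raise RuntimeError(f"Worker exited with code {proc.exitcode}")
```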
```python
time_after_ms = None
if metrics and "optimized_coal" in metrics:
    # Only use the measured value if it actually improved (maximize=True, so higher is better)
    if metrics["optimized_coal"] > self.baseline_coalesced_pct:
```
**Copilot AI** · Nov 3, 2025:
The condition checks if optimized_coal > baseline_coalesced_pct, but baseline_coalesced_pct is the same as unoptimized_coal (both represent the initial measurement). The comment on line 733 says 'Only use the measured value if it actually improved (maximize=True, so higher is better)', but the comparison should be against metrics['unoptimized_coal'] from the best step, not the baseline, to properly validate improvement in the best optimization attempt.
| if metrics["optimized_coal"] > self.baseline_coalesced_pct: | |
| if metrics["optimized_coal"] > metrics.get("unoptimized_coal", self.baseline_coalesced_pct): |
| if metrics["optimized_conflicts"] < self.baseline_bank_conflicts: | ||
| metric_after = metrics["optimized_conflicts"] |
**Copilot AI** · Nov 3, 2025:
Similar issue to memory_access.py - this compares against baseline_bank_conflicts instead of the unoptimized value from the best step. If there were multiple iterations, the best step's unoptimized_conflicts may differ from the initial baseline, making this comparison incorrect for determining whether the best step actually improved.
```python
time_after_ms = None
if metrics and "optimized_lat" in metrics:
    # Only use the measured value if it actually improved
    if metrics["optimized_lat"] < self.baseline_atomic_latency:
```
**Copilot AI** · Nov 3, 2025:
Same issue as memory_access.py and bank_conflict.py - comparing against baseline instead of the best step's unoptimized value. Should compare metrics['optimized_lat'] < metrics.get('unoptimized_lat', self.baseline_atomic_latency) to properly validate improvement in the best step.
| if metrics["optimized_lat"] < self.baseline_atomic_latency: | |
| if metrics["optimized_lat"] < metrics.get("unoptimized_lat", self.baseline_atomic_latency): |
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
- Add OptimizationTracker helper methods: is_successful(), get_best_code(), get_best_report(), get_best_metrics()
- Add auto-determination of success based on improvement and speedup
- Add _filter_compiler_errors() static method for cleaner compiler output (a behavior sketch follows this list)
- Add process crash detection in Accordo IPC with process_pid
- Add enhanced timeout error messages with baseline context
- Maintain all Accordo API improvements from feature branch
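A hedged sketch of the filtering behavior that bullet describes. The actual _filter_compiler_errors() body is not shown in this PR excerpt, so the pattern list and fallback policy below are assumptions, written as a plain function for self-containment:

```python
import re

def filter_compiler_errors(output: str) -> str:
    """Keep only diagnostic lines from raw compiler output (illustrative sketch)."""
    keep = re.compile(r"\b(error|warning|undefined reference)\b", re.IGNORECASE)
    lines = [line for line in output.splitlines() if keep.search(line)]
    return "\n".join(lines) if lines else output  # nothing matched: keep everything
```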
**New Accordo API (clean abstraction):**
- New Accordo module structure with public API (__init__, config, result, snapshot, exceptions, validator)
- Internal implementation (_internal/codegen, hip_interop, ipc/communication)
- Snapshot-based caching: capture reference once, compare multiple optimized versions (usage sketched after this list)
- AccordoValidator with proper timeout handling and error detection
- KernelArg dataclass for cleaner argument specification
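A hedged usage sketch of the snapshot-caching flow these bullets describe. AccordoValidator and KernelArg are named in the commit message, but every field, constructor, and method name below is an assumption made for illustration:

```python
from dataclasses import dataclass

# Hypothetical field set -- the real KernelArg dataclass may differ.
@dataclass
class KernelArg:
    name: str
    dtype: str
    size: int

def validate_candidates(validator, kernel_args, optimized_sources):
    """Capture the reference once, then reuse it for every candidate."""
    # One reference run (the expensive part)...
    snapshot = validator.capture_reference(kernel_args)  # assumed method name
    # ...amortized over all optimized versions.
    return [validator.compare(snapshot, src) for src in optimized_sources]
```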
**Formula changes (preserves all main improvements):**
- Updated correctness_validation_pass() to use new Accordo API with caching
- Added write_and_log_optimized_code() to centralize iteration logging
- All formulas use write_and_log_optimized_code() for immediate LLM output capture
- find_kernel_file() already in main - formulas use it correctly
**Infrastructure:**
- Logger.save_iteration_code() for centralized iteration file management
- __main__.py passes trace_path to logger
- All examples/*.hip updated with hip_try macro for error handling
- .gitignore: added .rocprofv3/
**Preserved from main ("Unify counters report" #155):**
- OptimizationTracker helper methods (is_successful, get_best_code, etc.)
- Auto-success determination logic
- Proper tie-breaker with speedup
- _filter_compiler_errors() method
- Correct failure metrics (speedup 0.0 for build, 1.0 for correctness)
- Correct performance_validation flow (add_step first, query values, build report; see the ordering sketch below)
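The ordering in the last bullet, as a tiny sketch (helper names taken from the commit text, call signatures assumed):

```python
tracker.add_step(before_val, after_val, metrics)  # 1. record the step first
best_metrics = tracker.get_best_metrics()         # 2. then query the best values
report = tracker.get_best_report()                # 3. then build the report from them
success = tracker.is_successful()
```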
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>