I would like to confirm two points related to NPUEval reproducibility:
- Cycle count hardware independence
The paper reports results on a system with a Ryzen 9 7940HS. I’m testing on a Ryzen 7 8845HS.
My question: should the cycle counts (AIE clock cycles) be identical/reproducible across different devices?
- Average vectorization score calculation
In Figure 5 of the paper, the “average vectorization score” per model is shown. Could you clarify how this is calculated?
Two plausible methods:
1. Average over all 102 kernels (counting failed kernels as a vectorization score of 0)
2. Average only over kernels which succeeded (i.e., exclude failed/null scores)
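
To make the difference concrete, here is a minimal sketch of the two methods as I understand them. The scores are made-up illustration values, and the function names are my own, not taken from the NPUEval code base; `None` stands for a kernel that failed.

```python
from statistics import mean

# Hypothetical per-kernel vectorization scores for one model (not real data).
# None marks a kernel that failed, so it has no score.
scores = [0.5, None, 0.25, None, 0.75]

def average_all(scores):
    """Method 1: average over every kernel, counting failures as 0."""
    return mean(0.0 if s is None else s for s in scores)

def average_successful(scores):
    """Method 2: average only over kernels that produced a score."""
    passed = [s for s in scores if s is not None]
    # With no successful kernels there is nothing to average.
    return mean(passed) if passed else None

print(average_all(scores))         # 0.3
print(average_successful(scores))  # 0.5
```

With the same raw scores, the two methods give noticeably different averages, so knowing which one Figure 5 uses matters for comparing my numbers against the paper.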