@damonrand (Contributor)

Summary

When simulating periods with sparse missing data (e.g., 30 half-hour slots out of 17,520 in a full year), the entire summary would show NaN values for rates and costs. This made annual simulations unusable when even 0.17% of imbalance pricing data was missing.

Root cause

Both aggregation paths propagated NaN unconditionally (a minimal reproduction follows the list):

  1. output.py: safe_average() used np.average(), which returns NaN if any input value is NaN
  2. breakdown.py: cost totals used .sum(skipna=False).sum(skipna=False), which returns NaN if any cell is NaN
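A minimal sketch of the propagation, using hypothetical price values (not from the actual dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical half-hourly prices with one missing slot.
prices = pd.Series([10.0, 12.0, np.nan, 11.0])

# np.average() propagates NaN: one missing value poisons the result.
print(np.average(prices))  # nan

# The same happens when summing a cost table with skipna=False on
# both axes: any NaN cell makes the grand total NaN.
costs = pd.DataFrame({"import": [1.0, np.nan], "export": [2.0, 3.0]})
print(costs.sum(skipna=False).sum(skipna=False))  # nan
```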

Solution

Add a 5% NaN threshold to both aggregation functions (sketched below):

  • safe_average(): Filter out NaN values and their weights before calculating the weighted average. Only return NaN if >5% of the data is missing.
  • safe_sum(): New helper that uses np.nansum() to sum the valid values. Only return NaN if >5% of the data is missing.
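A sketch of the two helpers under these rules; the names and the 5% threshold come from this PR, but the exact signatures in output.py and breakdown.py may differ:

```python
import numpy as np

NAN_THRESHOLD = 0.05  # fraction of missing data above which results are flagged

def safe_average(values, weights):
    """Weighted average that tolerates up to 5% missing data."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    valid = ~np.isnan(values)
    if np.mean(~valid) > NAN_THRESHOLD:
        return np.nan  # too much data missing: flag the result as unreliable
    return np.average(values[valid], weights=weights[valid])

def safe_sum(values):
    """Sum that tolerates up to 5% missing data."""
    values = np.asarray(values, dtype=float)
    if np.isnan(values).mean() > NAN_THRESHOLD:
        return np.nan  # too much data missing: flag the result as unreliable
    return np.nansum(values)
```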

This allows simulations to complete with valid results when small amounts of data are missing, while still flagging unreliable results when too much data (>5%) is absent.
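Continuing the sketch, the scenario from the summary (30 missing half-hour slots out of 17,520, with hypothetical prices) now yields finite results:

```python
rng = np.random.default_rng(0)
prices = rng.uniform(20.0, 80.0, size=17_520)  # hypothetical imbalance prices
weights = np.ones_like(prices)
prices[rng.choice(prices.size, size=30, replace=False)] = np.nan

safe_average(prices, weights)  # finite: only 0.17% missing, under the 5% threshold
safe_sum(prices)               # finite for the same reason
```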

Files changed

  • output.py: Enhanced safe_average() with NaN threshold
  • breakdown.py: Added safe_sum() helper, replaced double sum calls
  • pyproject.toml: Bump version to 2.0.1

Test plan

  • Full year 2025 Findhorn simulation now completes with valid BESS gains
  • All 4 scenarios (Baseline, Bess500_500, Bess1000_1000, Bess500_1000) produce valid results
  • Lint passes

@damonrand merged commit b317512 into main on Jan 9, 2026 (2 checks passed).