60 changes: 26 additions & 34 deletions .github/workflows/automated-experiment-result-checker.yml
@@ -23,48 +23,40 @@ jobs:
           ref: ${{ github.event.pull_request.head.sha }}
           fetch-depth: 0
 
-      - name: Check for updated experiment result graphs and tables
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libmagickwand-dev imagemagick
+
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1.267.0
+        with:
+          ruby-version: "3.4"
+          bundler-cache: true
+
+      - name: Build semian native extension
         run: |
           set -e
           cd "$(git rev-parse --show-toplevel)"
 
-          # TODO: Include lower bound windup experiment once we have a way to make it run in a reasonable time.
-          # Find all PNGs and CSV files, excluding those with "windup" in their filename
-          mapfile -t all_pngs < <(find experiments/results/main_graphs -type f -name '*.png' ! -name '*windup*.png' | sort)
-          mapfile -t all_csvs < <(find experiments/results/csv -type f -name '*.csv' ! -name '*windup*.csv' 2>/dev/null | sort)
+          echo "πŸ”§ Installing semian dependencies and building native extension..."
+          bundle install
+          bundle exec rake compile
 
-          # Combine all result files
-          all_files=("${all_pngs[@]}" "${all_csvs[@]}")
-
-          # Find all changed PNGs and CSVs in the latest commit
-          mapfile -t changed_files < <(git diff --name-only --diff-filter=AM HEAD~1..HEAD | grep -E '^experiments/results/main_graphs/.*\.png$|^experiments/results/csv/.*\.csv$' | grep -v windup | sort)
+      - name: Run experiments and check for performance regressions
+        run: |
+          set -e
+          cd "$(git rev-parse --show-toplevel)/experiments"
 
-          # Report any files that are not updated in the latest commit
-          declare -a not_updated=()
-          for file in "${all_files[@]}"; do
-            if ! printf "%s\n" "${changed_files[@]}" | grep -qx "$file"; then
-              not_updated+=("$file")
-            fi
-          done
+          echo "πŸ“Š Running all experiments..."
+          bundle install
+          bundle exec ruby run_all_experiments.rb
 
-          if [ ${#not_updated[@]} -gt 0 ]; then
-            echo "❌ The following result files have NOT been updated in the latest commit:"
-            for f in "${not_updated[@]}"; do
-              echo " - $f"
-            done
-            echo ""
-            echo "Every commit must update all non-windup experiment result graphs and CSV files. You may be missing updates."
-            echo "Run:"
-            echo ""
-            echo " cd experiments"
-            echo " bundle install"
-            echo " bundle exec ruby run_all_experiments.rb"
-            echo ""
-            echo "Commit the updated graphs and CSV files to resolve this check."
-            exit 1
-          fi
+          echo ""
+          echo "πŸ” Checking for performance regressions..."
+          ruby detect_regressions.rb
 
-          echo "βœ… All non-windup experiment result graphs and CSV files are up to date for this commit!"
+          echo "βœ… No performance regressions detected!"



145 changes: 145 additions & 0 deletions experiments/README_BASELINE_COLLECTION.md
@@ -0,0 +1,145 @@
# Automated Baseline Collection Script

## Overview

A detailed guide to the `collect_baseline_data.rb` script, which automates gathering baseline data for regression detection by running all experiments multiple times and organizing the results.

πŸ“‹ **For the complete regression detection system**: See [`REGRESSION_DETECTION.md`](REGRESSION_DETECTION.md)

## Usage

### Simple Usage
```bash
# Run all experiments 5 times (minimum recommended)
ruby collect_baseline_data.rb 5

# Run all experiments 10 times (better for stable baselines)
ruby collect_baseline_data.rb 10

# Run all experiments 15 times (most stable baselines)
ruby collect_baseline_data.rb 15
```


## What It Does

1. **Runs all experiments at once** using `run_all_experiments.rb` (already optimized with threading)
2. **Copies all generated CSV files** to their respective baseline directories
3. **Names files systematically**: `run_001_time_analysis.csv`, `run_002_time_analysis.csv`, etc.
4. **Repeats the process N times** (where N is your parameter)
5. **Leaves original CSV files unchanged** in the `results/csv/` directory (a rough sketch of this copy step follows)
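
In code, the copy-and-rename step can be pictured roughly as below. This is a minimal sketch, not the actual contents of `collect_baseline_data.rb`: the method name and keyword arguments are illustrative, and only the file layout (`results/csv/*_time_analysis.csv` in, `results/baseline/<experiment>/run_NNN_time_analysis.csv` out) comes from this document.

```ruby
require "fileutils"

# Sketch: copy one run's CSVs into per-experiment baseline folders.
def copy_run_to_baselines(run_number, csv_dir: "results/csv", baseline_dir: "results/baseline")
  copied = 0
  Dir.glob(File.join(csv_dir, "*_time_analysis.csv")).sort.each do |csv|
    # "gradual_increase_adaptive_time_analysis.csv" -> "gradual_increase_adaptive"
    experiment = File.basename(csv).sub(/_time_analysis\.csv\z/, "")
    target_dir = File.join(baseline_dir, experiment)
    FileUtils.mkdir_p(target_dir)
    # Systematic naming: run_001_time_analysis.csv, run_002_time_analysis.csv, ...
    target = File.join(target_dir, format("run_%03d_time_analysis.csv", run_number))
    FileUtils.cp(csv, target) # cp, not mv, so the original in results/csv/ stays untouched
    copied += 1
  end
  copied
end
```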

## Example Output

```bash
$ ruby collect_baseline_data.rb 3

πŸš€ Starting baseline data collection...
🎯 Using run_all_experiments.rb (much simpler and faster!)

🎯 Collecting baseline data using run_all_experiments.rb
πŸ“Š Running all experiments 3 times
⏰ Estimated time: ~45 minutes (assuming ~15 min per run)
============================================================

πŸ”„ Run 1/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 1: 18 experiments copied to baselines

πŸ”„ Run 2/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 2: 18 experiments copied to baselines

πŸ”„ Run 3/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 3: 18 experiments copied to baselines

============================================================
πŸ“ˆ BASELINE COLLECTION SUMMARY
βœ… Successful runs: 3/3
πŸ“Š Total experiment files copied: 54

πŸŽ‰ Baseline data collection completed!
πŸ“‹ Next steps:
1. Run: ruby compute_baselines.rb
2. Run: ruby detect_regressions.rb
```

## File Organization

After running, your structure will look like:

```
experiments/
└── results/
β”œβ”€β”€ csv/ # Original files (unchanged)
β”‚ β”œβ”€β”€ gradual_increase_adaptive_time_analysis.csv
β”‚ β”œβ”€β”€ gradual_increase_time_analysis.csv
β”‚ β”œβ”€β”€ sudden_error_spike_100_adaptive_time_analysis.csv
β”‚ └── [other experiments]_time_analysis.csv
└── baseline/ # Organized baseline data
β”œβ”€β”€ gradual_increase_adaptive/ # Folder name matches CSV name
β”‚ β”œβ”€β”€ run_001_time_analysis.csv # Run 1
β”‚ β”œβ”€β”€ run_002_time_analysis.csv # Run 2
β”‚ β”œβ”€β”€ run_003_time_analysis.csv # Run 3
β”‚ └── computed_baseline.txt # (after compute_baselines.rb)
β”œβ”€β”€ gradual_increase/ # Non-adaptive version
β”‚ β”œβ”€β”€ run_001_time_analysis.csv
β”‚ β”œβ”€β”€ run_002_time_analysis.csv
β”‚ └── run_003_time_analysis.csv
β”œβ”€β”€ sudden_error_spike_100_adaptive/
β”‚ β”œβ”€β”€ run_001_time_analysis.csv
β”‚ β”œβ”€β”€ run_002_time_analysis.csv
β”‚ └── run_003_time_analysis.csv
└── [other experiments]/
```

## Next Steps

After collecting baseline data:

1. **Compute baselines**: `ruby compute_baselines.rb`
2. **See full system documentation**: [`REGRESSION_DETECTION.md`](REGRESSION_DETECTION.md)

**πŸ’‘ Much faster and simpler than running each experiment by hand:** the script uses the existing `run_all_experiments.rb`, which already handles parallel execution efficiently.
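
As an illustration of what "computing baselines" can mean, the sketch below aggregates one metric across the collected `run_*_time_analysis.csv` files with a simple nearest-rank percentile. The column name `duration_ms`, the p50/p95 choice, and the method names are assumptions made for this example; see `REGRESSION_DETECTION.md` and `compute_baselines.rb` for what the real pipeline does.

```ruby
require "csv"

# Nearest-rank percentile over an already-sorted array (illustrative helper).
def percentile(sorted_values, pct)
  return nil if sorted_values.empty?
  sorted_values[(pct / 100.0 * (sorted_values.size - 1)).round]
end

# Aggregate one assumed metric column across all baseline runs of one experiment.
def compute_baseline(experiment_dir, column: "duration_ms")
  values = Dir.glob(File.join(experiment_dir, "run_*_time_analysis.csv")).flat_map do |path|
    CSV.read(path, headers: true).map { |row| row[column].to_f }
  end.sort
  { p50: percentile(values, 50), p95: percentile(values, 95) }
end

# e.g. compute_baseline("results/baseline/gradual_increase_adaptive")
```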

## Error Handling

The script will (see the sketch after this list):
- βœ… Continue if individual experiment runs fail
- βœ… Report partial success (e.g., 4/5 runs successful)
- βœ… Create baseline directories automatically
- βœ… Skip copying CSV if experiment failed
- ❌ Exit with error code if no experiments succeeded
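
For intuition, the failure-tolerant loop can be sketched as follows. The structure and method names are illustrative; only the behaviours in the list above (continue on failure, report partial success, non-zero exit only when nothing succeeded) come from this document.

```ruby
# Sketch of the failure-tolerant collection loop (names are illustrative).
def collect_baselines(total_runs)
  successes = 0
  total_runs.times do |i|
    run_number = i + 1
    # Run all experiments; `system` returns false/nil when the command fails.
    if system("bundle", "exec", "ruby", "run_all_experiments.rb")
      copy_run_to_baselines(run_number) # copy step from the earlier sketch
      successes += 1
    else
      warn "⚠️ Run #{run_number} failed, continuing with the next run"
    end
  end
  puts "βœ… Successful runs: #{successes}/#{total_runs}"
  exit(1) if successes.zero? # error exit only when no run succeeded
end
```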

## Tips

1. **Run during off-hours** - Experiments are resource-intensive
2. **Start small** - Try 5 runs first, then increase if needed
3. **Monitor progress** - Each experiment takes ~15 minutes
4. **Stable baselines** - More runs = more stable percentile calculations
5. **One-time setup** - You only need to run this when initially setting up or updating baselines

## Troubleshooting

**"No experiment files found"**
- Make sure you're running from the `experiments/` directory
- Verify experiment files exist: `ls experiment_*_adaptive.rb`

**Experiments failing**
- Check that individual experiment files can run: `ruby experiment_gradual_increase_adaptive.rb`
- Ensure dependencies are installed
- Check system resources (CPU/memory)

**CSV files not found**
- Verify experiments are generating `*_time_analysis.csv` files in `results/csv/`
- Check experiment output for errors