60 changes: 26 additions & 34 deletions .github/workflows/automated-experiment-result-checker.yml
@@ -23,48 +23,40 @@ jobs:
           ref: ${{ github.event.pull_request.head.sha }}
           fetch-depth: 0
 
-      - name: Check for updated experiment result graphs and tables
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libmagickwand-dev imagemagick
+
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1.267.0
+        with:
+          ruby-version: "3.4"
+          bundler-cache: true
+
+      - name: Build semian native extension
         run: |
           set -e
           cd "$(git rev-parse --show-toplevel)"
 
-          # TODO: Include lower bound windup experiment once we have a way to make it run in a reasonable time.
-          # Find all PNGs and CSV files, excluding those with "windup" in their filename
-          mapfile -t all_pngs < <(find experiments/results/main_graphs -type f -name '*.png' ! -name '*windup*.png' | sort)
-          mapfile -t all_csvs < <(find experiments/results/csv -type f -name '*.csv' ! -name '*windup*.csv' 2>/dev/null | sort)
+          echo "πŸ”§ Installing semian dependencies and building native extension..."
+          bundle install
+          bundle exec rake compile
 
-          # Combine all result files
-          all_files=("${all_pngs[@]}" "${all_csvs[@]}")
-
-          # Find all changed PNGs and CSVs in the latest commit
-          mapfile -t changed_files < <(git diff --name-only --diff-filter=AM HEAD~1..HEAD | grep -E '^experiments/results/main_graphs/.*\.png$|^experiments/results/csv/.*\.csv$' | grep -v windup | sort)
+      - name: Run experiments and check for performance regressions
+        run: |
+          set -e
+          cd "$(git rev-parse --show-toplevel)/experiments"
 
-          # Report any files that are not updated in the latest commit
-          declare -a not_updated=()
-          for file in "${all_files[@]}"; do
-            if ! printf "%s\n" "${changed_files[@]}" | grep -qx "$file"; then
-              not_updated+=("$file")
-            fi
-          done
+          echo "πŸ“Š Running all experiments..."
+          bundle install
+          bundle exec ruby run_all_experiments.rb
 
-          if [ ${#not_updated[@]} -gt 0 ]; then
-            echo "❌ The following result files have NOT been updated in the latest commit:"
-            for f in "${not_updated[@]}"; do
-              echo " - $f"
-            done
-            echo ""
-            echo "Every commit must update all non-windup experiment result graphs and CSV files. You may be missing updates."
-            echo "Run:"
-            echo ""
-            echo " cd experiments"
-            echo " bundle install"
-            echo " bundle exec ruby run_all_experiments.rb"
-            echo ""
-            echo "Commit the updated graphs and CSV files to resolve this check."
-            exit 1
-          fi
+          echo ""
+          echo "πŸ” Checking for performance regressions..."
+          ruby detect_regressions.rb
 
-          echo "βœ… All non-windup experiment result graphs and CSV files are up to date for this commit!"
+          echo "βœ… No performance regressions detected!"



145 changes: 145 additions & 0 deletions experiments/README_BASELINE_COLLECTION.md
@@ -0,0 +1,145 @@
# Automated Baseline Collection Script

## Overview

A detailed guide to the `collect_baseline_data.rb` script, which automates gathering baseline data for regression detection by running all experiments multiple times and organizing the results.

πŸ“‹ **For the complete regression detection system**: See [`REGRESSION_DETECTION.md`](REGRESSION_DETECTION.md)

## Usage

### Simple Usage
```bash
# Run all experiments 5 times (minimum recommended)
ruby collect_baseline_data.rb 5

# Run all experiments 10 times (better for stable baselines)
ruby collect_baseline_data.rb 10

# Run all experiments 15 times (most stable baselines)
ruby collect_baseline_data.rb 15
```


## What It Does

1. **Runs all experiments at once** using `run_all_experiments.rb` (already optimized with threading)
2. **Copies all generated CSV files** to their respective baseline directories
3. **Names files systematically**: `run_001_time_analysis.csv`, `run_002_time_analysis.csv`, etc.
4. **Repeats the process N times** (where N is your parameter)
5. **Leaves original CSV files unchanged** in the `results/csv/` directory (a rough sketch of this copy step follows)
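
In code, the copy-and-rename step can be pictured roughly as below. This is a minimal sketch, not the actual contents of `collect_baseline_data.rb`: the method name and keyword arguments are illustrative, and only the file layout (`results/csv/*_time_analysis.csv` in, `results/baseline/<experiment>/run_NNN_time_analysis.csv` out) comes from this document.

```ruby
require "fileutils"

# Sketch: copy one run's CSVs into per-experiment baseline folders.
def copy_run_to_baselines(run_number, csv_dir: "results/csv", baseline_dir: "results/baseline")
  copied = 0
  Dir.glob(File.join(csv_dir, "*_time_analysis.csv")).sort.each do |csv|
    # "gradual_increase_adaptive_time_analysis.csv" -> "gradual_increase_adaptive"
    experiment = File.basename(csv).sub(/_time_analysis\.csv\z/, "")
    target_dir = File.join(baseline_dir, experiment)
    FileUtils.mkdir_p(target_dir)
    # Systematic naming: run_001_time_analysis.csv, run_002_time_analysis.csv, ...
    target = File.join(target_dir, format("run_%03d_time_analysis.csv", run_number))
    FileUtils.cp(csv, target) # cp, not mv, so the original in results/csv/ stays untouched
    copied += 1
  end
  copied
end
```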

## Example Output

```bash
$ ruby collect_baseline_data.rb 3

πŸš€ Starting baseline data collection...
🎯 Using run_all_experiments.rb (much simpler and faster!)

🎯 Collecting baseline data using run_all_experiments.rb
πŸ“Š Running all experiments 3 times
⏰ Estimated time: ~45 minutes (assuming ~15 min per run)
============================================================

πŸ”„ Run 1/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 1: 18 experiments copied to baselines

πŸ”„ Run 2/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 2: 18 experiments copied to baselines

πŸ”„ Run 3/3
----------------------------------------
πŸ§ͺ Running all experiments...
βœ… All experiments completed successfully
πŸ“‹ Copied 18/18 CSV files to baseline directories
βœ… Run 3: 18 experiments copied to baselines

============================================================
πŸ“ˆ BASELINE COLLECTION SUMMARY
βœ… Successful runs: 3/3
πŸ“Š Total experiment files copied: 54

πŸŽ‰ Baseline data collection completed!
πŸ“‹ Next steps:
1. Run: ruby compute_baselines.rb
2. Run: ruby detect_regressions.rb
```

## File Organization

After running, your structure will look like:

```
experiments/
└── results/
β”œβ”€β”€ csv/ # Original files (unchanged)
β”‚ β”œβ”€β”€ gradual_increase_adaptive_time_analysis.csv
β”‚ β”œβ”€β”€ gradual_increase_time_analysis.csv
β”‚ β”œβ”€β”€ sudden_error_spike_100_adaptive_time_analysis.csv
β”‚ └── [other experiments]_time_analysis.csv
└── baseline/ # Organized baseline data
β”œβ”€β”€ gradual_increase_adaptive/ # Folder name matches CSV name
β”‚ β”œβ”€β”€ run_001_time_analysis.csv # Run 1
β”‚ β”œβ”€β”€ run_002_time_analysis.csv # Run 2
β”‚ β”œβ”€β”€ run_003_time_analysis.csv # Run 3
β”‚ └── computed_baseline.txt # (after compute_baselines.rb)
β”œβ”€β”€ gradual_increase/ # Non-adaptive version
β”‚ β”œβ”€β”€ run_001_time_analysis.csv
β”‚ β”œβ”€β”€ run_002_time_analysis.csv
β”‚ └── run_003_time_analysis.csv
β”œβ”€β”€ sudden_error_spike_100_adaptive/
β”‚ β”œβ”€β”€ run_001_time_analysis.csv
β”‚ β”œβ”€β”€ run_002_time_analysis.csv
β”‚ └── run_003_time_analysis.csv
└── [other experiments]/
```

## Next Steps

After collecting baseline data:

1. **Compute baselines**: `ruby compute_baselines.rb`
2. **See full system documentation**: [`REGRESSION_DETECTION.md`](REGRESSION_DETECTION.md)

**πŸ’‘ Much faster and simpler than running each experiment by hand:** the script uses the existing `run_all_experiments.rb`, which already handles parallel execution efficiently.
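
As an illustration of what "computing baselines" can mean, the sketch below aggregates one metric across the collected `run_*_time_analysis.csv` files with a simple nearest-rank percentile. The column name `duration_ms`, the p50/p95 choice, and the method names are assumptions made for this example; see `REGRESSION_DETECTION.md` and `compute_baselines.rb` for what the real pipeline does.

```ruby
require "csv"

# Nearest-rank percentile over an already-sorted array (illustrative helper).
def percentile(sorted_values, pct)
  return nil if sorted_values.empty?
  sorted_values[(pct / 100.0 * (sorted_values.size - 1)).round]
end

# Aggregate one assumed metric column across all baseline runs of one experiment.
def compute_baseline(experiment_dir, column: "duration_ms")
  values = Dir.glob(File.join(experiment_dir, "run_*_time_analysis.csv")).flat_map do |path|
    CSV.read(path, headers: true).map { |row| row[column].to_f }
  end.sort
  { p50: percentile(values, 50), p95: percentile(values, 95) }
end

# e.g. compute_baseline("results/baseline/gradual_increase_adaptive")
```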

## Error Handling

The script will (see the sketch after this list):
- βœ… Continue if individual experiment runs fail
- βœ… Report partial success (e.g., 4/5 runs successful)
- βœ… Create baseline directories automatically
- βœ… Skip copying CSV if experiment failed
- ❌ Exit with error code if no experiments succeeded
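
For intuition, the failure-tolerant loop can be sketched as follows. The structure and method names are illustrative; only the behaviours in the list above (continue on failure, report partial success, non-zero exit only when nothing succeeded) come from this document.

```ruby
# Sketch of the failure-tolerant collection loop (names are illustrative).
def collect_baselines(total_runs)
  successes = 0
  total_runs.times do |i|
    run_number = i + 1
    # Run all experiments; `system` returns false/nil when the command fails.
    if system("bundle", "exec", "ruby", "run_all_experiments.rb")
      copy_run_to_baselines(run_number) # copy step from the earlier sketch
      successes += 1
    else
      warn "⚠️ Run #{run_number} failed, continuing with the next run"
    end
  end
  puts "βœ… Successful runs: #{successes}/#{total_runs}"
  exit(1) if successes.zero? # error exit only when no run succeeded
end
```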

## Tips

1. **Run during off-hours** - Experiments are resource-intensive
2. **Start small** - Try 5 runs first, then increase if needed
3. **Monitor progress** - Each experiment takes ~15 minutes
4. **Stable baselines** - More runs = more stable percentile calculations
5. **One-time setup** - You only need to run this when initially setting up or updating baselines

## Troubleshooting

**"No experiment files found"**
- Make sure you're running from the `experiments/` directory
- Verify experiment files exist: `ls experiment_*_adaptive.rb`

**Experiments failing**
- Check that individual experiment files can run: `ruby experiment_gradual_increase_adaptive.rb`
- Ensure dependencies are installed
- Check system resources (CPU/memory)

**CSV files not found**
- Verify experiments are generating `*_time_analysis.csv` files in `results/csv/`
- Check experiment output for errors