Skip to content

Conversation

@deronsmith
Copy link
Contributor

This pull request introduces several important updates to the ESAT project, focusing on build system improvements, documentation updates, and dependency upgrades. The most significant changes include enhancements to the build workflows for CPU and GPU wheels, updates to dependencies in the Rust backend, and refreshed documentation reflecting the new release year and improved installation instructions.

Build system and workflow improvements:

  • Refactored the GitHub Actions workflows in .github/workflows/python-package.yml to separately build and upload CPU and GPU wheels for all supported OSes and Python versions, improving clarity and artifact management. [1] [2]
  • Updated the documentation build and deployment process in .github/workflows/documentation.yml to use the latest version of the peaceiris/actions-gh-pages action and adjusted the Sphinx build and output directories for consistency.

Dependency upgrades (Rust backend):

  • Upgraded Rust dependencies in Cargo.toml, including numpy, nalgebra, ndarray, pyo3, indicatif, and console, and added candle-core for GPU support.

Documentation updates:

  • Updated documentation files to reflect the new release year (2025), improved installation instructions, added a JOSS publication badge, and clarified how to obtain development wheels. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
  • Added sphinx-book-theme as a documentation requirement in doc-requirements.txt.

Other improvements:

  • Updated Sphinx build configuration and static assets to ensure documentation consistency and freshness. [1] [2] [3] [4]
  • Minor documentation corrections and clarifications, such as default parameter values and copyright year.

These changes collectively improve the maintainability, usability, and clarity of the ESAT project for both developers and users.

Copilot AI review requested due to automatic review settings August 20, 2025 13:57
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request introduces version 2025.0.2 of the ESAT project with significant enhancements focused on GPU acceleration, improved testing infrastructure, and enhanced data handling capabilities. The most substantial changes include adding GPU support via the candle-core crate, implementing comprehensive test coverage for data imputation, and extending the data visualization and analysis capabilities.

  • GPU acceleration support for the Rust backend with CUDA/CANDLE integration
  • Comprehensive data imputation module with scikit-learn integration and test coverage
  • Enhanced data visualization capabilities including ridgeline plots, correlation heatmaps, and 2D histograms
  • Improved error handling and progress callback mechanisms

Reviewed Changes

Copilot reviewed 32 out of 54 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/data/test_imputer.py New comprehensive test suite for data imputation functionality
rust/lib.rs Major refactor adding GPU support, new matrix operations, and modernized API
pyproject.toml Version bump and GPU feature flag addition
esat/data/impute.py New data imputation module with multiple strategies
esat/data/datahandler.py Enhanced with visualization capabilities and location-based data handling
esat/model/sa.py Updated to support GPU acceleration and improved callback handling
esat/model/batch_sa.py Enhanced multiprocessing with logging and progress reporting
esat/utils.py Improved timestep calculation with better edge case handling
eval/*.py Updated runtime evaluation scripts with GPU benchmarking
notebooks/*.ipynb New visualization examples and updated simulation parameters

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

let s_inv_data: Vec<f64> = s_data
.iter()
.map(|&x: &f64| if x > 1e-12 { 1.0 / x } else { 0.0 })
.collect();
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting the tensor to F64 and then to a vector may be inefficient. Consider using the tensor's native dtype if possible, or perform the conversion only when necessary.

Suggested change
.collect();
let s_data = s.to_vec1::<f32>()?; // Extract as f32 vector
let s_data: Vec<f64> = s_data.iter().map(|&x| x as f64).collect(); // Convert to f64 if needed
let s_inv_data: Vec<f64> = s_data
.iter()
.map(|&x: &f64| if x > 1e-12 { 1.0 / x } else { 0.0 })
.collect();

Copilot uses AI. Check for mistakes.
&device
).map_err(|e| PyErr::new::<pyo3::exceptions::PyRuntimeError, _>(format!("{}", e)))?;

// Call GPU update function (to be implemented)
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tensor creation pattern is repeated multiple times with similar parameters. Consider extracting this into a helper function to reduce code duplication and improve maintainability.

Suggested change
// Call GPU update function (to be implemented)
// Convert matrices to correct type for device
let result = if use_gpu {
// Convert to Candle Tensor using the selected device
let v_t = create_tensor_from_array(&v_arr, &device, false)?;
let u_t = create_tensor_from_array(&u_arr, &device, false)?;
let we_t = create_tensor_from_array(&we_arr, &device, false)?;
let w_t = create_tensor_from_array(&w_arr, &device, true)?;
let h_t = create_tensor_from_array(&h_arr, &device, true)?;
// Call GPU update function (to be implemented)

Copilot uses AI. Check for mistakes.
// Convert result to Python object
match result {
Ok((final_w, final_h, q, converged, converge_i, q_list_full)) => {
let result_w = final_w.t().to_pyarray(py).reshape(w.dims())?;
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable w is used here but it's not in scope. This should likely be referencing the original w parameter from the function signature. This will cause a compilation error.

Copilot uses AI. Check for mistakes.
self.missing_value = missing_value

if sklearn is None:
raise ImportError("scikit-learn is required for data imputation. Import esat[data] to install it.")
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message suggests importing 'esat[data]' but according to pyproject.toml, the optional dependency is defined as 'data = ["scikit-learn"]', so the correct installation command should be 'pip install esat[data]'.

Suggested change
raise ImportError("scikit-learn is required for data imputation. Import esat[data] to install it.")
raise ImportError("scikit-learn is required for data imputation. Run 'pip install esat[data]' to install it.")

Copilot uses AI. Check for mistakes.
logger.warning(f"Unknown file type provided. Ext: {ext}, file: {filepath}")
#TODO: Add custom exception for unknown file types.
return None
return data
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment indicates incomplete error handling. Consider implementing a custom exception for unknown file types instead of returning None, which could lead to unexpected behavior downstream.

Copilot uses AI. Check for mistakes.
progress_callback=cb
)
return model_i, sa

Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable self.progress_callback is accessed outside of the class context. This function appears to be a standalone function, so self is not available. This should reference the progress_callback parameter instead.

Copilot uses AI. Check for mistakes.
# q_targets[k] = v["pmf-Q"]
project_dir = os.path.join("D:\\", "git", "esat")

analysis_file_path = os.path.join(project_dir, "eval", "results", "runtime_analysis.json")
Copy link

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded absolute path makes this script non-portable. Consider using relative paths or environment variables to make the script work across different development environments.

Suggested change
analysis_file_path = os.path.join(project_dir, "eval", "results", "runtime_analysis.json")
# Dynamically determine the project directory as the parent of this script's directory
project_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
analysis_file_path = os.path.join(project_dir, "eval", "results", "runtime_analysis.json")

Copilot uses AI. Check for mistakes.
@deronsmith deronsmith merged commit 9146631 into main Aug 20, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants