Update main to 2025.0.2 #33
Conversation
Pull Request Overview
This pull request introduces version 2025.0.2 of the ESAT project with significant enhancements focused on GPU acceleration, improved testing infrastructure, and enhanced data handling capabilities. The most substantial changes include adding GPU support via the candle-core crate, implementing comprehensive test coverage for data imputation, and extending the data visualization and analysis capabilities.
- GPU acceleration support for the Rust backend with CUDA/CANDLE integration
- Comprehensive data imputation module with scikit-learn integration and test coverage
- Enhanced data visualization capabilities including ridgeline plots, correlation heatmaps, and 2D histograms
- Improved error handling and progress callback mechanisms
Reviewed Changes
Copilot reviewed 32 out of 54 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/data/test_imputer.py | New comprehensive test suite for data imputation functionality |
| rust/lib.rs | Major refactor adding GPU support, new matrix operations, and modernized API |
| pyproject.toml | Version bump and GPU feature flag addition |
| esat/data/impute.py | New data imputation module with multiple strategies |
| esat/data/datahandler.py | Enhanced with visualization capabilities and location-based data handling |
| esat/model/sa.py | Updated to support GPU acceleration and improved callback handling |
| esat/model/batch_sa.py | Enhanced multiprocessing with logging and progress reporting |
| esat/utils.py | Improved timestep calculation with better edge case handling |
| eval/*.py | Updated runtime evaluation scripts with GPU benchmarking |
| notebooks/*.ipynb | New visualization examples and updated simulation parameters |
```rust
let s_inv_data: Vec<f64> = s_data
    .iter()
    .map(|&x: &f64| if x > 1e-12 { 1.0 / x } else { 0.0 })
    .collect();
```
Copilot AI (Aug 20, 2025):
Converting the tensor to F64 and then to a vector may be inefficient. Consider using the tensor's native dtype if possible, or perform the conversion only when necessary.
Suggested change:

```rust
let s_data = s.to_vec1::<f32>()?; // Extract as f32 vector
let s_data: Vec<f64> = s_data.iter().map(|&x| x as f64).collect(); // Convert to f64 if needed
let s_inv_data: Vec<f64> = s_data
    .iter()
    .map(|&x: &f64| if x > 1e-12 { 1.0 / x } else { 0.0 })
    .collect();
```
```rust
    &device
).map_err(|e| PyErr::new::<pyo3::exceptions::PyRuntimeError, _>(format!("{}", e)))?;

// Call GPU update function (to be implemented)
```
Copilot AI (Aug 20, 2025):
The tensor creation pattern is repeated multiple times with similar parameters. Consider extracting this into a helper function to reduce code duplication and improve maintainability.
Suggested change:

```rust
// Convert matrices to correct type for device
let result = if use_gpu {
    // Convert to Candle Tensor using the selected device
    let v_t = create_tensor_from_array(&v_arr, &device, false)?;
    let u_t = create_tensor_from_array(&u_arr, &device, false)?;
    let we_t = create_tensor_from_array(&we_arr, &device, false)?;
    let w_t = create_tensor_from_array(&w_arr, &device, true)?;
    let h_t = create_tensor_from_array(&h_arr, &device, true)?;
    // Call GPU update function (to be implemented)
```
```rust
// Convert result to Python object
match result {
    Ok((final_w, final_h, q, converged, converge_i, q_list_full)) => {
        let result_w = final_w.t().to_pyarray(py).reshape(w.dims())?;
```
Copilot AI (Aug 20, 2025):
The variable w is used here but it's not in scope. This should likely be referencing the original w parameter from the function signature. This will cause a compilation error.
```python
self.missing_value = missing_value

if sklearn is None:
    raise ImportError("scikit-learn is required for data imputation. Import esat[data] to install it.")
```
Copilot AI (Aug 20, 2025):
The error message suggests importing 'esat[data]' but according to pyproject.toml, the optional dependency is defined as 'data = ["scikit-learn"]', so the correct installation command should be 'pip install esat[data]'.
Suggested change:

```python
raise ImportError("scikit-learn is required for data imputation. Run 'pip install esat[data]' to install it.")
```
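For context, this error is typically raised from an optional-dependency guard at import time. A minimal sketch of that pattern, with the corrected message; the `Imputer` class and its signature here are illustrative, not taken from the esat source:

```python
# Guard an optional dependency: fall back to None if scikit-learn is absent.
try:
    import sklearn
except ImportError:
    sklearn = None

class Imputer:
    """Hypothetical imputer illustrating the optional-dependency check."""
    def __init__(self, missing_value=-999):
        if sklearn is None:
            # Point users at the actual install command for the extra.
            raise ImportError(
                "scikit-learn is required for data imputation. "
                "Run 'pip install esat[data]' to install it."
            )
        self.missing_value = missing_value
```

Constructing `Imputer()` either succeeds (scikit-learn installed) or raises an `ImportError` whose message names the real installation command.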
```python
logger.warning(f"Unknown file type provided. Ext: {ext}, file: {filepath}")
#TODO: Add custom exception for unknown file types.
return None
return data
```
Copilot AI (Aug 20, 2025):
The TODO comment indicates incomplete error handling. Consider implementing a custom exception for unknown file types instead of returning None, which could lead to unexpected behavior downstream.
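A minimal sketch of what such a custom exception could look like; the exception name, the `load` helper, and the supported-extension list are hypothetical, not from the esat codebase:

```python
class UnknownFileTypeError(ValueError):
    """Raised when a file extension is not supported by the data handler."""
    def __init__(self, ext, filepath):
        self.ext = ext
        self.filepath = filepath
        super().__init__(f"Unknown file type '{ext}' for file: {filepath}")

def load(filepath):
    # Hypothetical loader: raise instead of silently returning None,
    # so callers fail fast with a clear message.
    ext = filepath.rsplit(".", 1)[-1].lower()
    if ext not in ("csv", "txt", "xlsx"):
        raise UnknownFileTypeError(ext, filepath)
    return ext
```

Raising a dedicated exception type lets callers catch exactly this failure mode instead of checking every return value for `None`.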
```python
    progress_callback=cb
)
return model_i, sa
```
Copilot AI (Aug 20, 2025):
The variable self.progress_callback is accessed outside of the class context. This function appears to be a standalone function, so self is not available. This should reference the progress_callback parameter instead.
```python
# q_targets[k] = v["pmf-Q"]
project_dir = os.path.join("D:\\", "git", "esat")

analysis_file_path = os.path.join(project_dir, "eval", "results", "runtime_analysis.json")
```
Copilot AI (Aug 20, 2025):
The hardcoded absolute path makes this script non-portable. Consider using relative paths or environment variables to make the script work across different development environments.
Suggested change:

```python
# Dynamically determine the project directory as the parent of this script's directory
project_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
analysis_file_path = os.path.join(project_dir, "eval", "results", "runtime_analysis.json")
```
This pull request introduces several important updates to the ESAT project, focusing on build system improvements, documentation updates, and dependency upgrades. The most significant changes include enhancements to the build workflows for CPU and GPU wheels, updates to dependencies in the Rust backend, and refreshed documentation reflecting the new release year and improved installation instructions.
Build system and workflow improvements:
- Updated `.github/workflows/python-package.yml` to separately build and upload CPU and GPU wheels for all supported OSes and Python versions, improving clarity and artifact management. [1] [2]
- Updated `.github/workflows/documentation.yml` to use the latest version of the `peaceiris/actions-gh-pages` action and adjusted the Sphinx build and output directories for consistency.

Dependency upgrades (Rust backend):
- Upgraded dependencies in `Cargo.toml`, including `numpy`, `nalgebra`, `ndarray`, `pyo3`, `indicatif`, and `console`, and added `candle-core` for GPU support.

Documentation updates:
- Added `sphinx-book-theme` as a documentation requirement in `doc-requirements.txt`.

Other improvements:
These changes collectively improve the maintainability, usability, and clarity of the ESAT project for both developers and users.