Skip to content

Power estimation tool for understanding the energy costs incurred by visual processing tools that run on NVIDIA GPUs. The development is supported by HEU project, EMERALD

License

Notifications You must be signed in to change notification settings

RCSL-TCD/gpu_power_profiler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GPU Power Consumption Profiler for visual processing algorithms

🖥️ GPU Profiler API

This CLI interface can be invoked from bash to invoke the application whose energy is to be estimated. For developing this tool, we have used NVIDIA Nsight Compute (https://developer.nvidia.com/nsight-compute) to extract the performance metrics and instruction-level statistics of the task being profiled. The extracted data from the profiler is processed into a required format and subsequently fed as input features for a pre-trained estimator, which then estimates the power consumption based on these low-level GPU metrics.

Key requirements and notes about the tool are as below:
● Supported OS: Windows, Linux
● NVIDIA Nsight Compute must be pre-installed.
● You can download it from: https://developer.nvidia.com/cuda-toolkit
● Currently tested on NVIDIA commercial GPUs, but the model could be adapted to any modern GPU based systems.
● Python version 3.x
● The tool has been verified in Windows 11, Ubantu, Rocky Linix 8 and Rocky Linux 9

💾 Installation Instructions

The profiling tool is openly made available via the GitHub repository, from which users can download and install the tool using standard git clone and pip install commands. It is expected that the user has installed Intel VTune Profiler from the link given above.

Creating a virtual environment is highly recommended to prevent dependency conflicts between existing projects and tools, and to safely test the profiling tool.

 
```power shell
git clone git@github.com:RCSL-TCD/gpu_power_profiler.git
cd gpu_power_profiler
pip install . 

📦 File Structure

The resulting file structure should be as shown below.

gpu_power_profiler/
├── gpu_profiler/                   # Package containing logic and model files
│   ├── __init__.py
│   ├── cli.py    
│   ├── gpu_profiler.py             # CLI tool entry point
│   ├── gpu_profiler3.py
│   ├── feature_extract.py
│   ├── predict_power.py
│   ├── model_min.joblib
│   ├── model_avg.joblib
│   └── model_max.joblib 
├── setup.py                    # Package installer script
└── MANIFEST.in                 # Includes .joblib files in the package

🛠️ CLI Options

Option Description
-m Specifies which power value to estimate.
By default, it estimates the average power consumption of the profiled task.
You can specify:
-m min
-m avg
-m peak
-m all
to estimate the minimum, peak, and average power values.
-a Specifies the application/task’s executable that needs to be profiled.
Provide the full path to the application.
Example:
python gpu_profiler -a C:\Nuke\Nuke15.0v4\Nuke.exe
-c Points to a previously generated NVIDIA Nsight Compute .csv file for power estimation.
Full path to the extracted data should be provided.
When this is used, the -a switch is ignored.
Example:
python gpu_profiler -c /usr/local/Nsight_profiled.csv
-s Passes argument to the tool under investigation.
For e.g., to load Nuke script to be profiled,
-s can be used to point to the location of .nk .
Example:
gpu_profiler -m all -a "c:\Program Files\Nuke15.0v4\Nuke15.0.exe" -s "D:/nuke_graphs/blur_image.nk"

The tool allows Three modes for testing
(a) Fully Automated mode
(b)Semi-manual mode

(a) Fully Automated Mode

In the simplest form, both steps are fully automated and handled by the API. The application/task to be profiled is specified using the -a configuration, and the -m switch specifies the estimator(s) that would be invoked post the profiling phase. With Nuke, the below examples open the application in profiling mode (through Vtune), and the specific task graph is loaded manually; however, this can also be automated using command line switches offered by Nuke.

Examples:

gpu_profiler -m min -a "C:\Program Files\Nuke15.0v4\Nuke15.0.exe"   
gpu_profiler -m avg -a "C:\Program Files\Nuke15.0v4\Nuke15.0.exe"   
gpu_profiler -m peak -a "C:\Program Files\Nuke15.0v4\Nuke15.0.exe" 
gpu_profiler -m all  -a "C:\Program Files\Nuke15.0v4\Nuke15.0.exe" 
gpu_profiler -m all -a "C:\Program Files\Nuke15.0v4\Nuke15.0.exe"  -s  "D:/nuke_graphs/blur_image.nk"

Note that the above cases use Windows as the host OS.

(b) Semi-manual Mode

This option can be exercised if the profiling of the tool has already been completed using VTune and the performance counter values are available in a CSV format. In this case, the -a switch should not be used, and the -c switch can be used to point to the .csv file for energy estimation. If the .csv file is available in the local directory, then the -c switch can also be ignored, as shown in the first example.

Examples

gpu_profiler -m min  
gpu_profiler -m avg  
gpu_profiler -m peak
gpu_profiler -m all -c <path\to\.csv\file.csv>

⚠️ "No kernel is profiled" Error.

The script is either running on the CPU, or it is not GPU-compatible.

🔄 Fixing Nuke CPU Fallback: Restore GPU Processing and Profiling

If Nuke falls back to executing scripts on the CPU even when the GPU option is selected, the GPU kernel cannot be profiled. To resolve the problem, re-enable or force Nuke to use the GPU, or remove a corrupted Nuke preference.nk file. Delete the corrupted preferences file, then restart Nuke and enable script execution on the GPU

🔄 Working with Rocky Linux 8

Upgrade default python version

 
sudo dnf module reset python36
sudo dnf module enable python39 
 
sudo dnf install python39 python39-devel python39-pip gcc gcc-c++ make pkgconfig 

Upgrade pip

 
python -m pip install --upgrade pip 

🔄 Nuke 15 tweaks for Rocky Linux 8

While browsing folders through Nuke, Home directories are not visible when Nuke is launched as root.

 
cd /home/USER_NAME
export HOME=/home/USER_NAME
export USER=USER_NAME
export LOGNAME=USER_NAME

About

Power estimation tool for understanding the energy costs incurred by visual processing tools that run on NVIDIA GPUs. The development is supported by HEU project, EMERALD

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages