First, create the q_mamba environment.
conda create --name q_mamba python=3.10
conda activate q_mamba
Then install PyTorch 1.12+ with CUDA 11.6+ (see https://pytorch.org/ for more details). The CUDA toolkit is also required:
conda install nvidia::cuda-toolkit=12.1
Next, install mamba-ssm (see https://github.com/state-spaces/mamba.git for more details):
pip install mamba-ssm
Finally, install the other necessary libraries.
pip install -r requirements.txt
Other requirements:
- Linux
- NVIDIA GPU
- PyTorch 1.12+
- CUDA 11.6+
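The requirements above can be checked before training with a short script. The following is a minimal sketch, not part of the repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and only the major and minor version numbers are compared.

```python
import importlib.util
import platform
import re


def version_tuple(version: str) -> tuple:
    """Turn a string such as '2.1.0+cu121' into (2, 1)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare major.minor of `version` against `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment() -> dict:
    """Collect pass/fail flags for the listed requirements."""
    report = {"linux": platform.system() == "Linux"}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so the remaining checks cannot run.
        report["torch"] = False
        return report
    import torch

    report["torch"] = meets_minimum(torch.__version__, "1.12")
    report["gpu"] = torch.cuda.is_available()
    # torch.version.cuda is None on CPU-only builds of PyTorch.
    cuda = torch.version.cuda
    report["cuda"] = cuda is not None and meets_minimum(cuda, "11.6")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated q_mamba environment. Any line reported as MISSING points to the requirement that still needs attention.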
To start training quickly, first download the training trajectories from here. The directory can be organized with this basic structure:
├── /trajectory_files/
│ ├── trajectory_set_0_Rand.pkl
│ ├── trajectory_set_0_CfgX.pkl
│ ├── trajectory_set_0_Unit.pkl
│ ├── trajectory_set_1_Unit.pkl
│ ├── trajectory_set_2_Unit.pkl
Then we can train the main Q-Mamba agent using:
# train q_mamba with conservative_reg_loss
python run.py --train --trajectory_file_path './trajectory_files/trajectory_set_0_Unit.pkl' --has_conservative_reg_loss
# train q_mamba without conservative_reg_loss
python run.py --train --trajectory_file_path './trajectory_files/trajectory_set_0_Unit.pkl'
Taking training on Alg0 as an example, the models are saved at ./model/trajectory_set_0_Unit/YYMMDDTHHmmSS/ (where YYMMDDTHHmmSS is the timestamp of the run), and the TensorBoard log is stored at ./log/trajectory_set_0_Unit/YYMMDDTHHmmSS.
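Because each run gets its own timestamped directory, it can help to locate the most recent one automatically. The sketch below assumes the timestamp directory names are fixed-width and zero-padded, as the YYMMDDTHHmmSS pattern suggests, so that sorting by name matches chronological order. The helper `latest_run_dir` is hypothetical, and the names of the checkpoint files inside a run directory are not specified here.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(root: str, trajectory_name: str) -> Optional[Path]:
    """Return the newest timestamped run directory, or None if there is none."""
    base = Path(root) / trajectory_name
    if not base.is_dir():
        return None
    # Fixed-width timestamps sort chronologically when sorted by name.
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    return runs[-1] if runs else None


if __name__ == "__main__":
    print(latest_run_dir("./model", "trajectory_set_0_Unit"))
    print(latest_run_dir("./log", "trajectory_set_0_Unit"))
```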
To test the trained model on the BBOB testing problems (Section 5.2, Table 1):
# test q_mamba
python run.py --test --algorithm_id 0 --load_path [MODEL_PATH]
For example, ./model/qmamba.pth is a pre-trained Q-Mamba model, trained on Alg0 with the default settings in the paper. Its testing command is:
# test q_mamba
python run.py --test --algorithm_id 0 --load_path ./model/qmamba.pth
The reward results are stored in ./log/test/q_mamba/YYMMDDTHHmmSS/test_rewards.pkl, where YYMMDDTHHmmSS is the timestamp of the test.
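The stored rewards can be inspected with Python's standard `pickle` module. This is a minimal sketch: the internal layout of test_rewards.pkl is not documented here, so the hypothetical helper `summarize` only reports the top-level type. Unpickling may also require the libraries used when the file was written (for example NumPy or PyTorch) to be importable, and pickle files should only be loaded from trusted sources.

```python
import pickle
from pathlib import Path


def load_rewards(path):
    """Load a pickled results file such as test_rewards.pkl."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def summarize(obj) -> str:
    """Describe the top-level structure without assuming its contents."""
    if isinstance(obj, dict):
        return f"dict with keys {list(obj.keys())}"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} of length {len(obj)}"
    return type(obj).__name__


if __name__ == "__main__":
    # Replace YYMMDDTHHmmSS with the timestamp of an actual test run.
    path = Path("./log/test/q_mamba/YYMMDDTHHmmSS/test_rewards.pkl")
    if path.exists():
        print(summarize(load_rewards(path)))
    else:
        print(f"{path} not found")
```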
@inproceedings{
qmamba,
title={Meta-Black-Box-Optimization through Offline Q-function Learning},
author={Zeyuan Ma and Zhiguang Cao and Zhou Jiang and Hongshu Guo and Yue-Jiao Gong},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
}