🔬 DeepVul: Multi-Task Transformer for Gene Essentiality and Drug Response

DeepVul is a multi-task transformer-based model designed to jointly predict gene essentiality and drug response using gene expression data. The model uses a shared feature extractor to learn robust biological representations that can be fine-tuned for downstream tasks, such as gene knockout effect prediction or treatment sensitivity profiling.


📑 Table of Contents

  • 🚀 Features
  • 📦 Installation
  • 📊 Datasets
  • ⚙️ Hyperparameters
  • 🏃 Running the Model
  • 🧠 Additional Information
  • 📄 Citation

🚀 Features

  • Joint prediction of gene essentiality and drug response
  • Shared transformer encoder for multi-task learning (see the sketch after this list)
  • Flexible modes: pre-training only, fine-tuning only, or both
  • Compatible with public omics and pharmacogenomic datasets
  • Fully configurable via command-line arguments
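
To make the multi-task design concrete, the sketch below shows one way a shared transformer encoder can feed two task-specific heads (essentiality and drug response). It is a minimal, hypothetical PyTorch illustration; the class, attribute, and dimension names are ours, not the repository's, so consult `src/` for the actual architecture.

```python
# Minimal sketch of a shared-encoder multi-task layout (illustrative only;
# class/attribute names and dimensions are assumptions, not DeepVul's code).
import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    def __init__(self, n_genes, n_drugs, hidden_state=500, nhead=2,
                 num_layers=2, dim_feedforward=2048, dropout=0.1):
        super().__init__()
        # Project an expression profile into the transformer's hidden size.
        self.input_proj = nn.Linear(n_genes, hidden_state)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_state, nhead=nhead,
            dim_feedforward=dim_feedforward, dropout=dropout,
            batch_first=True)
        # Shared feature extractor reused by both tasks.
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Task-specific heads: gene essentiality scores and drug responses.
        self.essentiality_head = nn.Linear(hidden_state, n_genes)
        self.drug_head = nn.Linear(hidden_state, n_drugs)

    def forward(self, expr):
        # expr: (batch, n_genes) gene expression matrix
        h = self.input_proj(expr).unsqueeze(1)   # (batch, 1, hidden_state)
        h = self.encoder(h).squeeze(1)           # shared representation
        return self.essentiality_head(h), self.drug_head(h)
```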

📦 Installation

Make sure you have conda installed. Then run:

```bash
conda env create --file condaenv.yml
conda activate condaenv
```

📊 Datasets

To run DeepVul, download the following datasets and place them in the data/ directory:

| Dataset | Description | Source |
|---|---|---|
| Gene Expression | Log-transformed TPM gene expression data | Download |
| Gene Essentiality | CRISPR-Cas9 knockout effect scores | Download |
| Drug Response | PRISM log-fold-change drug response | Download |
| Sanger Essentiality | CERES gene effect data from the Sanger screen | Download |
| Somatic Mutation | Mutation profiles for CCLE cell lines | Download |
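
All of these matrices are indexed by cell line and must be aligned before training. The snippet below is a minimal pandas sketch of that alignment; the file names are placeholders for whatever you save under data/ and do not reflect the repository's actual loading code.

```python
# Illustrative alignment of the downloaded matrices on their shared cell lines.
# File names are placeholders; use whatever you saved under data/.
import pandas as pd

expression = pd.read_csv("data/gene_expression.csv", index_col=0)
essentiality = pd.read_csv("data/gene_essentiality.csv", index_col=0)
drug_response = pd.read_csv("data/drug_response.csv", index_col=0)

# Keep only the cell lines present in all three matrices.
common = (expression.index
          .intersection(essentiality.index)
          .intersection(drug_response.index))
expression = expression.loc[common]
essentiality = essentiality.loc[common]
drug_response = drug_response.loc[common]
print(f"{len(common)} cell lines shared across the three matrices")
```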

⚙️ Hyperparameters

DeepVul supports flexible training via CLI arguments:

| Parameter | Default | Description |
|---|---|---|
| `--pretrain_batch_size` | 20 | Batch size during pre-training |
| `--finetuning_batch_size` | 20 | Batch size during fine-tuning |
| `--hidden_state` | 500 | Size of the transformer hidden layers |
| `--pre_train_epochs` | 20 | Number of pre-training epochs |
| `--fine_tune_epochs` | 20 | Number of fine-tuning epochs |
| `--opt` | Adam | Optimizer type |
| `--lr` | 0.0001 | Learning rate |
| `--dropout` | 0.1 | Dropout rate |
| `--nhead` | 2 | Number of attention heads |
| `--num_layers` | 2 | Number of transformer encoder layers |
| `--dim_feedforward` | 2048 | Feed-forward network size |
| `--fine_tuning_mode` | freeze-shared | Whether to freeze the shared layers during fine-tuning |
| `--run_mode` | — | Execution mode: `pre-train`, `fine-tune`, or `both` |
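
The freeze-shared fine-tuning mode implies that the shared encoder is held fixed while only the task-specific head is updated. The snippet below is a generic PyTorch sketch of that idea, not DeepVul's exact implementation; the `encoder` attribute name is an assumption.

```python
# Generic "freeze shared layers" fine-tuning sketch; the `encoder` attribute
# name is hypothetical, not DeepVul's actual module name.
import torch

def build_finetune_optimizer(model, fine_tuning_mode="freeze-shared", lr=1e-4):
    if fine_tuning_mode == "freeze-shared":
        # Hold the shared feature extractor fixed; only the task heads update.
        for param in model.encoder.parameters():
            param.requires_grad = False
    trainable = [p for p in model.parameters() if p.requires_grad]
    # Adam at the table's default learning rate.
    return torch.optim.Adam(trainable, lr=lr)
```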

🏃 Running the Model

Change directory into the src folder:

```bash
cd src
```

Pre-training

```bash
python run_deepvul.py --run_mode pre-train ...
```

Fine-tuning

```bash
python run_deepvul.py --run_mode fine-tune ...
```

Full Pipeline (Pre-train + Fine-tune)

```bash
python run_deepvul.py --run_mode both ...
```

Customize the CLI options as needed based on your experiment setup.
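
As one illustration, a full pre-train-plus-fine-tune run could be assembled from the flags documented in the hyperparameters table, for example via Python's subprocess module. The values below are only the table defaults, not a verified configuration.

```python
# Example launcher built from the documented CLI flags (values illustrative).
import subprocess

cmd = [
    "python", "run_deepvul.py",
    "--run_mode", "both",
    "--pretrain_batch_size", "20",
    "--finetuning_batch_size", "20",
    "--pre_train_epochs", "20",
    "--fine_tune_epochs", "20",
    "--hidden_state", "500",
    "--nhead", "2",
    "--num_layers", "2",
    "--dim_feedforward", "2048",
    "--dropout", "0.1",
    "--lr", "0.0001",
    "--opt", "Adam",
    "--fine_tuning_mode", "freeze-shared",
]
subprocess.run(cmd, check=True)  # run from inside src/
```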


🧠 Additional Information

  • Source code for model architecture, training, and evaluation is located in the src/ directory.
  • If you encounter issues or have questions, please open a GitHub Issue or contact the maintainers.
  • Model interpretation and evaluation scripts are included in the repo.

📄 Citation

If you use DeepVul in your work, please cite:

```bibtex
@article{JararwehDeepVul,
	author = {Jararweh, Ala and Bach, My Nguyen and Arredondo, David and Macaulay, Oladimeji and Dicome, Mikaela and Tafoya, Luis and Hu, Yue and Virupakshappa, Kushal and Boland, Genevieve and Flaherty, Keith and Sahu, Avinash},
	title = {DeepVul: A Multi-Task Transformer Model for Joint Prediction of Gene Essentiality and Drug Response},
	elocation-id = {2024.10.17.618944},
	year = {2025},
	doi = {10.1101/2024.10.17.618944},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2025/10/15/2024.10.17.618944},
	eprint = {https://www.biorxiv.org/content/early/2025/10/15/2024.10.17.618944.full.pdf},
	journal = {bioRxiv}
}
```
