interpretable-splicing-model

Scripts for preprocessing and training the interpretable splicing model.

Installation

No special hardware or GPU is required. A recent version of Python is required (tested with version 3.8). Install the following packages with "pip install", preferably in a virtual environment (tested versions are indicated):

tensorflow 2.10
numpy 1.22.4
pandas 1.5.0
joblib 1.2.0
scikit-learn 1.1.2 (imported as sklearn)

In addition, the following packages are required for generating figures:

matplotlib 3.6.0
seaborn 0.12.0
logomaker 0.8

Finally, the ViennaRNA package (version 2.4.17) should be installed and its executables available on the PATH. On Ubuntu, this can be done using "sudo apt install vienna-rna".
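As an optional sanity check after installation, the sketch below compares the installed package versions against the tested versions listed above and looks for a ViennaRNA executable on the PATH. It is not part of the repository. The function names are hypothetical, and the choice of "RNAfold" as the executable to look for is an assumption (it ships with ViennaRNA, but the scripts here may call a different tool).

```python
import shutil
from importlib import metadata

# Tested versions from the lists above, keyed by PyPI distribution name.
# "scikit-learn" is the PyPI name of the package imported as "sklearn".
TESTED = {
    "tensorflow": "2.10",
    "numpy": "1.22.4",
    "pandas": "1.5.0",
    "joblib": "1.2.0",
    "scikit-learn": "1.1.2",
    "matplotlib": "3.6.0",
    "seaborn": "0.12.0",
    "logomaker": "0.8",
}


def matches_tested(found, wanted):
    """True if the leading components of `found` equal those of `wanted`."""
    wanted_parts = wanted.split(".")
    return found.split(".")[: len(wanted_parts)] == wanted_parts


def check_environment(tested=TESTED, executable="RNAfold"):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for name, wanted in tested.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (tested with {wanted})")
            continue
        if not matches_tested(found, wanted):
            problems.append(f"{name}: found {found}, tested with {wanted}")
    # Assumption: RNAfold is the ViennaRNA executable of interest.
    if shutil.which(executable) is None:
        problems.append(f"{executable}: not found on PATH")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the tested versions"]:
        print(line)
```

A version mismatch reported here is not necessarily an error; it only means the installed version differs from the one the scripts were tested with.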

Running

Two scripts are provided:

preprocess.sh: This script takes the raw FASTQ files and converts them into training and testing datasets. FASTQ files should be stored under the "fasta_files" folder, as explained in the readme.txt file. Typical running time is about 3 hours.
train_model.sh: This script reads the preprocessed datasets (the four pkl.gz files under the "data" folder) and trains the interpretable splicing model. Its output is the trained model, together with two intermediate models generated as part of the custom training schedule. These files are stored in the "output" folder. Typical running time is about 2 hours.
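The two steps above can be chained from Python. The following is a minimal sketch, not part of the repository. It assumes that both scripts are run with bash from the repository root and take no arguments, and that the four preprocessed files sit directly under "data". The names preprocessed_files and run_pipeline are hypothetical.

```python
import subprocess
from pathlib import Path


def preprocessed_files(data_dir="data"):
    """Return the sorted *.pkl.gz files found directly under data_dir."""
    return sorted(Path(data_dir).glob("*.pkl.gz"))


def run_pipeline(repo_dir=".", expected=4):
    """Preprocess if needed, then train; return the files in the output folder."""
    repo = Path(repo_dir)
    # Skip the roughly 3-hour preprocessing step when the datasets already exist.
    if len(preprocessed_files(repo / "data")) < expected:
        subprocess.run(["bash", "preprocess.sh"], cwd=repo, check=True)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(["bash", "train_model.sh"], cwd=repo, check=True)
    return sorted((repo / "output").iterdir())


if __name__ == "__main__":
    for path in run_pipeline():
        print(path)
```

Skipping preprocessing when the datasets are present is a design choice of this sketch; delete the files under "data" first to force a full rerun from the raw FASTQ files.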

Examples

The preprocessed datasets are provided in the "data" folder. In addition, a trained model is included in the "output" folder. This is the model used to generate all the figures in the paper.

Citation

Please cite: Liao SE, Sudarshan M, and Regev O. Machine learning for discovery: deciphering RNA splicing logic. In submission. https://www.biorxiv.org/content/10.1101/2022.10.01.510472v1
