33 changes: 14 additions & 19 deletions docs/devops.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,25 +16,6 @@ Content ![](https://img.shields.io/badge/status-WorkInProgress-yellow)
- Each time we merge a Pull request, we need to make a release
- Publish on Posit connect with a new name that matches the version

## Data management

### AWS

- Samplesheet input files for pipelines
- `pipelineName_PI_hbcNNNNNN`
- Have a copy in project folder in O2
- Manually removing weekly during platform meeting
- Raw data is under `input` folder
- Alex and Lorena and Emma can move data from O2/FAS to S3
- `pipelineName_PI_hbcNNNNNN`
- lifecycle 14 days
- Pipeline outputs are under `results`:
- `pipelineName_PI_hbcNNNNNN`
- lifecycle 14 days for bigger than 1gb
- Move output pipeline to project folder under `final` folder
- Data cleaning every platform meeting
- Quarterly Evaluation: RNAseq, CHIPseq

## Configure to use posit package manager

[source](https://packagemanager.posit.co/client/#/repos/bioconductor/setup?bioconductor_version=3.18)
Expand Down Expand Up @@ -107,4 +88,18 @@ BiocManager::install("BiocNeighbors")
install.packages('NMF')
install.packages("circlize")
devtools::install_github("jinworks/CellChat")
```

## Build environments

### scGPT in FAS

```
conda create -p ./scgpt-2 python=3.9 pip ipykernel -c conda-forge
conda install ipywidgets -c conda-forge
pip install torch==2.1.2
conda install numpy
pip install scgpt
conda install wandb -c conda-forge
python -m ipykernel install --prefix=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/scgpt-2 --name 'dcgpt2' --display-name 'scgpt-2'
```
28 changes: 28 additions & 0 deletions docs/environments.md
@@ -0,0 +1,28 @@
# Available environments

## scGPT

Only available at FAS computing resources.

Please reach out to the platform team before accessing this environment for the first time:

- Add the environment first:
    - Only the first time, add the env to the notebook kernels:
        - `echo "/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/scgpt-2" >> ~/.conda/environments.txt`

- Start a Python notebook on the [OnDemand web page](https://rcood.rc.fas.harvard.edu/pun/sys/dashboard/batch_connect/sessions).
    - You need to be connected to the VPN to reach this page.
    - Use the `gpu` partition, or `gpu_test` if you need something that won't take a long time.
    - If you need more than one GPU, add `--gres=gpu:n` to the sbatch options in the advanced options, further down the page.
    - Add `gcc/13.2.0-fasrc01` to the list of modules to load.
- Once connected, if you are asked for a password, just close the window and open it again.
- If you didn't add the module `gcc/13.2.0-fasrc01`, do this now:
    - Open a terminal session
    - `module load gcc/13.2.0-fasrc01`
- Then, start a notebook and choose `scgpt-2` as the kernel
- Validate that everything works with these commands in the Python notebook:
    - `import torch`
    - `torch.cuda.is_available()` -> Should return `True`
    - `torch.cuda.device_count()` -> Should return `
    - `import scgpt`
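
The validation commands above can be collected into a single notebook cell. The sketch below is only a convenience wrapper: `check_scgpt_env` and `_try_import` are hypothetical helper names, not part of scGPT or PyTorch. It assumes nothing beyond the `torch.cuda.is_available()` and `torch.cuda.device_count()` calls already listed, and it reports a missing package instead of raising.

```python
import importlib


def _try_import(name):
    """Import a module by name; return (module, None) or (None, error text)."""
    try:
        return importlib.import_module(name), None
    except Exception as exc:  # report any import failure instead of raising
        return None, repr(exc)


def check_scgpt_env():
    """Run the checks listed above and return the results as a dict."""
    report = {}

    torch, error = _try_import("torch")
    report["torch_imports"] = torch is not None
    if torch is None:
        report["torch_error"] = error
        return report

    # Same two calls as in the list above.
    report["cuda_available"] = torch.cuda.is_available()
    report["gpu_count"] = torch.cuda.device_count()

    scgpt, error = _try_import("scgpt")
    report["scgpt_imports"] = scgpt is not None
    if scgpt is None:
        report["scgpt_error"] = error
    return report
```

Calling `check_scgpt_env()` in a notebook cell returns the dictionary. If `cuda_available` is `False`, the session was probably not started on a GPU partition.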

55 changes: 24 additions & 31 deletions docs/index.md
Expand Up @@ -8,6 +8,9 @@ Most analyses will follow the similar trajectory for set-up. We will note where

## Set up the package

<details>
<summary>O2 instructions- Click to expand!</summary>

Log onto O2 via the command line and check two things (first-time only):

* Remove `bcbio` from your `PATH` by commenting the line in your `.bashrc` if you have it
Expand All @@ -33,7 +36,9 @@ When the session is started, set your library path by typing this command in you
```
.libPaths("/n/app/bcbio/R4.3.1")
```
</details>

</p>
Next, load `bcbioR` with:

```
Expand Down Expand Up @@ -88,7 +93,20 @@ usethis::proj_activate(project_path)

> Note: This will restart the session in the project directory. This restart will clear the `.libPaths("/n/app/bcbio/R4.3.1")` and `library(bcbioR)` that we used earlier, so we will need to re-do them in the following steps.

### Setting up your workspace
## Using the template reports

Many analyses have template reports that you can use. To deploy one, run the appropriate `bcbioR::bcbio_templates()` command from the table below:

| Type of Analysis | `bcbioR::bcbio_templates()` command |
|:---:|:---|
| Bulk RNA-seq ![](https://img.shields.io/badge/status-stable-blue)| `bcbioR::bcbio_templates(type="rnaseq", outpath="reports")` |
| Single-cell RNA-seq ![](https://img.shields.io/badge/status-beta-yellow) | `bcbioR::bcbio_templates(type="singlecell", outpath="reports")` |
| ChIP-Seq ![](https://img.shields.io/badge/status-beta-yellow) | `bcbioR::bcbio_templates(type="chipseq", outpath="reports")` |
| CellChat ![](https://img.shields.io/badge/status-draft-grey)| **Under development:** `bcbioR::bcbio_templates(type="singlecell_delux", outpath="reports")` |
| COSMX ![](https://img.shields.io/badge/status-draft-grey)| `bcbioR::bcbio_templates(type="spatial", outpath="reports")` |
| DNA Methylation | **Under development** |

## Setting up your workspace in O2

We will now add the `.libPath()` that is appropriate for our type of analysis. You can use the table below to determine which `.libPath()` is appropriate for your analysis:

Expand Down Expand Up @@ -129,7 +147,7 @@ Now, we will use `bcbioR` to set-up the directory structure that we will be usin
bcbioR::bcbio_templates(type="base", outpath=".", org="hcbc")
```

### Setting up GitHub and RStudio
## Setting up GitHub and RStudio

Now, we will connect O2 with GitHub. First, check in your Home directory if a `.gitconfig` file exists. ***You should only need to do this once.*** The contents should look like:

Expand Down Expand Up @@ -169,7 +187,7 @@ Note: In order to see hidden file in your file browser on the O2 Portal, you wil
<hr />
</details>

#### Getting the Git tab
### Getting the Git tab

Now, we would like to get the Git tab into our Workspace Browser (where `Environment`, `History`, `Connections` and `Tutorial` tabs are located). We show this transition below:

Expand Down Expand Up @@ -219,7 +237,7 @@ Restart now?

We will need to restart R in order to get the Git tab in RStudio, so select `For sure`, `Yeah` or whichever option answers in the affirmative.

#### Creating the first commit
### Creating the first commit

Now, we are going to create our first commit. In order to do this, we need to:

Expand All @@ -234,7 +252,7 @@ These steps are summarized in the GIF below:

<p align="center"><img src="./img/Initial_commit.gif" width="1000"></p>

#### Pushing our initial commit
### Pushing our initial commit

Now we will use the function to push these changes to GitHub with the following command:

Expand All @@ -260,7 +278,7 @@ If the push is successful, then it will look like this GIF below:
> Note: You might get a GitHub 404 error page (see image below) when you do your first push to GitHub. Just refresh the page in your browser and it should resolve itself.
> <p align="center"><img src="./img/GitHub_404_error_with_label.png" width="700"></p>

##### Expired or non-existent GitHub token
#### Expired or non-existent GitHub token

However, if your token is expired or this is your first time using GitHub from O2, then you will get this message:

Expand Down Expand Up @@ -356,18 +374,6 @@ You should now see the HBC code as the header to the `README.md` on GitHub. Thes

<p align="center"><img src="./img/Guideline_push.gif" width="1000"></p><br>

## Using the template reports

Many analyses have template reports that you can use. To deploy one, run the appropriate `bcbioR::bcbio_templates()` command from the table below:

| Type of Analysis | `bcbioR::bcbio_templates()` command |
|:---:|:---|
| Bulk RNA-seq| `bcbioR::bcbio_templates(type="rnaseq", outpath="reports")` |
| Single-cell RNA-seq | `bcbioR::bcbio_templates(type="singlecell", outpath="reports")` |
| ChIP-Seq | `bcbioR::bcbio_templates(type="chipseq", outpath="reports")` |
| CellChat | **Under development:** `bcbioR::bcbio_templates(type="singlecell_delux", outpath="reports")` |
| DNA Methylation | **Under development** |

# Tips for Moving Forward

Now that we've gotten set-up for our project, here are a few last tips to try to make your experience smooth:
Expand All @@ -376,19 +382,6 @@ Now that we've gotten set-up for our project, here are a few last tips to try to
- Try to avoid editing files directly on GitHub. If you do, it will be important that you `Pull` the repository onto O2 before continuing on with your work on O2. If you forget to do this pull and make commits on O2, you can fix it, but it is beyond the scope of this guide.
- Use the checklist in the `README.md` to help keep track of your progress.

# bcbioR supported templates

We used `bcbioR` to deploy folders and code to our project directories to improve robustness in our analysis.

You can install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`

- RNAseq ![](https://img.shields.io/badge/status-stable-blue)
- ChipSeq ![](https://img.shields.io/badge/status-beta-yellow)
- scRNAseq ![](https://img.shields.io/badge/status-beta-yellow)
- CELLCHAT ![](https://img.shields.io/badge/status-draft-grey)
- TEASeq ![](https://img.shields.io/badge/status-draft-grey)
- COSMX ![](https://img.shields.io/badge/status-draft-grey)


# Note
>These materials have been developed by members of the teaching and platform team at the Harvard Chan Bioinformatics Core (HBC) RRID:SCR_025373.
33 changes: 33 additions & 0 deletions docs/pipelines.md
Expand Up @@ -2,6 +2,39 @@

Content - ![](https://img.shields.io/badge/status-WorkInProgress-yellow)

## Data Management

### AWS

- Samplesheet input files for pipelines
    - `pipelineName_PI_hbcNNNNNN`
    - Keep a copy in the project folder on O2
    - Removed manually every week during the platform meeting
- Raw data is under the `input` folder
    - See instructions below to move data in/out
    - `pipelineName_PI_hbcNNNNNN`
    - lifecycle: 14 days
- Pipeline outputs are under `results`:
    - `pipelineName_PI_hbcNNNNNN`
    - lifecycle: 14 days for files bigger than 1 GB
    - Move the pipeline output to the project folder, under the `final` folder
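
The naming convention above can be turned into a small helper so that S3 locations are always built the same way. This is a minimal sketch, not an existing tool: `run_prefix` and `s3_uri` are hypothetical names, the bucket name is the one used in the sync commands in this document, and the check that the project code is `hbc` followed by digits is an assumption based on the `hbcNNNNNN` placeholder.

```python
import re

# Bucket name as it appears in the sync commands in this document.
BUCKET = "hcbc-seqera"


def run_prefix(pipeline, pi, hbc_code):
    """Build the folder name `pipelineName_PI_hbcNNNNNN`."""
    # Assumption: the project code is "hbc" followed by digits only.
    if not re.fullmatch(r"hbc\d+", hbc_code):
        raise ValueError(f"unexpected project code: {hbc_code!r}")
    return f"{pipeline}_{pi}_{hbc_code}"


def s3_uri(kind, pipeline, pi, hbc_code):
    """Return the S3 location for raw data ("input") or outputs ("results")."""
    if kind not in ("input", "results"):
        raise ValueError("kind must be 'input' or 'results'")
    return f"s3://{BUCKET}/{kind}/{run_prefix(pipeline, pi, hbc_code)}"
```

For example, `s3_uri("input", "rnaseq", "piname", "hbc01234")` gives `s3://hcbc-seqera/input/rnaseq_piname_hbc01234`, where the PI name and project code are made-up values.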

### Move data in/out of AWS

Follow these steps to copy data in and out of our AWS space:

- Log in to the transfer node on O2
- Type `sudo -su bcbio` to become the `bcbio` user
- Use this command to copy data to AWS:
```
/usr/local/bin/aws s3 sync $FOLDER_WITH_FASTQ s3://hcbc-seqera/input/rnaseq_piname_hbcNNNN
```
- Use this command to copy data from AWS:
```
/usr/local/bin/aws s3 sync s3://hcbc-seqera/results/rnaseq_piname_hbcNNNN $FOLDER_PROJECT
```
**Make sure the `bcbio` group has read/write access to the folders; otherwise the `aws` command won't copy anything, and it won't report an error either.**
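
Because the sync can fail silently, it may help to check permissions before running it. The sketch below is one possible pre-flight check, with hypothetical helper names (`lacks_group_access`, `paths_missing_group_access`). It only inspects the group permission bits and does not verify that the owning group is actually `bcbio`, so treat an empty result as necessary rather than sufficient.

```python
import os
import stat


def lacks_group_access(path):
    """Return True if the group permission bits on `path` are insufficient."""
    mode = os.stat(path).st_mode
    needed = stat.S_IRGRP | stat.S_IWGRP
    if stat.S_ISDIR(mode):
        # Directories also need the execute bit to be traversed.
        needed |= stat.S_IXGRP
    return (mode & needed) != needed


def paths_missing_group_access(root):
    """Walk `root` and list every folder or file the group cannot read/write."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        problems.extend(p for p in candidates if lacks_group_access(p))
    return sorted(problems)
```

Run `paths_missing_group_access(folder)` on the folder you are about to sync and fix anything it lists before calling `aws`. `aws s3 sync` also accepts `--dryrun`, which prints the planned transfers without copying anything.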

## Parameters

### RNAseq
1 change: 1 addition & 0 deletions mkdocs.yml
Expand Up @@ -2,6 +2,7 @@ site_name: HCBC Platform
nav:
- Home: index.md
- Pipelines: pipelines.md
- Tools environments: environments.md
- Platform members: devops.md

theme: