CellBioStats is an application that simplifies the creation of publication-quality SuperPlots and performs statistical analysis on hierarchical data from common experimental results in cell biology, molecular biology, and biochemistry, helping you visualize complex datasets and avoid pitfalls like pseudoreplication.
The current version covers a limited set of use cases, but the project is intended to grow and become more comprehensive.
- Interactive SuperPlots: Generates publication-quality SuperPlots showing individual data points, replicate means, and overall treatment means with the standard error of the mean (SEM).
- Automated Statistical Analysis: Automatically performs normality checks, variance homogeneity tests, and selects the appropriate statistical test for your data.
- Hierarchical Statistics: Avoids pseudoreplication by correctly performing statistical tests on replicate means, not on raw technical measurements.
- Paired & Unpaired Data: Handles both independent (unpaired) and repeated measures (paired) experimental designs.
- Data Upload: Supports both `.csv` and `.xlsx` file formats.
- Plot Customization: Allows customization of plot aesthetics (font size, color schemes, marker size, and plot templates).
- Downsampling for plot visualization: Option to display a random subset of your data points to keep plots clean and responsive with very large datasets.
- Export Results: Download the detailed statistical summary as a `.txt` file and the plot as a `.png`, `.jpeg`, `.svg`, or `.pdf` file directly from the app.
You can run CellBioStats in three different ways.
The standalone executable is the easiest method for most users. No installation is required.
- Go to the dist page of this repository.
- Download the latest `.exe` file.
- Double-click the file to run the application. Be aware that some antivirus software may flag it as a virus, and you may need to temporarily disable that protection to run the app.
You can also run the app in Google Colab, in your browser, without any local installation.
- Click the "Open in Colab" badge at the top of this README.
- Run the cells in the notebook to start the application.
Note that in Colab the "Download Plot" button cannot download plots as PNG or JPEG. You can still download a PNG file by clicking the camera icon. If you choose the SVG or PDF extension, the file can be found in the Colab folder panel.
If you have Python installed, you can run the app from the source code.
- Clone the repository as shown below, or download the `app.py` file directly:

      git clone https://github.com/brunicardoso/CellBioStats.git
      cd CellBioStats

- Create a virtual environment and install the dependencies. Choose either Conda or venv.

  Using Conda:

      # Create a new conda environment
      conda create --name cellbiostats python=3.10
      # Activate the environment
      conda activate cellbiostats
      # Install the required packages
      pip install -r requirements.txt

  Using venv:

      # Create a virtual environment
      python -m venv venv
      # Activate the environment
      # On macOS/Linux:
      source venv/bin/activate
      # On Windows:
      venv\Scripts\activate
      # Install the required packages
      pip install -r requirements.txt

- Run the application:

      python app.py

- Open your web browser and navigate to `http://127.0.0.1:8050`.
- Upload Data: Click the "Upload Data" button and select your `.csv` or `.xlsx` file. You can try it on our sample data.
- Map Columns: Select the appropriate columns from your file for Treatment, Value, and Replicate.
- Select Test Type: (Optional) If your experiment uses a repeated measures or paired design, check the "Use paired/repeated statistical tests" box. Read this if you are unsure whether to choose a paired or unpaired test.
- Generate: Click the "Generate Analysis/Update Plot" button.
- Customize: Use the sliders and dropdowns in the "Plot Customization" panel to adjust the appearance of your plot. Click the "Generate Analysis/Update Plot" button to update the plot.
- Download: Click the "Download Summary" button to save the statistical report. To save the plot, either use the camera icon on the plot (PNG image) or click the "Download Plot" button, which lets you choose the resolution and file extension (.png, .jpeg, .svg, or .pdf).
The application expects your data to be in a specific format, like in the example below. You must have at least three columns:
- Treatment Column: Identifies the different experimental groups (e.g., 'Control', 'Drug A').
- Value Column: Contains the numeric measurement data (the dependent variable; e.g., cell size or expression level).
- Replicate Column: Identifies the independent experimental replicates (e.g., the experiment number for independent replications performed on different days, or an animal ID).
Here is an example of a valid data structure:
| Treatment | Cell_size | Replicate |
|---|---|---|
| Control | 10.5 | 1 |
| Control | 11.2 | 1 |
| Control | 12.1 | 2 |
| Control | 11.8 | 2 |
| Drug A | 15.3 | 1 |
| Drug A | 14.9 | 1 |
| Drug A | 16.5 | 2 |
| Drug A | 17.1 | 2 |
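To check a file against this layout before uploading it, a minimal pandas sketch along the following lines may help. It rebuilds the example table above, verifies that the three columns exist, and computes one mean per replicate plus the treatment mean and SEM across replicate means. The helper name `check_columns` is hypothetical and is not part of CellBioStats; the app's own implementation may differ.

```python
import pandas as pd

# Example rows copied from the table above.
df = pd.DataFrame({
    "Treatment": ["Control"] * 4 + ["Drug A"] * 4,
    "Cell_size": [10.5, 11.2, 12.1, 11.8, 15.3, 14.9, 16.5, 17.1],
    "Replicate": [1, 1, 2, 2, 1, 1, 2, 2],
})


def check_columns(data, treatment, value, replicate):
    """Hypothetical helper: fail early if a column is missing or non-numeric."""
    missing = [c for c in (treatment, value, replicate) if c not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if not pd.api.types.is_numeric_dtype(data[value]):
        raise ValueError(f"Column {value!r} must be numeric")


check_columns(df, "Treatment", "Cell_size", "Replicate")

# One mean per (treatment, replicate) pair: these are the independent units.
replicate_means = (
    df.groupby(["Treatment", "Replicate"], as_index=False)["Cell_size"].mean()
)

# Treatment mean and SEM computed across replicate means, not raw points.
summary = replicate_means.groupby("Treatment")["Cell_size"].agg(["mean", "sem"])

print(replicate_means)
print(summary)
```

With two replicates per treatment, as in this toy table, each group contributes only two values to any downstream test, which is why the replicate column matters.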
CellBioStats is designed to perform statistically sound analysis by respecting the hierarchical nature of typical biological data.
- Data Aggregation: All primary statistical tests are performed on the means of each replicate, not on the raw technical measurements. This avoids pseudoreplication and ensures that the statistical power reflects the number of independent experiments.
- Assumption Checks:
  - Normality: The Shapiro-Wilk test is run on the replicate means for each treatment group.
  - Homoscedasticity (Equal Variances): Levene's test is run to check for equality of variances across groups.
- Automated Test Selection: Based on the number of groups, the experimental design (paired/unpaired), and the results of the assumption checks, the app automatically selects the most appropriate statistical test.
| # of Groups | Design | Assumptions Met (Normal & Homoscedastic) | Assumptions Not Met |
|---|---|---|---|
| 2 | Unpaired | Student's t-test | Mann-Whitney U test |
| >2 | Unpaired | One-way ANOVA + Tukey HSD post-hoc | Kruskal-Wallis + Mann-Whitney U post-hoc (Bonferroni) |
| 2 | Paired | Paired t-test | Wilcoxon signed-rank test |
| >2 | Paired | Repeated Measures ANOVA + Paired t-test (Bonferroni) | Friedman test + Wilcoxon post-hoc (Bonferroni) |
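The decision logic in the table above can be sketched with SciPy as follows. This is an illustration under stated assumptions, not the app's actual code: the function names `assumptions_met` and `choose_test` are hypothetical, the 0.05 threshold is an assumed default that this README does not specify, and the input values are made up for demonstration. Note that `scipy.stats.shapiro` needs at least three values per group.

```python
from scipy import stats

ALPHA = 0.05  # assumed significance threshold; not stated in this README

# Lookup keyed by (more than two groups?, paired?, assumptions met?),
# mirroring the table above row by row.
TEST_TABLE = {
    (False, False, True): "Student's t-test",
    (False, False, False): "Mann-Whitney U test",
    (True, False, True): "One-way ANOVA + Tukey HSD post-hoc",
    (True, False, False): "Kruskal-Wallis + Mann-Whitney U post-hoc (Bonferroni)",
    (False, True, True): "Paired t-test",
    (False, True, False): "Wilcoxon signed-rank test",
    (True, True, True): "Repeated Measures ANOVA + Paired t-test (Bonferroni)",
    (True, True, False): "Friedman test + Wilcoxon post-hoc (Bonferroni)",
}


def assumptions_met(groups, alpha=ALPHA):
    """Check normality (Shapiro-Wilk) and equal variances (Levene).

    `groups` maps a treatment name to its list of replicate means.
    """
    samples = list(groups.values())
    normal = True
    for values in samples:
        _, p_normal = stats.shapiro(values)
        normal = normal and p_normal > alpha
    _, p_levene = stats.levene(*samples)
    return bool(normal and p_levene > alpha)


def choose_test(n_groups, paired, met):
    """Return the test named in the table for this design."""
    if n_groups < 2:
        raise ValueError("At least two treatment groups are required")
    return TEST_TABLE[(n_groups > 2, paired, met)]


# Made-up replicate means, for illustration only.
groups = {
    "Control": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Drug A": [11.0, 12.0, 13.0, 14.0, 15.0],
}
met = assumptions_met(groups)
print(choose_test(len(groups), paired=False, met=met))
```

Keeping the table as a plain dictionary makes it easy to compare the code against the documented behavior line by line.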
- Dash
- Plotly
- Pandas
- SciPy & Statsmodels
- PyInstaller - For packaging the standalone `.exe` application.
- Lord, S. J., Velle, K. B., Mullins, R. D., & Fritz-Laylin, L. K. (2020). SuperPlots: Communicating reproducibility and variability in cell biology. Journal of Cell Biology, 219(6). https://doi.org/10.1083/jcb.202001064

- Pollard, D. A., Pollard, T. D., & Pollard, K. S. (2019). Empowering statistical methods for cellular and molecular biologists. Molecular Biology of the Cell, 30(12), 1359–1368. https://doi.org/10.1091/mbc.e15-02-0076
Bruni-Cardoso, A. (2025). CellBioStats, an application for robust visualization and statistical analysis of data from cell and molecular biology experiments
This project is licensed under the MIT License - see the LICENSE file for details.
