diff --git a/.github/workflows/apple.yml b/.github/workflows/apple.yml index 69b4683..8fb3e21 100644 --- a/.github/workflows/apple.yml +++ b/.github/workflows/apple.yml @@ -18,7 +18,7 @@ jobs: fail-fast: false matrix: #python-version: ["3.14"] - python-version: ["3.10", "3.11", "3.12", "3.13"] + python-version: ["3.11", "3.12", "3.13"] env: MPLBACKEND: Agg # https://github.com/orgs/community/discussions/26434 diff --git a/.github/workflows/linux.yml b/.github/workflows/linux.yml index 4884b40..8c0a067 100644 --- a/.github/workflows/linux.yml +++ b/.github/workflows/linux.yml @@ -18,7 +18,7 @@ jobs: fail-fast: false matrix: #python-version: ["3.14"] - python-version: ["3.10", "3.11", "3.12", "3.13"] + python-version: ["3.11", "3.12", "3.13"] env: MPLBACKEND: Agg # https://github.com/orgs/community/discussions/26434 diff --git a/.github/workflows/windows.yml b/.github/workflows/windows.yml index d114976..6d8260b 100644 --- a/.github/workflows/windows.yml +++ b/.github/workflows/windows.yml @@ -18,7 +18,7 @@ jobs: fail-fast: false matrix: #python-version: ["3.14"] - python-version: ["3.10", "3.11", "3.12", "3.13"] + python-version: ["3.11", "3.12", "3.13"] env: MPLBACKEND: Agg # https://github.com/orgs/community/discussions/26434 diff --git a/README.md b/README.md index 68d626c..a80679f 100644 --- a/README.md +++ b/README.md @@ -2,9 +2,12 @@ ## Abstract: -physicelldataloader (pcdl) provides a platform independent, python3 based, [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)) installable interface +physicelldataloader (pcdl) provides a platform-independent (Windows, MacOSX, Linux), python3 based, [pip](https://en.wikipedia.org/wiki/Pip_(package_manager))-installable set of commands to load output, generated with the [PhysiCell](https://github.com/MathCancer/PhysiCell) agent-based modeling and diffusion solver framework, -into [python3](https://en.wikipedia.org/wiki/Python_(programming_language)). 
+into [python3](https://en.wikipedia.org/wiki/Python_(programming_language)) or transform PhysiCell output into more widely used data formats. +pcdl can be loaded as a python3 module or run straight from the command line. + +![pcdl concept](man/img/physicelldataloader_concept_v4.0.0.png) pcdl was forked from the original [PhysiCell-Tools](https://github.com/PhysiCell-Tools) [python-loader](https://github.com/PhysiCell-Tools/python-loader) implementation. @@ -19,8 +22,8 @@ The pcdl python3 library maintains four branches: ## Header: -+ Language: python [>= 3.10](https://devguide.python.org/versions/) -+ Library dependencies: anndata, bioio, matplotlib, numpy, pandas, (requests), scipy, vtk ++ Language: python [>= 3.11](https://devguide.python.org/versions/) ++ Library dependencies: anndata, bioio, geopandas, matplotlib, neuroglancer, numpy, pandas, (requests), scikit-image, scipy, shapely, spatialdata, vtk + Date of origin original PhysiCell-Tools python-loader: 2019-09-02 + Date of origin pcdl fork: 2022-08-30 + Doi: https://doi.org/10.5281/ZENODO.8176399 @@ -92,7 +95,7 @@ Within the pcdl library, we tried to stick to the documentation policy laid out + original PhysiCell-Tools python-loader implementation: Patrick Wall, Randy Heiland, Paul Macklin + fork pcdl implementation: Elmar Bucher + fork pcdl co-programmer: Furkan Kurtoglu, Heber Rocha, Jennifer Eng -+ fork pcdl continuous testing and feedbacks: Aneequa Sundus, John Metzcar ++ fork pcdl continuous testing and feedbacks: Aneequa Sundus (python), John Metzcar (python), Raquel Arroya (matlab) + student prj on pcdl: Benjamin Jacobs (make\_graph\_gml), Jason Lu (render\_neuroglancer), @@ -110,7 +113,7 @@ Developers, please make pull requests to the https://github.com/elmbeech/physice ```bibtex @Misc{bucher2023, - author = {Bucher, Elmar and Wall, Patrick and Rocha, Heber and Kurtoglu, Furkan and Eng, Jennifer and Sundus, Aneequa, and Metzcar, John and Heiland, Randy and Macklin, Paul}, + author = {Bucher, Elmar 
and Wall, Patrick and Rocha, Heber and Kurtoglu, Furkan and Eng, Jennifer and Sundus, Aneequa and Metzcar, John and Arroya, Raquel and Heiland, Randy and Macklin, Paul}, title = {elmbeech/physicelldataloader: pcdl platform-independent, pip-installable interface to load PhysiCell agent-based modeling framework output into python3.}, year = {2023}, copyright = {Open Access}, @@ -124,7 +127,11 @@ Developers, please make pull requests to the https://github.com/elmbeech/physice + evt generate lineage tree graph output files. + ## Release Notes: ++ version 4.1.0 (2025-12-31): elmbeech/physicelldataloader + + new TimeStep class and TimeSeries class function **get_spatialdata** and command line command **pcdl_get_spatialdata**. + + version 4.0.5 (2025-10-22): elmbeech/physicelldataloader + **settingxml** default is now set to False, because the cell\_type id label mapping can, in recent PhysiCell output, be retrieved from output\*.xml too. + **plot_scatter** and **plot_timeseries** now additionally have a cat\_drop and cat\_keep argument to filter categorical data. @@ -135,7 +142,7 @@ Developers, please make pull requests to the https://github.com/elmbeech/physice + command line commands now return **error code 0** if the command runs successfully. + version 4.0.3 (2025-07-20): elmbeech/physicelldataloader - + timestep and timeseries **plot_contour**, **plot_scatter**, and **plot_timeseries** handle now **kwargs** arguments. + + TimeStep and TimeSeries **plot_contour**, **plot_scatter**, and **plot_timeseries** now handle **kwargs** arguments. + minor bugfixes. + version 4.0.2 (2025-06-29): elmbeech/physicelldataloader diff --git a/man/TUTORIAL_commandline.md b/man/TUTORIAL_commandline.md index ae15560..2ac3aaa 100644 --- a/man/TUTORIAL_commandline.md +++ b/man/TUTORIAL_commandline.md @@ -13,7 +13,6 @@ Please spend some time to learn about each of the about 20 commands, by studying This will truly make you a power user! 
- ## Preparation To runs this tutorial, @@ -31,10 +30,8 @@ python3 -c"import pathlib, pcdl, shutil; pcdl.install_data(); s_ipath=str(pathli ``` - ## Metadata related commands - ### ✨ pcdl\_get\_version Outputs PhysiCell, MCDS, and pcdl version on screen. @@ -49,7 +46,6 @@ pcdl_get_version output/output00000000.xml pcdl_get_version -h ``` - ### ✨ pcdl\_get\_unit\_dict Generate a [csv](https://en.wikipedia.org/wiki/Comma-separated_values) file that maps attribute and units, as specified in the settings.xml. @@ -65,10 +61,8 @@ pcdl_get_unit_dict -h ``` - ## Microenvironment related commands - ### ✨ pcdl\_get\_substrate\_list Outputs all substrates modeled in the microenvironment on screen. @@ -83,7 +77,6 @@ pcdl_get_substrate_list output/output00000000.xml pcdl_get_substrate_list -h ``` - ### ✨ pcdl\_get\_conc\_attribute Generate a [json](https://en.wikipedia.org/wiki/JSON) file, that lists all substrate attributes. @@ -108,7 +101,6 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_get\_conc\_df Generate a dataframe [csv](https://en.wikipedia.org/wiki/Comma-separated_values) file that lists one voxel per row, @@ -133,7 +125,6 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_plot\_contour For oxygen generate a [jpeg](https://en.wikipedia.org/wiki/JPEG) file @@ -149,7 +140,6 @@ pcdl_plot_contour output/output00000000.xml oxygen pcdl_plot_contour -h ``` - ### ✨ pcdl\_make\_conc\_vtk Generate a rectilinear grid [vtk](https://en.wikipedia.org/wiki/VTK) file from a single time step, @@ -179,7 +169,6 @@ Further readings: ## Cell agent related commands - ### ✨ pcdl\_get\_celltype\_list Output all cell types 
modeled. @@ -232,7 +221,6 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_get\_cell\_df Generate a dataframe [csv](https://en.wikipedia.org/wiki/Comma-separated_values) file that lists one cell per row, @@ -244,7 +232,11 @@ In the example below, the generated csv contains: ```bash pcdl_get_cell_df output 2 +``` +```bash pcdl_get_cell_df output/output00000000.xml +``` +```bash pcdl_get_cell_df -h ``` @@ -253,17 +245,15 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_get\_anndata -From the whole time series or from a single time step generate h5ad [anndata](https://anndata.readthedocs.io/en/latest/) [hd5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) files. +From the whole time series or from a single time step, generate h5ad [anndata](https://anndata.readthedocs.io/en/latest/) [hd5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) files. Anndata is the standard data format in the python single cell community. Data stored in this format can be analyzed the same way as usually sc RNA seq data is analyzed. ```bash pcdl_get_anndata output/output00000000.xml -pcdl_get_anndata -h ``` ```bash pcdl_get_anndata output @@ -277,7 +267,6 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_make\_graph\_gml Generate [gml](https://github.com/elmbeech/physicelldataloader/blob/master/man/publication/himsolt1996gml_a_portable_graph_file_format.pdf) files. 
@@ -300,7 +289,6 @@ Further readings: + [TUTORIAL_r.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_r.md) + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ### ✨ pcdl\_plot\_scatter Generate a [jpeg](https://en.wikipedia.org/wiki/JPEG) file that displaying all cells. @@ -319,7 +307,6 @@ pcdl_plot_scatter output pcdl_plot_scatter -h ``` - ### ✨ pcdl\_make\_cell\_vtk Generate a 3D glyph [vtk](https://en.wikipedia.org/wiki/VTK) file from a single time step, @@ -348,9 +335,28 @@ Further readings: + [TUTORIAL_julia.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_julia.md) - ## Microenvironment and cell agent related commands +### ✨ pcdl\_get\_spatialdata + +From a single time step, generate [spatialdata](https://spatialdata.scverse.org/en/stable/) [zarr](https://zarr.dev/) files. +The spatialdata format should, in the long run, become compatible with the [OME-NGFF](https://ngff.openmicroscopy.org/latest/index.html) data format. + +Spatialdata is the standard data format in the python spatial single cell community. +Data stored in this format can be analyzed the same way as spatial sc RNA seq data is analyzed. 
+ +```bash +pcdl_get_spatialdata output/output00000000.xml +``` +```bash +pcdl_get_spatialdata output +``` +```bash +pcdl_get_spatialdata -h +``` + +Further readings: ++ [TUTORIAL_python3_scverse.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_python3_scverse.md) ### ✨ pcdl\_plot\_timeseries @@ -376,11 +382,9 @@ pcdl_plot_timeseries output none ```bash pcdl_plot_timeseries output cell_type ``` - ```bash pcdl_plot_timeseries output cell_type oxygen ``` - ```bash pcdl_plot_timeseries output cell_type oxygen max ``` @@ -388,7 +392,6 @@ pcdl_plot_timeseries output cell_type oxygen max ```bash pcdl_plot_timeseries output none oxygen ``` - ```bash pcdl_plot_timeseries output none oxygen --frame conc ``` @@ -397,7 +400,6 @@ pcdl_plot_timeseries output none oxygen --frame conc pcdl_plot_timeseries -h ``` - ### ✨ pcdl\_make\_ome\_tiff Generate an [ome.tiff](https://ome-model.readthedocs.io/en/stable/index.html) file, @@ -428,7 +430,6 @@ Further readings: + [TUTORIAL_neuroglancer.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_neuroglancer.md) + [TUTORIAL_blender.md](https://github.com/elmbeech/physicelldataloader/blob/master/man/TUTORIAL_blender.md) - ### ✨ pcdl\_render\_neuroglancer With this command, you can render a time step ome.tiff file or a time step from a whole time series ome.tiff file straight into [Neuroglancer](https://research.google/blog/an-interactive-automated-3d-reconstruction-of-a-fly-brain/), which is a [WebGL](https://en.wikipedia.org/wiki/WebGL)-based viewer that will render the ome.tiff straight in your browser. 
@@ -477,6 +478,7 @@ pcdl_make_movie output/cell_cell_type_z0.0/ pcdl_make_movie -h ``` + ## Data Clean Up After you are done checking out the 2D unit test dataset, diff --git a/man/TUTORIAL_julia.md b/man/TUTORIAL_julia.md index b766b79..3d642fe 100644 --- a/man/TUTORIAL_julia.md +++ b/man/TUTORIAL_julia.md @@ -3,6 +3,37 @@ [Julia](https://julialang.org/) is a scientific computing language. +## ✨ Run pcdl within Julia + +We are using the [PyCall.jl](https://github.com/JuliaPy/PyCall.jl) library +to run pcdl within Julia. + +Make sure that the python3 environment that has pcdl installed is activated. + +Fire up a Julia shell. +```bash +julia +``` + +Package installation. + +```julia +using Pkg +Pkg.add("PyCall") +``` + +Run pcdl. + +```julia +using PyCall + +pcdl = pyimport("pcdl") # import the pcdl module. +mcdsts = pcdl.TimeSeries("path/to/PhysiCell/output/") # load an mcds time series. + +?mcdsts.get_cell_df() # retrieve a function's docstring. +df = mcdsts.get_cell_df() # retrieve the cell dataframe. +``` + ## ✨ Handle csv files ### Save pcdl data structures as csv files from the command line @@ -98,7 +129,7 @@ pcdl_make_graph_gml output/output00000024.xml neighbor --node_attribute cell_typ ### Load gml files into a julia data structures -⚠ **bue 2024-09-04:** this is currently not working, since, for now, GraphIO cannot handle the graph, node, or edge metadata in the file. +⚠ **bue 2024-09-04:** this is currently not working, since, for now, GraphIO cannot handle the graph, node, or edge metadata in the file ( https://github.com/JuliaGraphs/GraphIO.jl/issues/46 ). We will use the [GraphIO.js](https://github.com/JuliaGraphs/GraphIO.jl) library, to load gml files. @@ -142,7 +173,7 @@ pcdl_get_anndata output/ ### Load h5ad files into a julia data structures -We will use scver's [Muon.jl](https://github.com/scverse/Muon.jl) library, +We will use scverse's [Muon.jl](https://github.com/scverse/Muon.jl) library, to load h5ad files. Package installation. 
@@ -203,7 +234,6 @@ Load image file. ```julia using FileIO -using Images ``` ```julia omeimg = load("output/timeseries_ID.ome.tiff") diff --git a/man/TUTORIAL_python3_scverse.md b/man/TUTORIAL_python3_scverse.md index ef55b99..349c4cb 100644 --- a/man/TUTORIAL_python3_scverse.md +++ b/man/TUTORIAL_python3_scverse.md @@ -1,15 +1,14 @@ # PhysiCell Data Loader Tutorial: pcdl and Python and the scVerse -[AnnData](https://anndata.readthedocs.io/en/latest/) is the data standard from the python single cell community. -This means, PhysiCell output transformed into an AnnData object can be analyzed the same way sc RNA seq data is analyzed. +[AnnData](https://anndata.readthedocs.io/en/latest/) and [SpatialData](https://spatialdata.scverse.org/en/stable/) are data standards from the python single cell community. +This means, PhysiCell output transformed into AnnData and SpatialData objects can be analyzed the same way sc RNA seq data is analyzed. The whole [scverse](https://scverse.org/) (single cell univers) becomes accessible. This includes: + [scanpy](https://scanpy.readthedocs.io/en/latest/): for classic single cell analysis. + [squidpy](https://squidpy.readthedocs.io/en/stable/): for spatial single cell analysis. + [scvi-tools](https://scvi-tools.org/): for single cell machine learning. -+ [muon](https://muon.readthedocs.io/en/latest/): for multimodal omics analysis. -And there is a whole [ecosystem](https://scverse.org/packages/#ecosystem) of libraries, compatible with the AnnData format. +And there is a whole [ecosystem](https://scverse.org/packages/#ecosystem) of libraries, compatible with the AnnData (and SpatialData) formats. Whatever you d'like to do with your physicell data, it most probably was already done with single cell wet lab data. That's being said: PhysiCell data is different scdata than scRNA seq data! 
diff --git a/man/TUTORIAL_r.md b/man/TUTORIAL_r.md index 33d1d9a..1b0d5f0 100644 --- a/man/TUTORIAL_r.md +++ b/man/TUTORIAL_r.md @@ -4,6 +4,38 @@ which, because of its library collection, is very popular among bioinformatician. +## ✨ Run pcdl within R + +We are using the [reticulate](https://github.com/rstudio/reticulate) library +to run pcdl within R. + +Make sure that the python3 environment is activated, which has pcdl installed. + +Fire up an R shell. +```bash +R +``` + +Package installation. + +```R +install.packages("reticulate") +``` + +Run pcdl. + +```R +library("reticulate") + +pcdl <- import("pcdl") # import the pcdl module. +mcdsts <- pcdl$TimeSeries("path/to/PhysiCell/output/") # load an mcds time series. + +py_help(mcdsts$get_cell_df) # retrieve a function's docstring. +df <- mcdsts$get_cell_df() # retrieve the cell dataframe. +str(df) +``` + + ## ✨ Handle csv files ### Save pcdl data structures as csv files from the command line @@ -134,6 +166,7 @@ pcdl_get_anndata output/ ### AnnData, SingleCellExperiment and Seurat We will use the [schard](https://github.com/cellgeni/schard) R package +or the [SeuratDisk](https://github.com/mojaveazure/seurat-disk) R package to translate the h5ad file into R data structures that can be analyzed by [singlecellexperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) and [seurat](https://satijalab.org/seurat/). @@ -145,12 +178,26 @@ Special thanks to Marcello Hurtado from the Pancald Lab, who told me that such t ```R install.packages("devtools") ``` + +Install the schard package. + ```R +library("devtools") devtools::install_github("cellgeni/schard") ``` -Homepage: +Install the SeuratDisk package. 
+ +```R +library("devtools") +devtools::install_github("mojaveazure/seurat-disk") +``` + +Homepages: + https://github.com/cellgeni/schard ++ https://github.com/mojaveazure/seurat-disk ++ https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html + #### Data analysis with SingleCellExperiment R bioconductor package @@ -161,7 +208,7 @@ install.packages("BiocManager") BiocManager::install("SingleCellExperiment") ``` -Translate the h5ad file to a sce R object +Translate the h5ad file to a sce R object with schard. ```R cell.sce = schard::h5ad2sce("output/timeseries_cell_maxabs.h5ad") @@ -169,7 +216,6 @@ str(cell.sce) ``` For how to analyse with singlecellexperiment, please study the official publication and documentation. - + https://github.com/cellgeni/schard + https://pubmed.ncbi.nlm.nih.gov/31792435/ + https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html @@ -182,8 +228,11 @@ Install the Seurat software. install.packages('Seurat') ``` -Translate the h5ad file to a seurat compatible R objects +Translate the h5ad file to seurat compatible R objects with schard. +```R +library("schard") +``` ```R cell.seurat = schard::h5ad2seurat("output/timeseries_cell_maxabs.h5ad") str(cell.seurat) @@ -193,10 +242,109 @@ cell.seurat_spatial = schard::h5ad2seurat_spatial("output/timeseries_cell_maxabs str(cell.seurat_spatial) ``` -For how to analyze with seurat, please study the official documentation. +Or translate the h5ad file to seurat compatible R objects with SeuratDisk. + +```R +library("SeuratDisk") +``` +```R +SeuratDisk::Convert("output/timeseries_cell_maxabs.h5ad", dest="h5seurat", overwrite=TRUE) +cell.seurat_disk <- SeuratDisk::LoadH5Seurat("output/timeseries_cell_maxabs.h5seurat") +``` +For how to analyze with seurat, please study the official documentation. 
+ https://github.com/cellgeni/schard ++ https://github.com/mojaveazure/seurat-disk ++ https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html + https://satijalab.org/seurat/ +## ✨ Handle ome.tiff, tiff, png, and jpeg file formats + +### Save pcdl data structures as jpeg, png, tiff, and ome.tiff files from the command line + +```bash +pcdl_plot_timeseries output/ --ext jpeg +``` +```bash +pcdl_plot_timeseries output/ --ext png +``` +```bash +pcdl_plot_timeseries output/ --ext tiff +``` +```bash +pcdl_make_ome_tiff output/ +``` + + +### Load jpeg, png, tiff, and ome.tiff files into R data structures + +We will use the [jpeg](https://cran.r-project.org/web/packages/jpeg/index.html), [png](https://cran.r-project.org/web/packages/png/index.html), [tiff](https://cran.r-project.org/web/packages/tiff/index.html) and [RBioFormats](https://bioconductor.org/packages/release/bioc/vignettes/RBioFormats/inst/doc/RBioFormats.html) libraries to load these images into R. + +#### Jpeg images + +Install the required R package. + +```R +install.packages("jpeg") +``` + +Load the image. + +```R +library("jpeg") +img <- readJPEG("output/timeseries_cell_total_count.jpeg") +str(img) +``` + +#### Png images + +Install the required R package. + +```R +install.packages("png") +``` + +Load the image. + +```R +library("png") +img <- readPNG("output/timeseries_cell_total_count.png") +str(img) +``` + +#### Tiff images + +Install the required R package. + +```R +install.packages("tiff") +``` + +Load the image. + +```R +library("tiff") +img <- readTIFF("output/timeseries_cell_total_count.tiff") +str(img) +``` + +#### Ome.tiff images + +Install the required R package. + +```R +install.packages("BiocManager") +BiocManager::install("RBioFormats") +``` + +Load the image. 
+ +```R +library("RBioFormats") +omeimg <- read.image("output/timeseries_ID.ome.tiff") +str(omeimg) +``` + + That's it diff --git a/man/docstring/mcds.get_spatialdata.md b/man/docstring/mcds.get_spatialdata.md new file mode 100644 index 0000000..bb817e1 --- /dev/null +++ b/man/docstring/mcds.get_spatialdata.md @@ -0,0 +1,61 @@ +# mcds.get_spatialdata() + + +## input: +``` + images: set of string; default {'subs'} + specify if from the subs or cell dataset + a multichannel image should be generate. + so far, only the subs image element is implemented. + + labels: set of strings; default is an empty set + specify if from the subs or cell dataset + a label element should be generated. + so far, neither subs nor cell label elements are implemented. + + points: set of string; default {'subs'} + specify if from the subs or cell dataset + a points element should be generated. + both, subs and cell point elements, are implemented. + + shapes: set of string; default {'cell'} + specify if from the subs or cell dataset + a shape element should be generated. + so far, only the cell shape element is implemented. + + values: integer; default is 1 + minimal number of values a variable has to have to be outputted. + variables that have only 1 state carry no information. + None is a state too. + + drop: set of strings; default is an empty set + set of column labels to be dropped for the dataframe. + don't worry: essential columns like ID, coordinates + and time will never be dropped. + Attention: when the keep parameter is given, then + the drop parameter has to be an empty set! + + keep: set of strings; default is an empty set + set of column labels to be kept in the dataframe. + don't worry: essential columns like ID, coordinates + and time will always be kept. + + scale: string; default 'maxabs' + specify how the data should be scaled. + possible values are None, maxabs, minmax, std. 
+ for more input, check out: help(pcdl.scaler) + +``` + +## output: +``` + self.l_sdmcds: list of spatialdata objects. + +``` + +## description: +``` + function to transform a mcds time step into + a spatialdata object for downstream analysis. + +``` \ No newline at end of file diff --git a/man/docstring/mcdsts.get_spatialdata.md b/man/docstring/mcdsts.get_spatialdata.md new file mode 100644 index 0000000..bfc6cf1 --- /dev/null +++ b/man/docstring/mcdsts.get_spatialdata.md @@ -0,0 +1,65 @@ +# mcdsts.get_spatialdata() + + +## input: +``` + images: set of string; default {'subs'} + specify if from the subs or cell dataset + a multichannel image should be generate. + so far, only the subs image element is implemented. + + labels: set of strings; default is an empty set + specify if from the subs or cell dataset + a label element should be generated. + so far, neither subs nor cell label elements are implemented. + + points: set of string; default {'subs'} + specify if from the subs or cell dataset + a points element should be generated. + both, subs and cell point elements, are implemented. + + shapes: set of string; default {'cell'} + specify if from the subs or cell dataset + a shape element should be generated. + so far, only the cell shape element is implemented. + + values: integer; default is 1 + minimal number of values a variable has to have to be outputted. + variables that have only 1 state carry no information. + None is a state too. + + drop: set of strings; default is an empty set + set of column labels to be dropped for the dataframe. + don't worry: essential columns like ID, coordinates + and time will never be dropped. + Attention: when the keep parameter is given, then + the drop parameter has to be an empty set! + + keep: set of strings; default is an empty set + set of column labels to be kept in the dataframe. + don't worry: essential columns like ID, coordinates + and time will always be kept. 
+ + scale: string; default 'maxabs' + specify how the data should be scaled. + possible values are None, maxabs, minmax, std. + for more input, check out: help(pcdl.scaler) + + keep_mcds: boole; default True + should the loaded original mcds be kept in memory + after transformation? + +``` + +## output: +``` + self.l_sdmcds: list of spatialdata objects. + +``` + +## description: +``` + function to transform mcds time steps into + spatialdata objects for downstream analysis. + +``` \ No newline at end of file diff --git a/man/docstring/pcdl_get_spatialdata.md b/man/docstring/pcdl_get_spatialdata.md new file mode 100644 index 0000000..0219e24 --- /dev/null +++ b/man/docstring/pcdl_get_spatialdata.md @@ -0,0 +1,99 @@ +``` +usage: pcdl_get_spatialdata [-h] [--custom_data_type [CUSTOM_DATA_TYPE ...]] + [--microenv MICROENV] [--graph GRAPH] + [--physiboss PHYSIBOSS] [--settingxml SETTINGXML] + [-v VERBOSE] [--images [IMAGES ...]] + [--labels [LABELS ...]] [--points [POINTS ...]] + [--shapes [SHAPES ...]] [--drop [DROP ...]] + [--keep [KEEP ...]] [--scale SCALE] + [path] [values] + +function to transform mcds time steps into spatialdata objects for downstream +analysis. + +positional arguments: + path path to the PhysiCell output directory or a + outputnnnnnnnn.xml file. default is . . + values minimal number of values a variable has to have in any + of the mcds time steps to be outputted. variables that + have only 1 state carry no information. None is a + state too. default is 1. + +options: + -h, --help show this help message and exit + --custom_data_type [CUSTOM_DATA_TYPE ...] + parameter to specify custom_data variable types other + than float (namely: int, bool, str) like this + var:dtype myint:int mybool:bool mystr:str . downstream + float and int will be handled as numeric, bool as + Boolean, and str as categorical data. default is an + empty string. + --microenv MICROENV should the microenvironment be extracted and loaded + into the spatialdata object? 
setting microenv to False + will use less memory and speed up processing. default + is True. + --graph GRAPH should neighbor graph, attach graph, and attached + spring graph be extracted and loaded into the + spatialdata object? default is True. + --physiboss PHYSIBOSS + if found, should physiboss state data be extracted and + loaded into the spatialdata object? default is True. + --settingxml SETTINGXML + the settings.xml that is loaded, from which the cell + type ID label mapping, is extracted, if this + information is not found in the output xml file. set + to None or False if the xml file is missing! default + is False. + -v VERBOSE, --verbose VERBOSE + setting verbose to False for less text output, while + processing. default is True. + --images [IMAGES ...] + specify if from the subs or cell dataset a + multichannel image should be generate. so far, only + the subs image element is implemented. + --labels [LABELS ...] + specify if from the subs or cell dataset a label + element should be generated. so far, neither subs nor + cell label elements are implemented. + --points [POINTS ...] + specify if from the subs or cell dataset a points + element should be generated. both, subs and cell point + elements, are implemented. + --shapes [SHAPES ...] + specify if from the subs or cell dataset a shape + element should be generated. so far, only the cell + shape element is implemented. + --drop [DROP ...] set of column labels to be dropped for the dataframe. + don't worry: essential columns like ID, coordinates + and time will never be dropped. Attention: when the + keep parameter is given, then the drop parameter has + to be an empty string! default is an empty string. + --keep [KEEP ...] set of column labels to be kept in the dataframe. set + values=1 to be sure that all variables are kept. don't + worry: essential columns like ID, coordinates and time + will always be kept. default is an empty string. + --scale SCALE specify how the data should be scaled. 
possible values + are None, maxabs, minmax, std. None: no scaling. set + scale to None if you would like to have raw data or + entirely scale, transform, and normalize the data + later. maxabs: maximum absolute value distance scaler + will linearly map all values into a [-1, 1] interval. + if the original data has no negative values, the + result will be the same as with the minmax scaler + (except with attributes with only one value). if the + attribute has only zeros, the value will be set to 0. + minmax: minimum maximum distance scaler will map all + values linearly into a [0, 1] interval. if the + attribute has only one value, the value will be set to + 0. std: standard deviation scaler will result in + sigmas. each attribute will be mean centered around 0. + ddof delta degree of freedom is set to 1 because it is + assumed that the values are samples out of the + population and not the entire population. it is + incomprehensible to me that the equivalent sklearn + method has ddof set to 0. if the attribute has only + one value, the value will be set to 0. 
default is + maxabs + +homepage: https://github.com/elmbeech/physicelldataloader +``` diff --git a/man/img/physicelldataloader_concept_v4.0.0.jpg b/man/img/physicelldataloader_concept_v4.0.0.jpg new file mode 100644 index 0000000..79ba3c3 Binary files /dev/null and b/man/img/physicelldataloader_concept_v4.0.0.jpg differ diff --git a/man/img/physicelldataloader_concept_v4.0.0.png b/man/img/physicelldataloader_concept_v4.0.0.png new file mode 100644 index 0000000..4b4f72e Binary files /dev/null and b/man/img/physicelldataloader_concept_v4.0.0.png differ diff --git a/man/img/physicelldataloader_github_qr.png b/man/img/physicelldataloader_github_qr.png new file mode 100644 index 0000000..683f92a Binary files /dev/null and b/man/img/physicelldataloader_github_qr.png differ diff --git a/man/img/dendrogram_mobile_rabbits.png b/man/img/tutorial/dendrogram_mobile_rabbits.png similarity index 100% rename from man/img/dendrogram_mobile_rabbits.png rename to man/img/tutorial/dendrogram_mobile_rabbits.png diff --git a/man/lecture/20251114_fertig_lab_pcdl_for_wetlab_scientists_and_bioinformaticians.pdf b/man/lecture/20251114_fertig_lab_pcdl_for_wetlab_scientists_and_bioinformaticians.pdf new file mode 100644 index 0000000..11c1b59 Binary files /dev/null and b/man/lecture/20251114_fertig_lab_pcdl_for_wetlab_scientists_and_bioinformaticians.pdf differ diff --git a/man/scarab.py b/man/scarab.py index 5ed20a8..6ed706e 100644 --- a/man/scarab.py +++ b/man/scarab.py @@ -299,6 +299,11 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'): ) # write TimeStep microenvironment and cells function markdown files +docstring_md( + s_function = 'mcds.get_spatialdata', + ls_doc = pcdl.TimeStep.get_spatialdata.__doc__.split('\n'), +) + docstring_md( s_function = 'mcds.make_ome_tiff', ls_doc = pcdl.TimeStep.make_ome_tiff.__doc__.split('\n'), @@ -400,6 +405,10 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'): ) # write TimeSeries 
microenvironment and cells function makdown files +docstring_md( + s_function = 'mcdsts.get_spatialdata', + ls_doc = pcdl.TimeSeries.get_spatialdata.__doc__.split('\n'), +) docstring_md( s_function = 'mcdsts.make_ome_tiff', ls_doc = pcdl.TimeSeries.make_ome_tiff.__doc__.split('\n'), @@ -447,6 +456,7 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'): help_md(s_command='pcdl_plot_scatter') help_md(s_command='pcdl_make_cell_vtk') # substrate and cell agent +help_md(s_command='pcdl_get_spatialdata') help_md(s_command='pcdl_plot_timeseries') help_md(s_command='pcdl_make_ome_tiff') help_md(s_command='pcdl_render_neuroglancer') diff --git a/pcdl/VERSION.py b/pcdl/VERSION.py index 02c2261..fa721b4 100644 --- a/pcdl/VERSION.py +++ b/pcdl/VERSION.py @@ -1 +1 @@ -__version__ = '4.0.5' +__version__ = '4.1.0' diff --git a/pcdl/commandline.py b/pcdl/commandline.py index 079449b..51af143 100644 --- a/pcdl/commandline.py +++ b/pcdl/commandline.py @@ -1943,6 +1943,204 @@ def make_cell_vtk(): # substrate and cell agent command line function # ################################################### +def get_spatialdata(): + # argv + parser = argparse.ArgumentParser( + prog = 'pcdl_get_spatialdata', + description = 'function to transform mcds time steps into spatialdata objects for downstream analysis.', + epilog = 'homepage: https://github.com/elmbeech/physicelldataloader', + ) + + # TimeSeries path + parser.add_argument( + 'path', + nargs = '?', + default = '.', + help = 'path to the PhysiCell output directory or a outputnnnnnnnn.xml file. default is . .' + ) + # TimeSeries output_path '.' + # TimeSeries custom_data_type + parser.add_argument( + '--custom_data_type', + nargs = '*', + default = [], + help = 'parameter to specify custom_data variable types other than float (namely: int, bool, str) like this var:dtype myint:int mybool:bool mystr:str . downstream float and int will be handled as numeric, bool as Boolean, and str as categorical data. 
default is an empty string.', + ) + # TimeSeries microenv + parser.add_argument( + '--microenv', + default = 'true', + help = 'should the microenvironment be extracted and loaded into the spatialdata object? setting microenv to False will use less memory and speed up processing. default is True.' + ) + # TimeSeries graph + parser.add_argument( + '--graph', + default = 'true', + help = 'should neighbor graph, attach graph, and attached spring graph be extracted and loaded into the spatialdata object? default is True.' + ) + # TimeSeries physiboss + parser.add_argument( + '--physiboss', + default = 'true', + help = 'if found, should physiboss state data be extracted and loaded into the spatialdata object? default is True.' + ) + # TimeSeries settingxml + parser.add_argument( + '--settingxml', + default = 'false', + help = 'the settings.xml that is loaded, from which the cell type ID label mapping is extracted, if this information is not found in the output xml file. set to None or False if the xml file is missing! default is False.', + ) + # TimeSeries verbose + parser.add_argument( + '-v', '--verbose', + default = 'true', + help = 'set verbose to False for less text output while processing. default is True.', + ) + # get_spatialdata images + parser.add_argument( + '--images', + nargs = '*', + default = ['subs'], + help = 'specify if from the subs or cell dataset a multichannel image should be generated. so far, only the subs image element is implemented.' + ) + # get_spatialdata labels + parser.add_argument( + '--labels', + nargs = '*', + default = [], + help = 'specify if from the subs or cell dataset a label element should be generated. so far, neither subs nor cell label elements are implemented.' + ) + # get_spatialdata points + parser.add_argument( + '--points', + nargs = '*', + default = ['subs'], + help = 'specify if from the subs or cell dataset a points element should be generated. both, subs and cell point elements, are implemented.' 
+ ) + # get_spatialdata shapes + parser.add_argument( + '--shapes', + nargs = '*', + default = ['cell'], + help = 'specify if from the subs or cell dataset a shape element should be generated. so far, only the cell shape element is implemented.' + ) + # get_spatialdata values + parser.add_argument( + 'values', + nargs = '?', + default = 1, + type = int, + help = 'minimal number of values a variable has to have in any of the mcds time steps to be outputted. variables that have only 1 state carry no information. None is a state too. default is 1.' + ) + # get_spatialdata drop + parser.add_argument( + '--drop', + nargs = '*', + default = [], + help = "set of column labels to be dropped from the dataframe. don't worry: essential columns like ID, coordinates and time will never be dropped. Attention: when the keep parameter is given, then the drop parameter has to be an empty string! default is an empty string." + ) + # get_spatialdata keep + parser.add_argument( + '--keep', + nargs = '*', + default = [], + help = "set of column labels to be kept in the dataframe. set values=1 to be sure that all variables are kept. don't worry: essential columns like ID, coordinates and time will always be kept. default is an empty string." + ) + # get_spatialdata scale + parser.add_argument( + '--scale', + default = 'maxabs', + help = "specify how the data should be scaled. possible values are None, maxabs, minmax, std. None: no scaling. set scale to None if you would like to have raw data or entirely scale, transform, and normalize the data later. maxabs: maximum absolute value distance scaler will linearly map all values into a [-1, 1] interval. if the original data has no negative values, the result will be the same as with the minmax scaler (except with attributes with only one value). if the attribute has only zeros, the value will be set to 0. minmax: minimum maximum distance scaler will map all values linearly into a [0, 1] interval. 
if the attribute has only one value, the value will be set to 0. std: standard deviation scaler will result in sigmas. each attribute will be mean centered around 0. ddof delta degree of freedom is set to 1 because it is assumed that the values are samples out of the population and not the entire population. it is incomprehensible to me that the equivalent sklearn method has ddof set to 0. if the attribute has only one value, the value will be set to 0. default is maxabs" + ) + + # parse arguments + args = parser.parse_args() + print(args) + + # process arguments + s_path = args.path.replace('\\','/') + while (s_path.find('//') > -1): + s_path = s_path.replace('//','/') + if (s_path.endswith('/')) and (len(s_path) > 1): + s_path = s_path[:-1] + s_pathfile = s_path + if not s_pathfile.endswith('.xml'): + s_pathfile = s_pathfile + '/initial.xml' + else: + s_path = '/'.join(s_path.split('/')[:-1]) + if not os.path.exists(s_pathfile): + sys.exit(f'Error @ pcdl_get_spatialdata : {s_pathfile} path does not look like an outputnnnnnnnn.xml file or physicell output directory ({s_path}/initial.xml is missing).') + + # custom_data_type + d_vartype = {} + for vartype in args.custom_data_type: + s_var, s_type = vartype.split(':') + if s_type in {'bool'}: o_type = bool + elif s_type in {'int'}: o_type = int + elif s_type in {'float'}: o_type = float + elif s_type in {'str'}: o_type = str + else: + sys.exit(f'Error @ pcdl_get_spatialdata : {s_var} {s_type} has an unknown data type. 
known are bool, int, float, str.') + d_vartype.update({s_var : o_type}) + + # run + if os.path.isfile(args.path): + mcds = pcdl.TimeStep( + xmlfile = s_pathfile, + output_path = '.', + custom_data_type = d_vartype, + microenv = False if args.microenv.lower().startswith('f') else True, + graph = False if args.graph.lower().startswith('f') else True, + physiboss = False if args.physiboss.lower().startswith('f') else True, + settingxml = None if ((args.settingxml.lower() == 'none') or (args.settingxml.lower() == 'false')) else args.settingxml, + verbose = False if args.verbose.lower().startswith('f') else True + ) + sd_mcds = mcds.get_spatialdata( + images = set(args.images), + labels = set(args.labels), + points = set(args.points), + shapes = set(args.shapes), + values = args.values, + drop = set(args.drop), + keep = set(args.keep), + scale = None if (args.scale.lower() == 'none') else args.scale, + ) + # going home + s_opathfile = s_pathfile.replace('.xml', f'_{args.scale}.zarr') + sd_mcds.write(s_opathfile) + print(s_opathfile) + + else: + mcdsts = pcdl.TimeSeries( + output_path = s_path, + custom_data_type = d_vartype, + load = True, + microenv = False if args.microenv.lower().startswith('f') else True, + graph = False, + physiboss = False if args.physiboss.lower().startswith('f') else True, + settingxml = None if ((args.settingxml.lower() == 'none') or (args.settingxml.lower() == 'false')) else args.settingxml, + verbose = False if args.verbose.lower().startswith('f') else True, + ) + sd_mcdsts = mcdsts.get_spatialdata( + images = set(args.images), + labels = set(args.labels), + points = set(args.points), + shapes = set(args.shapes), + values = args.values, + drop = set(args.drop), + keep = set(args.keep), + scale = None if (args.scale.lower() == 'none') else args.scale, + ) + # going home + ls_opathfile = [f"{s_path}/{s_xmlfile.replace('.xml', '_{}.zarr'.format(args.scale))}" for s_xmlfile in mcdsts.get_xmlfile_list()] + for i, sd_mcds in enumerate(sd_mcdsts): 
+ sd_mcds.write(ls_opathfile[i]) + print(ls_opathfile) + + # going home + return 0 + + def plot_timeseries(): # argv parser = argparse.ArgumentParser( diff --git a/pcdl/timeseries.py b/pcdl/timeseries.py index d4956ae..22c5f1d 100644 --- a/pcdl/timeseries.py +++ b/pcdl/timeseries.py @@ -231,6 +231,7 @@ def __init__(self, output_path='.', custom_data_type={}, load=True, microenv=Tru else: self.l_mcds = None self.l_annmcds = None + self.l_sdmcds = None def set_verbose_false(self): @@ -1718,3 +1719,111 @@ def get_annmcds_list(self): function returns a binding to the self.l_annmcds list of anndata mcds objects. """ return self.l_annmcds + + + def get_spatialdata(self, images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs', keep_mcds=True): + """ + input: + images: set of strings; default {'subs'} + specify if from the subs or cell dataset + a multichannel image should be generated. + so far, only the subs image element is implemented. + + labels: set of strings; default is an empty set + specify if from the subs or cell dataset + a label element should be generated. + so far, neither subs nor cell label elements are implemented. + + points: set of strings; default {'subs'} + specify if from the subs or cell dataset + a points element should be generated. + both, subs and cell point elements, are implemented. + + shapes: set of strings; default {'cell'} + specify if from the subs or cell dataset + a shape element should be generated. + so far, only the cell shape element is implemented. + + values: integer; default is 1 + minimal number of values a variable has to have to be outputted. + variables that have only 1 state carry no information. + None is a state too. + + drop: set of strings; default is an empty set + set of column labels to be dropped from the dataframe. + don't worry: essential columns like ID, coordinates + and time will never be dropped. 
+ Attention: when the keep parameter is given, then + the drop parameter has to be an empty set! + + keep: set of strings; default is an empty set + set of column labels to be kept in the dataframe. + don't worry: essential columns like ID, coordinates + and time will always be kept. + + scale: string; default 'maxabs' + specify how the data should be scaled. + possible values are None, maxabs, minmax, std. + for more information, check out: help(pcdl.scaler) + + keep_mcds: boolean; default True + should the loaded original mcds be kept in memory + after transformation? + + output: + self.l_sdmcds: list of spatialdata objects. + + description: + function to transform mcds time steps into + spatialdata objects for downstream analysis. + """ + # variable triage + es_keep = set(self.get_cell_attribute(values=values, drop=drop, keep=keep, allvalues=False).keys()) + + # processing + lsd_mcds = [] + i_mcds = len(self.l_mcds) + for i in range(i_mcds): + # fetch mcds + if keep_mcds: + mcds = self.l_mcds[i] + else: + mcds = self.l_mcds.pop(0) + + # extract time and dataframes + r_time = round(mcds.get_time(),9) + if self.verbose: + print(f'\nprocessing: {i+1}/{i_mcds} {r_time}[min] mcds into spatialdata obj.') + + # get spatialdata object + sd_mcds = mcds.get_spatialdata( + images = images, labels = labels, + points = points, shapes = shapes, + #values = 1, + #drop = set(), + keep = es_keep, + scale = scale, + ) + lsd_mcds.append(sd_mcds) + + # output + self.l_sdmcds = lsd_mcds + return self.l_sdmcds + + + + def get_sdmcds_list(self): + """ + input: + self: TimeSeries class instance. + + output: + self.l_sdmcds: list of chronologically ordered spatialdata mcds objects. + watch out, this is a pointer to the + self.l_sdmcds list of spatialdata mcds objects, not a copy of self.l_sdmcds! + + description: + function returns a binding to the self.l_sdmcds list of spatialdata mcds objects. 
+ """ + return self.l_sdmcds + diff --git a/pcdl/timestep.py b/pcdl/timestep.py index fde35d7..05e1a84 100644 --- a/pcdl/timestep.py +++ b/pcdl/timestep.py @@ -18,6 +18,7 @@ import anndata as ad import bioio_base from bioio.writers import OmeTiffWriter +import geopandas as gpd import matplotlib.pyplot as plt from matplotlib import cm from matplotlib import colors @@ -30,6 +31,8 @@ from pcdl import neuromancer from scipy import io from scipy import sparse +import shapely +import spatialdata as sd import sys import vtk import warnings @@ -460,7 +463,7 @@ def _anndextract(df_cell, scale='maxabs', graph_attached={}, graph_neighbor={}, elif str(se_cell.dtype).startswith('object'): des_type['str'].add(se_cell.name) else: - print(f'Error @ TimeSeries.get_anndata : column {se_cell.name} detected with unknown dtype {str(se_cell.dtype)}.') + print(f'Error @ TimeSeries._anndextract : column {se_cell.name} detected with unknown dtype {str(se_cell.dtype)}.') # build on obs and X anndata object df_cat = df_cell.loc[:,sorted(des_type['str'])].copy() @@ -2348,6 +2351,261 @@ def get_anndata(self, values=1, drop=set(), keep=set(), scale='maxabs'): return annmcds + def get_spatialdata(self, images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs'): + """ + input: + images: set of strings; default {'subs'} + specify if from the subs or cell dataset + a multichannel image should be generated. + so far, only the subs image element is implemented. + + labels: set of strings; default is an empty set + specify if from the subs or cell dataset + a label element should be generated. + so far, neither subs nor cell label elements are implemented. + + points: set of strings; default {'subs'} + specify if from the subs or cell dataset + a points element should be generated. + both, subs and cell point elements, are implemented. 
+ + shapes: set of strings; default {'cell'} + specify if from the subs or cell dataset + a shape element should be generated. + so far, only the cell shape element is implemented. + + values: integer; default is 1 + minimal number of values a variable has to have to be outputted. + variables that have only 1 state carry no information. + None is a state too. + + drop: set of strings; default is an empty set + set of column labels to be dropped from the dataframe. + don't worry: essential columns like ID, coordinates + and time will never be dropped. + Attention: when the keep parameter is given, then + the drop parameter has to be an empty set! + + keep: set of strings; default is an empty set + set of column labels to be kept in the dataframe. + don't worry: essential columns like ID, coordinates + and time will always be kept. + + scale: string; default 'maxabs' + specify how the data should be scaled. + possible values are None, maxabs, minmax, std. + for more information, check out: help(pcdl.scaler) + + output: + sdata: spatialdata object. + + description: + function to transform a mcds time step into + a spatialdata object for downstream analysis. 
+ """ + # set table spatial element links + s_region_subs = None + s_region_cell = None + + # check input + if len(points.intersection(shapes)) > 0: + sys.exit(f'Error @ TimeStep.get_spatialdata : {sorted(points.intersection(shapes))} can only be either a point or a shape element, not both.') + + # handle substrate + b_subs = len(self.get_substrate_list()) > 0 + + # handle domain dimension + b_2d = False + if len(self.get_voxel_ijk_axis()[2]) == 1: + b_2d = True + + # images + dax_image={} + for s_image in images: + s_element = f'{s_image}_image' + if self.verbose: + print(f'processing: {s_element} ...') + # substrate + if (s_image in {'subs'}): + if not b_subs: + pass # microenv not loaded + else: + # get image + a_image = self.make_ome_tiff(focus=self.get_substrate_list(), file=False) + # processing model + if b_2d : + ax_image = sd.models.Image2DModel.parse( + data=a_image[:,0,:,:], # a_cyx + dims=['c','y','x'], + c_coords=self.get_substrate_list(), + scale_factors=None, + ) + else: + ax_image = sd.models.Image3DModel.parse( + data=a_image, # a_czyx + dims=['c','z','y','x'], + c_coords=self.get_substrate_list(), + scale_factors=None, + ) + # update output + dax_image.update({s_element: ax_image}) + # error + else: + sys.exit(f'Error @ TimeStep.get_spatialdata : {s_image} cannot be transformed to an image element.') + + # labels + dax_label = {} + for s_label in labels: + # error + sys.exit(f'Error @ TimeStep.get_spatialdata : {s_label} cannot be transformed to a label element.') + + # points + ddfd_point = {} + for s_point in points: + s_element = f'{s_point}_point' + if self.verbose: + print(f'processing: {s_element} ...') + # substrate + if (s_point in {'subs'}): + if not b_subs: + pass # microenv not loaded + else: + s_region_subs = s_element + #df_point = self.get_conc_df().loc[:,['mesh_center_m','mesh_center_n','mesh_center_p']].reset_index().rename({'index':'subs_idx'}, axis=1) + df_point = self.get_conc_df().loc[:,['mesh_center_m','mesh_center_n','mesh_center_p']] 
+ df_point.index.name = 'subs_idx' + df_point.loc[:, 'mesh_center_m'] = (df_point.loc[:, 'mesh_center_m'] - self.get_xyz_range()[0][0]) + df_point.loc[:, 'mesh_center_n'] = (df_point.loc[:, 'mesh_center_n'] - self.get_xyz_range()[1][0]) + # processing model + if b_2d : + dfd_point = sd.models.PointsModel.parse( + df_point.loc[:,['mesh_center_m','mesh_center_n']], # subs_idx + coordinates={'x':'mesh_center_m', 'y':'mesh_center_n'} + ) + else: + dfd_point = sd.models.PointsModel.parse( + df_point, + coordinates={'x':'mesh_center_m', 'y':'mesh_center_n', 'z':'mesh_center_p'} + ) + # update output + ddfd_point.update({s_element: dfd_point}) + # cell + elif s_point in {'cell'}: + s_region_cell = s_element + #df_point = self.get_cell_df().loc[:,['position_x','position_y','position_z']].reset_index().rename({'ID':'cell_idx'}, axis=1) + df_point = self.get_cell_df().loc[:,['position_x','position_y','position_z']] + df_point.index.name = 'cell_idx' + df_point.loc[:, 'position_x'] = (df_point.loc[:, 'position_x'] - self.get_xyz_range()[0][0]) + df_point.loc[:, 'position_y'] = (df_point.loc[:, 'position_y'] - self.get_xyz_range()[1][0]) + # processing model + if b_2d : + dfd_point = sd.models.PointsModel.parse( + df_point.loc[:,['position_x','position_y']], # cell_idx + coordinates={'x':'position_x', 'y':'position_y'} + ) + else: + dfd_point = sd.models.PointsModel.parse( + df_point, + coordinates={'x':'position_x', 'y':'position_y', 'z':'position_z'} + ) + # update output + ddfd_point.update({s_element: dfd_point}) + # error + else: + sys.exit(f'Error @ TimeStep.get_spatialdata : {s_point} cannot be transformed to a point element.') + + # shapes + ddfg_shape = {} + for s_shape in shapes: + s_element = f'{s_shape}_shape' + if self.verbose: + print(f'processing: {s_element} ...') + # cell center transcript location + if s_shape in {'cell'}: + #df_shape = self.get_cell_df().loc[:,['position_x','position_y','position_z','radius']].reset_index().rename({'ID':'cell_idx'}, axis=1) + 
df_shape = self.get_cell_df().loc[:,['position_x','position_y','position_z','radius']] + df_shape.index.name = 'cell_idx' + df_shape.loc[:, 'position_x'] = (df_shape.loc[:, 'position_x'] - self.get_xyz_range()[0][0]) + df_shape.loc[:, 'position_y'] = (df_shape.loc[:, 'position_y'] - self.get_xyz_range()[1][0]) + if b_2d : + lo_point = [shapely.geometry.Point(row.position_x, row.position_y) for row in df_shape.itertuples()] + else: + lo_point = [shapely.geometry.Point(row.position_x, row.position_y, row.position_z) for row in df_shape.itertuples()] + df_shape.drop({'position_x','position_y','position_z'}, axis=1, inplace=True) + gdf_shape = gpd.GeoDataFrame(df_shape, geometry=lo_point) + s_region_cell = s_element + # error + else: + sys.exit(f'Error @ TimeStep.get_spatialdata : {s_shape} cannot be transformed to a shape element.') + # processing model + gdf_shape = sd.models.ShapesModel.parse(gdf_shape) + # update output + ddfg_shape.update({s_element: gdf_shape}) + + # tables + dad_table = {} + # substrate + if b_subs: + s_element = 'subs_table' + if self.verbose: + print(f'processing: {s_element} ...') + # generate anndata object + df_subs = self.get_conc_df().reset_index().rename({'index':'subs_idx'}, axis=1) # index + df_subs['region'] = pd.Categorical([s_region_subs] * df_subs.shape[0]) + df_count = df_subs.loc[:,self.get_substrate_list()] + df_count.index = df_count.index.astype(str) + df_obs = df_subs.drop(self.get_substrate_list(), axis=1) + df_obs.index = df_obs.index.astype(str) + # link to spatial element + ad_subs = ad.AnnData(df_count, obs=df_obs) + if not (s_region_subs is None): + ad_subs.uns['spatialdata_attrs'] = { + 'region': s_region_subs, # name of the linked element + 'region_key': 'region', # column in obs that links element and observations. + 'instance_key': 'subs_idx', # column in obs that links element index and observations. 
+ } + # parse model + ad_subs = sd.models.TableModel.parse(adata=ad_subs) + dad_table.update({s_element: ad_subs}) + + # cell + s_element = 'cell_table' + if self.verbose: + print(f'processing: {s_element} ...') + # generate anndata object + ad_cell = self.get_anndata(values=values, drop=drop, keep=keep, scale=scale) + ls_position = ['position_x','position_y','position_z'] + for i_position in range(ad_cell.obsm['spatial'].shape[1]): + ad_cell.obs[ls_position[i_position]] = ad_cell.obsm['spatial'][:,i_position] + del ad_cell.obsm['spatial'] + # link to spatial element + ad_cell.obs['region'] = pd.Categorical([s_region_cell] * ad_cell.shape[0]) + ad_cell.obs['cell_idx'] = ad_cell.obs_names.astype(int) + if not (s_region_cell is None): + ad_cell.uns['spatialdata_attrs'] = { + 'region': s_region_cell, # name of the linked element + 'region_key': 'region', # column in obs that links element and observations. + 'instance_key': 'cell_idx', # column in obs that links element index and observations. + } + # parse model + ad_cell = sd.models.TableModel.parse(adata=ad_cell) + dad_table.update({s_element: ad_cell}) + + # glue output together + if self.verbose: + print(f'processing: glue spatialdata object together ...') + sdata = sd.SpatialData( + images=dax_image, # glue by xyz coordinates + labels=dax_label, # glue by xyz coordinates + points=ddfd_point, # glue by index to anndata + shapes=ddfg_shape, # glue by index to anndata + tables=dad_table, # anndata + ) + if self.verbose: + print('done!') + return sdata + + ## LOAD DATA ## def _read_xml(self, xmlfile, output_path='.'): diff --git a/pyproject.toml b/pyproject.toml index 23190fc..f52cd2c 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -41,7 +41,7 @@ dynamic = ["version"] description = "physicell data loader (pcdl) provides a platform independent, python3 based, pip installable interface to load output, generated with the PhysiCell agent based modeling framework, into python3." 
readme = "README.md" -requires-python = ">=3.10, <4" +requires-python = ">=3.11, <4" license = "BSD-3-Clause" #license-files = {paths = ["LICENSE"]} @@ -73,6 +73,7 @@ dependencies = [ "anndata>=0.10.8", "bioio>=1.2.1", # needs numpy < 2.0.0 "bioio-ome-tiff", + "geopandas>=0.14", # spatialdata==0.6.0 "matplotlib", "neuroglancer", "numpy", @@ -80,6 +81,8 @@ dependencies = [ "requests", "scikit-image>=0.24.0", "scipy>=1.13.0", + "shapely>=2.0.1", # spatialdata==0.6.0 + "spatialdata>=0.6.0", "vtk", ] @@ -105,6 +108,7 @@ pcdl_make_graph_gml = "pcdl.commandline:make_graph_gml" pcdl_plot_scatter = "pcdl.commandline:plot_scatter" pcdl_make_cell_vtk = "pcdl.commandline:make_cell_vtk" # substrate and cell agent +pcdl_get_spatialdata = "pcdl.commandline:get_spatialdata" pcdl_plot_timeseries = "pcdl.commandline:plot_timeseries" pcdl_make_ome_tiff = "pcdl.commandline:make_ome_tiff" pcdl_render_neuroglancer = "pcdl.commandline:render_neuroglancer" diff --git a/test/test_commandline_2d.py b/test/test_commandline_2d.py index e4ca9e2..b3b8ea8 100644 --- a/test/test_commandline_2d.py +++ b/test/test_commandline_2d.py @@ -885,6 +885,15 @@ def test_pcdl_get_anndata_timestep(self): os.remove(f'{s_path_2d}/output00000024_cell_maxabs.h5ad') assert o_result.returncode == 0 + def test_pcdl_get_anndata_timestep_customtype(self): + o_result = subprocess.run(['pcdl_get_anndata', s_pathfile_2d, '--custom_data_type', 'sample:bool'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + os.remove(f'{s_path_2d}/output00000024_cell_maxabs.h5ad') + assert o_result.returncode == 0 + def test_pcdl_get_anndata_timestep_microenv(self): o_result = subprocess.run(['pcdl_get_anndata', s_pathfile_2d, '--microenv', 'false'], check=False, capture_output=True) print(f'o_result: {o_result}\n') @@ -1392,6 +1401,275 @@ def 
test_pcdl_make_cell_vtk_timestep_attribute_many(self): # substrate and cell agenat test code # ####################################### +class TestCommandLineInterfaceSpatialdata(object): + ''' tests for one pcdl command line interface function. ''' + + # timestep and timeseries: + # + path nop + # + customtype ([], _sample:bool_) ok + # + microenv (true, _false_) ok + # + graph (true, _false_) ok + # + physiboss (true, _false_) ok + # + settingxml (string, _none_, _false_) ok + # + verbose (true, _false_) nop + # + points (subs cell) ok + # + shapes (cell) ok + # + values (int) ok + # + drop (cell_type oxygen) ok + # + keep (cell_type oxygen) ok + # + scale (maxabs, _std_) ok + + # timeseries + def test_pcdl_get_spatialdata_timeseries(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_customtype(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--custom_data_type', 'sample:bool'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_microenv(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--microenv', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + 
print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_graph(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--graph', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_physiboss(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--physiboss', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_settingxmlfalse(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--settingxml', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_settingxmlnone(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--settingxml', 'none'], check=False, capture_output=True) + print(f'o_result: 
{o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_imageslabelspointsshapes_default(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--images', 'subs', '--labels', '--points', 'subs', '--shapes', 'cell'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_imageslabelspointsshapes_none(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--images', '--labels', '--points', '--shapes'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_value(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '2'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def 
test_pcdl_get_spatialdata_timeseries_drop(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--drop', 'cell_type', 'oxygen'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_keep(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--keep', 'cell_type', 'oxygen'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timeseries_scale(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_path_2d, '--scale', 'std'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + for i_step in range(25): + shutil.rmtree(f'{s_path_2d}/output000000{str(i_step).zfill(2)}_std.zarr') + assert o_result.returncode == 0 + + # timestep + def test_pcdl_get_spatialdata_timestep(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def 
test_pcdl_get_spatialdata_timestep_customtype(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--custom_data_type', 'sample:bool'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_microenv(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--microenv', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_graph(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--graph', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_physiboss(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--physiboss', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_settingxmlfalse(self): + o_result = subprocess.run(['pcdl_get_spatialdata', 
s_pathfile_2d, '--settingxml', 'false'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + + def test_pcdl_get_spatialdata_timestep_settingxmlnone(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--settingxml', 'none'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_imageslabelspointsshapes_default(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--images', 'subs', '--labels', '--points', 'subs', '--shapes', 'cell'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_imageslabelspointsshapes_none(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--images', '--labels', '--points', '--shapes'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_value(self): + o_result = subprocess.run(['pcdl_get_spatialdata', 
s_pathfile_2d, '2'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_drop(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--drop', 'cell_type', 'oxygen'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_keep(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--keep', 'cell_type', 'oxygen'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_maxabs.zarr') + assert o_result.returncode == 0 + + def test_pcdl_get_spatialdata_timestep_scale(self): + o_result = subprocess.run(['pcdl_get_spatialdata', s_pathfile_2d, '--scale', 'std'], check=False, capture_output=True) + print(f'o_result: {o_result}\n') + print(f'o_result.returncode: {o_result.returncode}\n') + print(f'o_result.stdout: {o_result.stdout}\n') + print(f'o_result.stderr: {o_result.stderr}\n') + shutil.rmtree(f'{s_path_2d}/output00000024_std.zarr') + assert o_result.returncode == 0 + + class TestCommandLineInterfacePlotTimeSeries(object): ''' tests for one pcdl command line interface function. 
''' @@ -1569,7 +1847,7 @@ def test_pcdl_make_ome_tiff_timeseries_conccutoff_oxygenminusone(self): os.remove(f'{s_path_2d}/timeseries_oxygen-1_water_default_blood_cells_ID.ome.tiff') assert o_result.returncode == 0 - def test_pcdl_make_ome_tiff_timeseries_focus_(self): + def test_pcdl_make_ome_tiff_timeseries_focus_oxygen(self): o_result = subprocess.run(['pcdl_make_ome_tiff', s_path_2d, '--focus', 'oxygen'], check=False, capture_output=True) print(f'o_result: {o_result}\n') print(f'o_result.returncode: {o_result.returncode}\n') diff --git a/test/test_timeseries_2d.py b/test/test_timeseries_2d.py index 1477da8..deb057e 100644 --- a/test/test_timeseries_2d.py +++ b/test/test_timeseries_2d.py @@ -983,3 +983,35 @@ def test_mcdsts_get_anndata_keepmcdsfalse(self): (ann.var.shape == (105, 0)) and \ (len(ann.uns) == 0) + +## spatialdata time series related functions ## +class TestTimeSeriesSpatialData(object): + ''' test for pcdl.TimeSeries class. ''' + + # get_sdmcds_list {integrated} + # get_cell_attributes ok + # get_spatialdata ok + # keep_mcds {True, _False_} + + def test_mcdsts_get_spatialdata_default(self): + mcdsts = pcdl.TimeSeries(s_path_2d, verbose=True) + lo_sdmcds_output = mcdsts.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs', keep_mcds=True) + lo_sdmcds_memory = mcdsts.get_sdmcds_list() + assert(str(type(mcdsts)) == "") and \ + (len(mcdsts.l_mcds) == 25) and \ + (len(mcdsts.l_sdmcds) == 25) and \ + (lo_sdmcds_output == mcdsts.l_sdmcds) and \ + (lo_sdmcds_output == lo_sdmcds_memory) and \ + (str(type(lo_sdmcds_output[8])) == "") + + def test_mcdsts_get_spatialdata_keepmcdsfalse(self): + mcdsts = pcdl.TimeSeries(s_path_2d, verbose=True) + lo_sdmcds_output = mcdsts.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs', keep_mcds=False) + lo_sdmcds_memory = mcdsts.get_sdmcds_list() + 
assert(str(type(mcdsts)) == "") and \ + (len(mcdsts.l_mcds) == 0) and \ + (len(mcdsts.l_sdmcds) == 25) and \ + (lo_sdmcds_output == mcdsts.l_sdmcds) and \ + (lo_sdmcds_output == lo_sdmcds_memory) and \ + (str(type(lo_sdmcds_output[8])) == "") + diff --git a/test/test_timeseries_3d.py b/test/test_timeseries_3d.py index 83ed8e2..77f4ae8 100644 --- a/test/test_timeseries_3d.py +++ b/test/test_timeseries_3d.py @@ -846,3 +846,35 @@ def test_mcdsts_get_anndata_keepmcdsfalse(self): (ann.var.shape == (105, 0)) and \ (len(ann.uns) == 0) + +## spatialdata time series related functions ## +class TestTimeSeriesSpatialData(object): + ''' test for pcdl.TimeSeries class. ''' + + # get_sdmcds_list {integrated} + # get_cell_attributes ok + # get_spatialdata ok + # keep_mcds {True, _False_} + + def test_mcdsts_get_spatialdata_default(self): + mcdsts = pcdl.TimeSeries(s_path_3d, verbose=True) + lo_sdmcds_output = mcdsts.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs', keep_mcds=True) + lo_sdmcds_memory = mcdsts.get_sdmcds_list() + assert(str(type(mcdsts)) == "") and \ + (len(mcdsts.l_mcds) == 25) and \ + (len(mcdsts.l_sdmcds) == 25) and \ + (lo_sdmcds_output == mcdsts.l_sdmcds) and \ + (lo_sdmcds_output == lo_sdmcds_memory) and \ + (str(type(lo_sdmcds_output[8])) == "") + + def test_mcdsts_get_spatialdata_keepmcdsfalse(self): + mcdsts = pcdl.TimeSeries(s_path_3d, verbose=True) + lo_sdmcds_output = mcdsts.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs', keep_mcds=False) + lo_sdmcds_memory = mcdsts.get_sdmcds_list() + assert(str(type(mcdsts)) == "") and \ + (len(mcdsts.l_mcds) == 0) and \ + (len(mcdsts.l_sdmcds) == 25) and \ + (lo_sdmcds_output == mcdsts.l_sdmcds) and \ + (lo_sdmcds_output == lo_sdmcds_memory) and \ + (str(type(lo_sdmcds_output[8])) == "") + diff --git a/test/test_timestep_2d.py b/test/test_timestep_2d.py 
index 7d76341..6c75084 100644 --- a/test/test_timestep_2d.py +++ b/test/test_timestep_2d.py @@ -1081,3 +1081,68 @@ def test_mcds_get_anndata(self): (ann.var.shape == (105, 0)) and \ (len(ann.uns) == 1) + +## spatialdata time step related functions ## +class TestTimeStepSpatialData(object): + ''' test for pcdl.TimeStep class. ''' + + ## get_spatialdata command ## + def test_mcds_get_spatialdata_default(self): + mcds = pcdl.TimeStep(s_pathfile_2d, verbose=False) + sdata = mcds.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['subs_image'])) == "") and \ + (sdata['subs_image'].shape == (2, 200, 300)) and \ + (str(type(sdata['subs_point'])) == "") and \ + (sdata['subs_point'].compute().shape[0] > 9) and \ + (sdata['subs_point'].compute().shape[1] == 2) and \ + (str(type(sdata['cell_shape'])) == "") and \ + (sdata['cell_shape'].shape[0] > 9) and \ + (sdata['cell_shape'].shape[1] == 2) and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] == 2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) and \ + (len(sdata['subs_table'].uns) == 1) + + def test_mcds_get_spatialdata_points(self): + mcds = pcdl.TimeStep(s_pathfile_2d, verbose=False) + sdata = mcds.get_spatialdata(images={'subs'}, labels=set(), points={'subs','cell'}, shapes=set(), values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['subs_image'])) == "") and \ + (sdata['subs_image'].shape == (2, 200, 300)) and \ + (str(type(sdata['subs_point'])) == "") and \ + (sdata['subs_point'].compute().shape[0] > 9) and \ + 
(sdata['subs_point'].compute().shape[1] == 2) and \ + (str(type(sdata['cell_point'])) == "") and \ + (sdata['cell_point'].compute().shape[0] > 9) and \ + (sdata['cell_point'].compute().shape[1] == 2) and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] == 2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) and \ + (len(sdata['subs_table'].uns) == 1) + + def test_mcds_get_spatialdata_none(self): + mcds = pcdl.TimeStep(s_pathfile_2d, verbose=False) + sdata = mcds.get_spatialdata(images=set(), labels=set(), points=set(), shapes=set(), values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] ==2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) diff --git a/test/test_timestep_3d.py b/test/test_timestep_3d.py index 2e66863..dd522b9 100644 --- a/test/test_timestep_3d.py +++ b/test/test_timestep_3d.py @@ -670,3 +670,69 @@ def test_mcds_get_anndata(self): (ann.var.shape == (105, 0)) and \ (len(ann.uns) == 1) + +## spatialdata time step related functions ## +class TestTimeStepSpatialData(object): + ''' test for pcdl.TimeStep class. 
''' + + ## get_spatialdata command ## + def test_mcds_get_spatialdata_default(self): + mcds = pcdl.TimeStep(s_pathfile_3d, verbose=False) + sdata = mcds.get_spatialdata(images={'subs'}, labels=set(), points={'subs'}, shapes={'cell'}, values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['subs_image'])) == "") and \ + (sdata['subs_image'].shape == (2,11,200,300)) and \ + (str(type(sdata['subs_point'])) == "") and \ + (sdata['subs_point'].compute().shape[0] > 9) and \ + (sdata['subs_point'].compute().shape[1] == 3) and \ + (str(type(sdata['cell_shape'])) == "") and \ + (sdata['cell_shape'].shape[0] > 9) and \ + (sdata['cell_shape'].shape[1] == 2) and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] == 2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) and \ + (len(sdata['subs_table'].uns) == 1) + + def test_mcds_get_spatialdata_points(self): + mcds = pcdl.TimeStep(s_pathfile_3d, verbose=False) + sdata = mcds.get_spatialdata(images={'subs'}, labels=set(), points={'subs','cell'}, shapes=set(), values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['subs_image'])) == "") and \ + (sdata['subs_image'].shape == (2,11,200,300)) and \ + (str(type(sdata['subs_point'])) == "") and \ + (sdata['subs_point'].compute().shape[0] > 9) and \ + (sdata['subs_point'].compute().shape[1] == 3) and \ + (str(type(sdata['cell_point'])) == "") and \ + (sdata['cell_point'].compute().shape[0] > 9) and \ + (sdata['cell_point'].compute().shape[1] == 3) and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 
9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] == 2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) and \ + (len(sdata['subs_table'].uns) == 1) + + def test_mcds_get_spatialdata_none(self): + mcds = pcdl.TimeStep(s_pathfile_3d, verbose=False) + sdata = mcds.get_spatialdata(images=set(), labels=set(), points=set(), shapes=set(), values=1, drop=set(), keep=set(), scale='maxabs') + assert(str(type(mcds)) == "") and \ + (str(type(sdata)) == "") and \ + (str(type(sdata['cell_table'])) == "") and \ + (sdata['cell_table'].shape[0] > 9) and \ + (sdata['cell_table'].shape[1] > 9) and \ + (str(type(sdata['subs_table'])) == "") and \ + (sdata['subs_table'].shape[0] > 9) and \ + (sdata['subs_table'].shape[1] ==2) and \ + (sdata['subs_table'].obs.shape[0] > 9) and \ + (sdata['subs_table'].obs.shape[1] == 11) +