Dataset loading problem with cell-load #67

@KamijouMikoto12311

I am loading my h5ad file with cell-load 0.8.5, using a TOML config that looks like this:

[datasets]
replogle_h1 = "competition_support_set/{competition_train,k562_gwps,rpe1,jurkat,k562,hepg2}.h5"
filtered = "other data, but not val_hesc"
val_hesc = "competition_support_set/data/PerturBase_hESC_decompressed.h5ad"

[training]
replogle_h1 = "train"
filtered = "train"

[zeroshot]
"val_hesc.hesc" = "test"

[fewshot]

My PerturBase_hESC_decompressed.h5ad contains 17234 entries, so when the data is processed I expect:

Processed PerturBase_hESC_decompressed: 0 train, 0 val, 17234 test

However, I am getting:

Processed PerturBase_hESC_decompressed: 875 train, 0 val, 17234 test

The "875 train" cells seem to come from nowhere, which is confusing. Downgrading to cell-load 0.7.6 immediately fixes the problem, which suggests a regression somewhere between 0.7.6 and 0.8.5. I hope the developers can take a look.
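
The expectation of 0 train cells can be checked against the config alone, without calling cell-load. The following is a minimal sketch that parses the TOML with the standard library and lists which splits each dataset is assigned to. The helper name `expected_roles` is hypothetical, and the reading of the sections (`[training]` maps a whole dataset to a split, `[zeroshot]` keys have the form `dataset.celltype`) is an assumption taken from the section names above, not from cell-load's source.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ImportError:  # older interpreters: fall back to the tomli backport
    import tomli as tomllib

# The config from this issue, inlined so the sketch is self-contained.
CONFIG = """
[datasets]
replogle_h1 = "competition_support_set/{competition_train,k562_gwps,rpe1,jurkat,k562,hepg2}.h5"
filtered = "other data, but not val_hesc"
val_hesc = "competition_support_set/data/PerturBase_hESC_decompressed.h5ad"

[training]
replogle_h1 = "train"
filtered = "train"

[zeroshot]
"val_hesc.hesc" = "test"

[fewshot]
"""


def expected_roles(config_text):
    """Map each dataset name to the set of splits the config assigns to it.

    Hypothetical helper: it only reads the TOML and assumes [zeroshot]
    keys look like "dataset.celltype". It does not use cell-load.
    """
    cfg = tomllib.loads(config_text)
    roles = {name: set() for name in cfg.get("datasets", {})}
    # [training] entries assign a whole dataset to a split.
    for name, split in cfg.get("training", {}).items():
        roles.setdefault(name, set()).add(split)
    # [zeroshot] entries assign one cell type of a dataset to a split.
    for key, split in cfg.get("zeroshot", {}).items():
        dataset = key.split(".", 1)[0]
        roles.setdefault(dataset, set()).add(split)
    return roles


roles = expected_roles(CONFIG)
for name, splits in roles.items():
    print(name, sorted(splits))
```

Under these assumptions, `val_hesc` is assigned only to `test`, so nothing in the config itself asks for any of its cells to go to the train split.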
