Reproducing deepclean-prod in deepcleanv2 #40

@agoodmanjerry

Description

  1. /projects/train/train/data.py
    • DeepCleanDataset.setup(stage=stage)
      • Lines 200-202: apply the bandpass filter first, then y_scaler.
  2. Make sure the following arguments in /projects/train/config.yaml match the deepclean-prod config:
    • train_duration
    • train_stride
    • valid_frac
    • batch_size
    • kernel_length
    • filt_order
  3. There are differences between the nn models loaded in deepcleanv2 and deepclean-prod:
    • /projects/train/train/architectures/autoencoder.py
      a. Autoencoder uses nn.ReLU() as its activation function, whereas deepclean-prod uses nn.Tanh().
      b. On line 86, out_layers should be hidden_channels[-2:None:-1] + [num_witnesses] to match deepclean-prod. hidden_channels is [8, 16, 32, 64], as set in /projects/train/config.yaml, so the reversed slice evaluates to [32, 16, 8].
  4. /projects/train/train/metrics.py
    • PsdRatio.spectral_density: setting average="mean" and fast=False should make the loss during training steps be computed the same way as in deepclean-prod.
  5. The optimizer and lr_scheduler
    • Make sure lr and weight_decay of the optimizer in /projects/train/config.yaml match the values used in deepclean-prod.
    • The lr_scheduler in /projects/train/train/cli.py is OneCycleLR, but deepclean-prod uses StepLR(optimizer, 10, 0.1).
  6. To reproduce the cleaning process of deepclean-prod, we need to build a new torchmetrics.Metric class (OfflinePsdRatio, for example) so that data aggregation and postprocessing match deepclean-prod. Then change the metric argument of train.model.DeepClean to OfflinePsdRatio and use /projects/infer/infer/cli.py to do the cleaning.
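
For item 2, a small helper could flag any of the listed arguments that differ between the two configs. This is only a sketch: it assumes both configs have already been loaded into flat dicts (the real config.yaml may nest these keys), and `config_mismatches` and the example values are hypothetical, not part of either repository.

```python
# Keys listed in item 2 of this issue.
KEYS = (
    "train_duration",
    "train_stride",
    "valid_frac",
    "batch_size",
    "kernel_length",
    "filt_order",
)


def config_mismatches(v2_cfg, prod_cfg, keys=KEYS):
    """Return {key: (v2_value, prod_value)} for every key whose values differ.

    Assumption: both configs are flat dicts. A key missing from either
    side shows up as None and is reported as a mismatch.
    """
    diffs = {}
    for key in keys:
        v2_value = v2_cfg.get(key)
        prod_value = prod_cfg.get(key)
        if v2_value != prod_value:
            diffs[key] = (v2_value, prod_value)
    return diffs


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    v2 = {"batch_size": 32, "filt_order": 8}
    prod = {"batch_size": 64, "filt_order": 8}
    for key, (a, b) in config_mismatches(v2, prod).items():
        print(f"{key}: deepcleanv2={a!r} deepclean-prod={b!r}")
```

An empty result means the six arguments agree; anything else lists the values to reconcile before comparing training runs.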
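
The slice in item 3b is easy to misread, so here is what it evaluates to in plain Python. `hidden_channels` is the value quoted above; `num_witnesses` is a placeholder (the real value depends on the witness channel list), and `decoder_out_channels` is a hypothetical helper name used only for this illustration.

```python
def decoder_out_channels(hidden_channels, num_witnesses):
    """Mirror the encoder widths, skipping the innermost one.

    hidden_channels[-2:None:-1] starts at the second-to-last element
    and walks backwards to the start of the list.
    """
    return hidden_channels[-2:None:-1] + [num_witnesses]


if __name__ == "__main__":
    hidden_channels = [8, 16, 32, 64]  # from /projects/train/config.yaml
    num_witnesses = 21                 # assumption: placeholder value
    print(decoder_out_channels(hidden_channels, num_witnesses))
    # prints [32, 16, 8, 21]
```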
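
For item 5, it may help to see what StepLR(optimizer, 10, 0.1) does to the learning rate. In PyTorch the positional arguments are step_size=10 and gamma=0.1, so the rate is multiplied by 0.1 every 10 scheduler steps. The pure-Python sketch below reproduces that closed form without importing torch; `step_lr` is a hypothetical helper, the base learning rate is a made-up value, and it assumes the scheduler is stepped once per epoch.

```python
def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate after `epoch` scheduler steps under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    base_lr = 1e-3  # assumption: illustrative value, not taken from either config
    for epoch in (0, 9, 10, 19, 20):
        print(epoch, step_lr(base_lr, epoch))
```

OneCycleLR, by contrast, warms the rate up and then anneals it over the whole run, so the two schedules can give different training dynamics even when lr and weight_decay match.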
