Conversation

wbenoit26 (Contributor) commented Jul 3, 2025:

  • Renames the TrainingWaveforms task to DeployTrainingWaveforms and adds a new TrainingWaveforms task that merges the files into a single one, as is already done for the validation and testing waveforms (a sketch of such a merge step is shown after this list).
  • Switches the default approximant back to IMRPhenomPv2
  • Adds ml4gw_generation_params property to the LALParameterSet object
  • Copies over the WaveformSampler objects from AMPLFI with a few changes for Aframe's use case
  • Removes the old waveform_sampler.py file
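For context, here is a minimal sketch of what merging several waveform files into one training file might look like. It assumes each file stores flat datasets under identical keys; the function name, key layout, and file structure are assumptions for illustration, not Aframe's actual schema.

```python
import h5py
import numpy as np


def merge_waveform_files(shard_paths, out_path):
    """Concatenate the datasets of several waveform files into one file.

    Assumes every input file holds flat datasets under identical keys
    that can be concatenated along their first axis; the keys and layout
    here are hypothetical.
    """
    merged = {}
    for path in shard_paths:
        with h5py.File(path, "r") as f:
            for key in f.keys():
                merged.setdefault(key, []).append(f[key][:])

    with h5py.File(out_path, "w") as f:
        for key, arrays in merged.items():
            f.create_dataset(key, data=np.concatenate(arrays, axis=0))
```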

To-do:

  • Switch from the coalescence_time parameterization to a right_pad parameterization, while avoiding confusion with the right_pad used for waveform placement in the kernel (a sketch of the idea follows this list)
  • Potentially move the waveform slicing logic to the waveform sampler
  • Does it ever make sense to switch to using ml4gw to generate the validation/testing waveforms? Pros: it would simplify a lot of the logic in ledger/injections.py and remove the flaky ligolw dependency from many environments/containers. Cons: ml4gw is slower than PyCBC on CPU and can't handle mass ratios above ~0.999. Decision: no.
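As a rough illustration of the right_pad idea from the first to-do item: instead of specifying an absolute coalescence time, the coalescence is placed a fixed duration before the right edge of the kernel. The names and conventions below are hypothetical, not the eventual implementation.

```python
import torch.nn.functional as F


def place_with_right_pad(waveforms, sample_rate, kernel_length, right_pad):
    """Place the coalescence `right_pad` seconds before the right edge of
    the kernel. Assumes the coalescence sits at the last sample of each
    input waveform; all names here are illustrative.
    """
    kernel_size = int(kernel_length * sample_rate)
    pad = int(right_pad * sample_rate)
    # append zeros after the coalescence, then keep the final kernel_size samples
    padded = F.pad(waveforms, (0, pad))
    return padded[..., -kernel_size:]
```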

@EthanMarx I'm running training with this right now, and there's still a bit more to do, but it would be good to have you take a look at this when you get a chance.

wbenoit26 requested a review from EthanMarx on July 3, 2025 14:25
wbenoit26 (Contributor, Author) commented:

@EthanMarx The sensitive volume looks exactly like it should for this training, even though the validation curve stayed a lot lower than normal. Not sure why that would be, but I think it's worth trying to figure out before merging this in. Also, I want to investigate the spikes in the training loss more. Still, good to see that it performs as expected.

wbenoit26 (Contributor, Author) commented:

@EthanMarx Fixed both of the issues mentioned above. The training spikes were resolved by constraining the training waveform mass ratio to be less than 0.999. I was seeing NaN waveforms for mass ratios too close to 1, but I guess the NaNs didn't propagate all the way to the loss for whatever reason. The validation waveforms were my own fault: I had briefly tried generating validation waveforms during training, abandoned that idea, but forgot to remove the code that treated the validation waveforms like polarizations, so they were getting re-projected.
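As a defensive illustration of the NaN issue (not the actual fix in this PR, which constrains the sampled mass ratio), one could screen generated waveforms for non-finite values before they can reach the loss. The function name, cap value usage, and tensor shape assumption are hypothetical.

```python
import torch

MAX_MASS_RATIO = 0.999  # assumed cap below which waveform generation stays finite


def drop_nonfinite(waveforms: torch.Tensor) -> torch.Tensor:
    """Discard any waveforms containing NaNs or infs so they cannot
    silently corrupt the training loss. Assumes shape (batch, ..., time).
    """
    finite = torch.isfinite(waveforms.flatten(1)).all(dim=-1)
    if not bool(finite.all()):
        print(f"Discarding {int((~finite).sum())} non-finite waveforms")
    return waveforms[finite]
```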

The new training run is here, and the sensitive volume plot is here. Both look like they should.

wbenoit26 (Contributor, Author) commented:

@EthanMarx This is ready for review again. I've tested loading from disk vs. generating on the fly for both time domain and multimodal. Curious if you see a better approach than the WaveformLoader.
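For readers outside the PR, here is an illustrative sketch of the disk-loading path. This is not the PR's WaveformLoader; the class name, "cross"/"plus" dataset names, and sampling scheme are assumptions.

```python
import h5py
import torch


class SimpleWaveformLoader:
    """Read a merged waveform file once and serve random batches of
    polarizations. The "cross"/"plus" dataset names are hypothetical.
    """

    def __init__(self, fname: str):
        with h5py.File(fname, "r") as f:
            self.cross = torch.as_tensor(f["cross"][:], dtype=torch.float32)
            self.plus = torch.as_tensor(f["plus"][:], dtype=torch.float32)

    def sample(self, batch_size: int) -> torch.Tensor:
        idx = torch.randint(len(self.cross), (batch_size,))
        # stack polarizations along a channel dimension: (batch, 2, time)
        return torch.stack([self.cross[idx], self.plus[idx]], dim=1)
```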

EthanMarx (Contributor) commented:

Looks good. The ability to use multiple waveform files still exists, right?

wbenoit26 (Contributor, Author) commented:

No, but I could add that back in. The sandbox pipeline will merge the training waveforms into a single file by default now.

wbenoit26 (Contributor, Author) commented:

@EthanMarx This can now work with multiple training waveform files, though the pipeline isn't set up to produce those. That can be a separate PR, I think.
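One way multiple files could be ingested without merging them first, purely as a hedged sketch: weight each file by how many waveforms it holds and draw a batch across them. The "cross" dataset key and the sampling scheme are assumptions, not the PR's implementation.

```python
import h5py
import numpy as np
import torch


def sample_across_files(fnames, batch_size):
    """Draw a batch spread across several waveform files, weighting each
    file by how many waveforms it contains. Illustrative only.
    """
    sizes = []
    for fname in fnames:
        with h5py.File(fname, "r") as f:
            sizes.append(len(f["cross"]))
    counts = np.random.multinomial(batch_size, np.array(sizes) / sum(sizes))

    batches = []
    for fname, size, n in zip(fnames, sizes, counts):
        if n == 0:
            continue
        with h5py.File(fname, "r") as f:
            # h5py fancy indexing needs sorted, unique indices
            idx = np.sort(np.random.choice(size, n, replace=False))
            batches.append(torch.as_tensor(f["cross"][idx], dtype=torch.float32))
    return torch.cat(batches)
```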

wbenoit26 (Contributor, Author) commented:

@EthanMarx Checking in on this. Anything else you'd like to see as part of this PR?

EthanMarx (Contributor) commented:

Okay, I was worried about the multiple-waveforms thing, but if the training pipeline can at least ingest existing ones, that's fine.

wbenoit26 merged commit 74a78e4 into ML4GW:main on Oct 15, 2025
12 checks passed