@Julie-Fabre

This PR ports bombcell-style unit classification to SpikeInterface.

Template metrics
  • Rewrote peak/trough detection with a new get_trough_and_peak_idx() function that uses scipy.signal.find_peaks(). SpikeInterface stores templates computed from raw data rather than the heavily smoothed templates used in template matching, so the waveforms can be noisy; you can optionally apply Savitzky-Golay smoothing before detection. The function returns dicts for troughs, peaks before, and peaks after, each containing indices, values, prominences, and widths.
from spikeinterface.postprocessing import get_trough_and_peak_idx

troughs, peaks_before, peaks_after = get_trough_and_peak_idx(
    templates,
    sampling_frequency,
    smooth=True,
    min_thresh_detect_peaks_troughs=0.4,
)
  • New metrics: peak_before_to_trough_ratio, peak_after_to_trough_ratio, waveform_baseline_flatness, peak_before_width, trough_width, main_peak_to_trough_ratio.

  • Renamed peak_to_valley to peak_to_trough_duration.

analyzer.compute("template_metrics", metric_names=[
    "peak_before_to_trough_ratio",
    "waveform_baseline_flatness",
    "trough_width",
])
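
To make the detection idea concrete, here is a minimal single-channel sketch of smoothing followed by scipy.signal.find_peaks(). It is not the code from this PR: detect_trough_and_peaks, _pack, window_length, polyorder and min_prominence_fraction are hypothetical names chosen for illustration, and the prominence rule (a fraction of the peak-to-peak range) is an assumption rather than the exact meaning of min_thresh_detect_peaks_troughs.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter


def _pack(signal, idx, props):
    # Bundle the per-extremum properties reported by find_peaks
    return {
        "indices": idx,
        "values": signal[idx],
        "prominences": props["prominences"],
        "widths": props["widths"],
    }


def detect_trough_and_peaks(template, smooth=True, window_length=11,
                            polyorder=3, min_prominence_fraction=0.2):
    """Illustrative detection on a 1D template (hypothetical helper)."""
    wf = np.asarray(template, dtype=float)
    if smooth:
        # Optional Savitzky-Golay smoothing to tame noisy raw-data templates
        wf = savgol_filter(wf, window_length, polyorder)
    # Assumed rule: keep extrema whose prominence exceeds a fraction of the range
    min_prominence = min_prominence_fraction * np.ptp(wf)

    # Troughs are peaks of the inverted waveform; width=0 makes find_peaks return widths
    trough_idx, trough_props = find_peaks(-wf, prominence=min_prominence, width=0)
    troughs = _pack(wf, trough_idx, trough_props)
    # Main trough: the most prominent one, falling back to the global minimum
    if trough_idx.size:
        main_trough = int(trough_idx[np.argmax(trough_props["prominences"])])
    else:
        main_trough = int(np.argmin(wf))

    # Positive peaks, split by their position relative to the main trough
    peak_idx, peak_props = find_peaks(wf, prominence=min_prominence, width=0)
    before = peak_idx < main_trough
    after = peak_idx > main_trough
    peaks_before = _pack(wf, peak_idx[before], {k: v[before] for k, v in peak_props.items()})
    peaks_after = _pack(wf, peak_idx[after], {k: v[after] for k, v in peak_props.items()})
    return troughs, peaks_before, peaks_after


# Synthetic waveform: trough at sample 40, smaller repolarization peak at sample 55
rng = np.random.default_rng(0)
t = np.arange(90)
clean = -np.exp(-0.5 * ((t - 40) / 3) ** 2) + 0.4 * np.exp(-0.5 * ((t - 55) / 6) ** 2)
noisy = clean + rng.normal(0, 0.02, t.size)
troughs, peaks_before, peaks_after = detect_trough_and_peaks(noisy)
```

The prominence threshold is what keeps small noise bumps out of the result; without smoothing, the same threshold still applies but the reported indices and widths tend to jitter more.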
Quality metrics
  • Added snr_bombcell: peak amplitude divided by the baseline median absolute deviation (MAD).
analyzer.compute("quality_metrics", metric_names=["snr_bombcell"])
  • amplitude_cutoff now has parameters for controlling the histogram fitting:
analyzer.compute("quality_metrics", metric_names=["amplitude_cutoff"], qm_params={
    "amplitude_cutoff": {
        "num_histogram_bins": 100,
        "histogram_smoothing_value": 3,
    }
})
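
As a rough illustration of "peak amplitude over baseline MAD", the sketch below computes such a ratio with NumPy on one channel. The function name, the choice of the first n_baseline_samples samples as the baseline window, and the use of an unscaled MAD are all assumptions made for this example; the implementation in this PR may differ on each point.

```python
import numpy as np


def snr_peak_over_baseline_mad(waveforms, n_baseline_samples=20):
    """Hypothetical helper. waveforms: (num_spikes, num_samples) on one channel."""
    waveforms = np.asarray(waveforms, dtype=float)
    # Peak amplitude of the median waveform (sign-agnostic)
    template = np.median(waveforms, axis=0)
    peak_amplitude = np.max(np.abs(template))
    # Baseline: samples before the spike, pooled over all spikes (assumed window)
    baseline = waveforms[:, :n_baseline_samples].ravel()
    # Median absolute deviation, without the 1.4826 Gaussian scaling factor
    mad = np.median(np.abs(baseline - np.median(baseline)))
    return peak_amplitude / mad


# Synthetic data: unit-variance noise with a -20 deflection at sample 30
rng = np.random.default_rng(0)
waveforms = rng.normal(0.0, 1.0, size=(200, 60))
waveforms[:, 30] -= 20.0
snr = snr_peak_over_baseline_mad(waveforms)
```

Because the MAD of unit-variance Gaussian noise is about 0.67, this ratio comes out larger than a standard-deviation-based SNR on the same data.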
Unit classification
  • New in spikeinterface.curation:
import spikeinterface.curation as sc

thresholds = sc.bombcell_get_default_thresholds()
unit_type, unit_type_string = sc.bombcell_classify_units(
    quality_metrics,
    thresholds=thresholds,
    classify_non_somatic=True,
)
summary = sc.get_classification_summary(unit_type, unit_type_string)

Units get classified as NOISE → MUA → GOOD based on successive threshold checks. With classify_non_somatic=True, an optional NON_SOMA category is added for non-somatic waveforms.
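
The successive checks can be sketched as follows. This is a simplified illustration, not the function from this PR: the integer codes, the classify_sketch and outside helpers, and the snr threshold are hypothetical, missing values are treated as failures for both stages as a simplification, and the NON_SOMA step is left out.

```python
import numpy as np

# Hypothetical integer codes, for this example only
NOISE, GOOD, MUA = 0, 1, 2
LABELS = np.array(["NOISE", "GOOD", "MUA"])


def outside(values, thresh):
    """True where a value is NaN or falls outside [min, max]; None means unbounded."""
    values = np.asarray(values, dtype=float)
    bad = np.isnan(values)
    if thresh.get("min") is not None:
        bad |= values < thresh["min"]
    if thresh.get("max") is not None:
        bad |= values > thresh["max"]
    return bad


def classify_sketch(metrics, noise_thresholds, mua_thresholds):
    n_units = len(next(iter(metrics.values())))
    noise_mask = np.zeros(n_units, dtype=bool)
    mua_mask = np.zeros(n_units, dtype=bool)
    for name, thresh in noise_thresholds.items():
        noise_mask |= outside(metrics[name], thresh)
    for name, thresh in mua_thresholds.items():
        mua_mask |= outside(metrics[name], thresh)
    # Start from GOOD, demote to MUA, then to NOISE (NOISE takes precedence)
    unit_type = np.full(n_units, GOOD)
    unit_type[mua_mask] = MUA
    unit_type[noise_mask] = NOISE
    return unit_type, LABELS[unit_type]


metrics = {"num_positive_peaks": [1, 3, 1], "snr": [10.0, 10.0, 2.0]}
unit_type, unit_type_string = classify_sketch(
    metrics,
    noise_thresholds={"num_positive_peaks": {"min": None, "max": 2}},
    mua_thresholds={"snr": {"min": 5, "max": None}},  # hypothetical threshold
)
```

Applying the NOISE mask last is what makes the order of the checks matter: a unit that fails both a waveform check and a MUA check ends up as NOISE.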

Plots
  • Added plots for classification summaries, metric histograms with threshold lines, waveform overlays by category, and UpSet plots.
from spikeinterface.widgets import (
    plot_unit_classification,
    plot_classification_histograms,
    plot_waveform_overlay,
    plot_upset,
)

plot_unit_classification(analyzer, unit_type, unit_type_string)
plot_classification_histograms(quality_metrics, thresholds=thresholds)
plot_waveform_overlay(analyzer, unit_type, unit_type_string)
plot_upset(quality_metrics, unit_type, unit_type_string)

Or use a wrapper for all plots:

plots = plot_unit_classification_all(
    sorting_analyzer,
    unit_type,
    unit_type_string,
    quality_metrics=quality_metrics,  # optional, will try to get from analyzer
    thresholds=thresholds,            # optional, uses defaults
    split_non_somatic=False,
    include_upset=True,
)

Julie-Fabre and others added 20 commits January 7, 2026 01:15
…uration, add amplitude_median, bombcell_snr and fix non-somatic classification rules
@alejoe91 added the curation label (Related to curation module) on Jan 8, 2026
@samuelgarcia

Hi Julie,
I read this very quickly. What you did during the hackathon is really impressive!
I was not aware that you also did the widgets. Wow.

I will come back after a more careful read.

But a few main points:

  • We avoid pushing .ipynb files to the repo because they bloat the history, so we use jupytext instead and push only the generated rst. If the notebook is fast to generate (with simulated data), there is also the tutorial route: a .py file that is run and rendered during the documentation build.
  • I would prefer not to have JSON directly in the code to handle parameters. I think a simple Python file with the same contents would do. Let's discuss more.
  • I would be curious to see the correlation between the basic SNR and the median-based one you added. I will try to make some plots of this.

import numpy as np
import warnings
from copy import deepcopy
from scipy.signal import find_peaks

Can you move this to the function?

The core module has minimal dependencies, and all additional imports should be local :)
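
For illustration, the pattern looks roughly like this (count_peaks is a hypothetical function, not one from this PR): the scipy import moves inside the function body, so importing the module itself only needs numpy.

```python
import numpy as np


def count_peaks(waveform, prominence=0.1):
    # Local import: scipy is only required when this function is actually called
    from scipy.signal import find_peaks

    peaks, _ = find_peaks(np.asarray(waveform, dtype=float), prominence=prominence)
    return len(peaks)


# Two full sine periods contain two maxima
n_peaks = count_peaks(np.sin(np.linspace(0, 4 * np.pi, 200)))
```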

@@ -0,0 +1,430 @@
"""
Unit labelling based on quality metrics (Bombcell).

Suggested change
Unit labelling based on quality metrics (Bombcell).
Unit labeling based on quality metrics (Bombcell).

In general, we adopted American English (@chrishalcrow is not happy about it!).

Could you rename this and the files to labeling?


The file could be called bombcell_curation (similar to model_based_curation)

@@ -0,0 +1,74 @@
{

We can remove this file

@alejoe91 left a comment


@Julie-Fabre massive effort! Thanks!

I did a first round of reviewing and I'm happy to discuss some details and also work on it :)

from typing import Optional


WAVEFORM_METRICS = [

Suggested change
WAVEFORM_METRICS = [
NOISE_METRICS = [

?

# bombcell
return {
# Waveform quality (failures -> NOISE)
"num_positive_peaks": {"min": np.nan, "max": 2},

Suggested change
"num_positive_peaks": {"min": np.nan, "max": 2},
"num_positive_peaks": {"min": None, "max": 2},

I would just keep None instead of NaN and handle it in the function, so you can save/load to JSON without any custom fields.
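
A quick check with the standard json module shows the difference. The threshold dict mirrors the line above; nothing here is specific to SpikeInterface.

```python
import json

thr_none = {"num_positive_peaks": {"min": None, "max": 2}}
thr_nan = {"num_positive_peaks": {"min": float("nan"), "max": 2}}

# None maps to JSON null and survives a round trip unchanged
round_tripped = json.loads(json.dumps(thr_none))

# NaN is written as the bare token NaN, which is not valid standard JSON
nan_text = json.dumps(thr_nan)

# With strict output enabled, NaN cannot be serialized at all
try:
    json.dumps(thr_nan, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
```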

Comment on lines +91 to +92
quality_metrics=None,
template_metrics=None,

Use sorting_analyzer instead.

unit_type_string : np.ndarray
String labels.
"""
combined_metrics = _combine_metrics(quality_metrics, template_metrics)

Suggested change
combined_metrics = _combine_metrics(quality_metrics, template_metrics)
combined_metrics = sorting_analyzer.get_metrics_extension_data()

;)

values = np.abs(values)
thresh = thresholds[metric_name]
noise_mask |= np.isnan(values)
if not np.isnan(thresh["min"]):

Suggested change
if not np.isnan(thresh["min"]):
if thresh["min"] is not None:

and so on

class PeakToValley(BaseMetric):
metric_name = "peak_to_valley"
class PeakToTroughDuration(BaseMetric):
metric_name = "peak_to_trough_duration"

I'm thinking it could be useful to add a deprecated_column_names, so we could automate backward compatibility :)
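
Something along these lines, as a rough sketch only: the classes below are stand-ins rather than the real BaseMetric, deprecated_column_names does not exist yet, and resolve_column_name is a hypothetical helper.

```python
import warnings


class BaseMetricSketch:
    """Stand-in for BaseMetric, for illustration only."""
    metric_name = None
    # Proposed attribute: maps old column names to their current names
    deprecated_column_names = {}


class PeakToTroughDurationSketch(BaseMetricSketch):
    metric_name = "peak_to_trough_duration"
    deprecated_column_names = {"peak_to_valley": "peak_to_trough_duration"}


def resolve_column_name(name, metric_classes):
    """Return the current name for a possibly deprecated column name."""
    for metric_class in metric_classes:
        new_name = metric_class.deprecated_column_names.get(name)
        if new_name is not None:
            warnings.warn(
                f"'{name}' is deprecated, use '{new_name}' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return new_name
    return name


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved_old = resolve_column_name("peak_to_valley", [PeakToTroughDurationSketch])
    resolved_new = resolve_column_name("trough_width", [PeakToTroughDurationSketch])
n_warnings = len(caught)
```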

num_positive_peaks_dict = {}
num_negative_peaks_dict = {}
sampling_frequency = sorting_analyzer.sampling_frequency
sampling_frequency = tmp_data["sampling_frequency"]

goooooood catch @Julie-Fabre !!!!!!

Comment on lines +1225 to +1226
class WaveformDuration(BaseMetric):
metric_name = "waveform_duration"

I think that the name doesn't convey the actual computation

Suggested change
class WaveformDuration(BaseMetric):
metric_name = "waveform_duration"
class MainToNextPeakDuration(BaseMetric):
metric_name = "main_to_next_peak_duration"

?

Comment on lines +1334 to +1336
"trough_width": "Width of the main trough in microseconds",
"peak_before_width": "Width of the main peak before trough in microseconds",
"peak_after_width": "Width of the main peak after trough in microseconds",

I would be consistent and output everything in the same unit. So far we have used seconds for durations. The bombcell curation could still accept thresholds in microseconds and do the conversion on the fly.

Alternatively, we could add a unit field to the BaseMetric, to specify units for each column. I think I would go with this, but it requires an additional refactoring. @chrishalcrow what do you think?
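
For the first option, the on-the-fly conversion could be as small as the sketch below. thresholds_us_to_s is a hypothetical helper and the threshold values are made up for the example.

```python
import math

US_PER_S = 1e6


def thresholds_us_to_s(thresh_us):
    """Convert a {'min': ..., 'max': ...} threshold from microseconds to seconds."""
    # None (no bound) is passed through untouched
    return {
        bound: None if value is None else value / US_PER_S
        for bound, value in thresh_us.items()
    }


# Hypothetical user-facing threshold in microseconds
trough_width_us = {"min": 100, "max": None}
trough_width_s = thresholds_us_to_s(trough_width_us)
```

The metric columns would then stay in seconds everywhere, and only the threshold dict is converted before the comparison.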

Comment on lines +27 to +28
quality_metrics=None,
template_metrics=None,

Suggested change
quality_metrics=None,
template_metrics=None,
sorting_analyzer

Same reasons as in the curation module.
