Serious bug: Problems in metadata extraction functions result in severe wrong behaviour when constructing likelihoods from sacc and factories! #547

@arthurmloureiro

Description

Describe the bug
Following the conversation on the Firecrown Slack channel: a tracer in the SACC is linked to a factory in the YAML (the one set to type_source: default) through the type_source that Firecrown reads from the Nz metadata. However, nothing in the SACC tells extract_all_harmonic_data or similar functions to fill that metadata field with anything other than default; see here:

InferredGalaxyZDist(
and here
type_source: TypeSource = TypeSource.DEFAULT
so even though the SACC has different tracers, Firecrown's metadata reader will always treat them all as the same.

In other words, because none of the functions that extract metadata from the SACC set a different type_source when constructing the Nz part of the metadata, all tracers of the same type are considered the same.

This is one problem.

The more serious problem is that when providing the likelihood builder from yaml factories and sacc metadata, NOTHING breaks to tell the user that there are more type_sources in the yaml factories than type_sources in the sacc metadata.

To Reproduce
When reading a sacc that contains more than one tracer of the same type:

for n, t in sacc_data.tracers.items():
    print(t.name, t.quantity, type(t))

resulting in:

gs0 generic <class 'sacc.tracers.NZTracer'>
gs1 generic <class 'sacc.tracers.NZTracer'>
gs2 generic <class 'sacc.tracers.NZTracer'>
gs3 generic <class 'sacc.tracers.NZTracer'>
gs4 generic <class 'sacc.tracers.NZTracer'>
gs5 generic <class 'sacc.tracers.NZTracer'>
gs6 generic <class 'sacc.tracers.NZTracer'>
gs7 generic <class 'sacc.tracers.NZTracer'>
gs8 generic <class 'sacc.tracers.NZTracer'>
gs9 generic <class 'sacc.tracers.NZTracer'>
gs10 generic <class 'sacc.tracers.NZTracer'>
g0 generic <class 'sacc.tracers.NZTracer'>
s0 generic <class 'sacc.tracers.NZTracer'>
g1 generic <class 'sacc.tracers.NZTracer'>
s1 generic <class 'sacc.tracers.NZTracer'>
g2 generic <class 'sacc.tracers.NZTracer'>
s2 generic <class 'sacc.tracers.NZTracer'>
g3 generic <class 'sacc.tracers.NZTracer'>
s3 generic <class 'sacc.tracers.NZTracer'>
g4 generic <class 'sacc.tracers.NZTracer'>
s4 generic <class 'sacc.tracers.NZTracer'>
g5 generic <class 'sacc.tracers.NZTracer'>
s5 generic <class 'sacc.tracers.NZTracer'>

Following the tutorial to extract the metadata from the SACC in order to construct a Firecrown likelihood using the factories, we can use this:

two_point_reals = extract_all_harmonic_data(sacc_data)

The function extract_all_harmonic_data makes no distinction in the Nz metadata between g tracers and gs tracers: both end up with measurements={<Galaxies.COUNTS: '1'>}, type_source='default')

Now if I use the following yaml of factories:

correlation_space: harmonic
weak_lensing_factories:
  - type_source: default
    per_bin_systematics:
    - type: MultiplicativeShearBiasFactory
    - type: PhotoZShiftFactory
    global_systematics:
    - type: LinearAlignmentSystematicFactory
      alphag: 1.0
number_counts_factories:
  - type_source: default
    per_bin_systematics:
    - type: PhotoZShiftFactory
    global_systematics: []
  - type_source: default-spec
    per_bin_systematics:
    - type: PhotoZShiftandStretchFactory
    - type: MagnificationBiasSystematicFactory
    global_systematics: []

I can still construct the likelihood using:

tp_factory = base_model_from_yaml(TwoPointFactory, two_point_yaml)
two_points_ready = TwoPoint.from_measurement(two_point_reals, tp_factory)
likelihood_ready = ConstGaussian.create_ready(
    two_points_ready, sacc_data.covariance.dense
)

Nothing breaks from the fact that there are more type_sources in the factory yaml than in the metadata.
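As a rough illustration of the kind of check that is missing, here is a minimal sketch in plain Python. It does not use any Firecrown API: check_type_sources is a hypothetical helper, and the two collections are written out by hand from the YAML and the metadata shown above.

```python
def check_type_sources(requested, available):
    """Raise if the factories name type_sources that the metadata lacks.

    Hypothetical helper for illustration only; not part of Firecrown.
    """
    unused = sorted(set(requested) - set(available))
    if unused:
        raise ValueError(
            f"type_source(s) {unused} are configured in the factories "
            "but do not appear in the SACC metadata"
        )


# type_source values listed under number_counts_factories in the YAML above.
requested = ["default", "default-spec"]
# Every tracer extracted from the SACC is labelled 'default'.
available = {"default"}

try:
    check_type_sources(requested, available)
except ValueError as err:
    message = str(err)
    print(message)
```

With the inputs above the check fails on default-spec, which is the point where the user should be told that the second number-counts factory will never be used.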

Expected behavior

  1. A way to tell the extract_metadata functions about different tracers of the same type of measurement:
    measurements={<Galaxies.COUNTS: '1'>}, type_source='red') and measurements={<Galaxies.COUNTS: '1'>}, type_source='blue') for example
  2. The TwoPoint.from_measurement(two_point_reals, tp_factory) should break if more type_sources are requested than present in the metadata.
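One possible shape for item 1, sketched in plain Python: the user supplies a mapping from tracer-name prefix to type_source, and the extraction step looks each tracer up in it. The mapping, the function name type_source_for, and the choice of keying on the name prefix are all assumptions made for this example; none of them exist in Firecrown.

```python
import re

# Hypothetical user-supplied mapping from tracer-name prefix to type_source.
# The labels follow the YAML above; 'gs' tracers get the second factory.
PREFIX_TO_TYPE_SOURCE = {
    "gs": "default-spec",
    "g": "default",
    "s": "default",
}


def type_source_for(tracer_name):
    """Return the type_source for a tracer such as 'gs10', 'g3' or 's0'."""
    match = re.fullmatch(r"([a-z]+)(\d+)", tracer_name)
    if match is None or match.group(1) not in PREFIX_TO_TYPE_SOURCE:
        raise ValueError(f"no type_source configured for tracer {tracer_name!r}")
    return PREFIX_TO_TYPE_SOURCE[match.group(1)]


for name in ["gs10", "g3", "s0"]:
    print(name, type_source_for(name))
```

Keying on the name is only one option; the same information could equally come from metadata stored in the SACC file itself.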

Observed behavior
Described above

Output
Here's the output from printing the default values of systematic parameters from the example above:

{'alphaz': 0.0,
 's2_delta_z': 0.0,
 'g2_bias': 1.5,
 's2_mult_bias': 1.0,
 'g1_bias': 1.5,
 'gs2_delta_z': 0.0,
 's4_mult_bias': 1.0,
 'gs5_delta_z': 0.0,
 'gs2_bias': 1.5,
 'gs9_bias': 1.5,
 'g0_bias': 1.5,
 'gs5_bias': 1.5,
 's1_mult_bias': 1.0,
 'gs8_delta_z': 0.0,
 'gs7_delta_z': 0.0,
 'ia_bias': 0.5,
 's4_delta_z': 0.0,
 'gs6_bias': 1.5,
 's1_delta_z': 0.0,
 'g3_delta_z': 0.0,
 'gs0_bias': 1.5,
 'gs8_bias': 1.5,
 'z_piv': 0.5,
 'gs6_delta_z': 0.0,
 'gs0_delta_z': 0.0,
 's3_mult_bias': 1.0,
 'g4_bias': 1.5,
 'gs7_bias': 1.5,
 's3_delta_z': 0.0,
 'gs4_bias': 1.5,
 'g0_delta_z': 0.0,
 's5_delta_z': 0.0,
 'gs1_delta_z': 0.0,
 'gs4_delta_z': 0.0,
 'g5_delta_z': 0.0,
 's5_mult_bias': 1.0,
 'g5_bias': 1.5,
 's0_mult_bias': 1.0,
 'gs3_delta_z': 0.0,
 'gs10_delta_z': 0.0,
 'gs9_delta_z': 0.0,
 'gs1_bias': 1.5,
 'gs10_bias': 1.5,
 'g4_delta_z': 0.0,
 'g3_bias': 1.5,
 'g1_delta_z': 0.0,
 's0_delta_z': 0.0,
 'g2_delta_z': 0.0,
 'gs3_bias': 1.5}

Note that no parameter is related to the MagnificationBiasSystematicFactory or the PhotoZShiftandStretchFactory, even though both were present in the YAML factories, and no errors were raised.
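A quick way to see this from the output is to group the parameter names by tracer family and compare the suffixes. The sketch below uses a hand-copied subset of the keys printed above; the helper name suffixes_by_family is made up for this example.

```python
import re
from collections import defaultdict

# A subset of the parameter names from the output above.
param_names = [
    "gs2_delta_z", "gs2_bias",
    "g2_delta_z", "g2_bias",
    "s2_delta_z", "s2_mult_bias",
    "alphaz", "ia_bias", "z_piv",
]


def suffixes_by_family(names):
    """Map each tracer family ('gs', 'g', 's') to its set of parameter suffixes."""
    families = defaultdict(set)
    for name in names:
        # Names like 'alphaz' or 'ia_bias' are not per-tracer and are skipped.
        match = re.fullmatch(r"(gs|g|s)(\d+)_(\w+)", name)
        if match:
            families[match.group(1)].add(match.group(3))
    return dict(families)


suffixes = suffixes_by_family(param_names)
for family, found in sorted(suffixes.items()):
    print(family, sorted(found))
```

The gs and g families end up with the same suffixes (bias and delta_z), which is consistent with both being built by the single default number-counts factory.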

Configuration:
Please include all relevant information about your configuration:

  • Firecrown version: '1.12.0a0'
