Skip to content

Conversation

@fitzgen
Copy link
Member

@fitzgen fitzgen commented Dec 29, 2025

Fixes #12156

@cfallin
Copy link
Member

cfallin commented Dec 30, 2025

Thanks for the PR! (Nick knows but for anyone else watching) I'm out on PTO then at a conference, so I'll be able to review this Mon, Jan 12 or later.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got time today to review this actually so: LGTM and I'm really excited about this!

If you don't mind, could we post Sightglass results here before merging? Specifically I'd like to see cycles measured for compilation and execution time; hopefully we see no or few regressions (as I know you've already seen during some experimentation).

@fitzgen
Copy link
Member Author

fitzgen commented Jan 5, 2026

Hrm, yeah cycles is looking very different from what I previously saw with insts-retired, unfortunately. Here is the default and spidermonkey suites' results, with statistically insignificant entries removed:

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 38816006.88 ± 4557789.09 (confidence = 95%)

  main.so is 1.22x to 1.28x faster than inst-set.so!

  [173565996 191159750.88 226710599] inst-set.so
  [134897941 152343744.00 181128243] main.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 30708277.27 ± 5797934.73 (confidence = 95%)

  main.so is 1.11x to 1.16x faster than inst-set.so!

  [227209069 254002445.00 308181456] inst-set.so
  [194211038 223294167.73 282640832] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-regex.wasm

  Δ = 632063507.40 ± 84792443.24 (confidence = 95%)

  main.so is 1.09x to 1.11x faster than inst-set.so!

  [6551514230 7032180956.90 7890587513] inst-set.so
  [6198102237 6400117449.50 7008087519] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-markdown.wasm

  Δ = 628333545.93 ± 98363544.32 (confidence = 95%)

  main.so is 1.08x to 1.11x faster than inst-set.so!

  [6494087498 7067526581.28 7805742475] inst-set.so
  [6158498649 6439193035.35 7315715259] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 562993166.38 ± 115506969.41 (confidence = 95%)

  main.so is 1.07x to 1.10x faster than inst-set.so!

  [6486850237 7033140774.35 8340342367] inst-set.so
  [6044689306 6470147607.97 7743603904] main.so

execution :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 6534858.80 ± 5891940.54 (confidence = 95%)

  main.so is 1.00x to 1.02x faster than inst-set.so!

  [504126227 528608965.58 576654072] inst-set.so
  [496952074 522074106.78 604679005] main.so

I guess instructions retired isn't a good proxy for wall time in this case because we are switching from pretty much an array-of-scalars representation of cost to an array-of-trees representation, which is going to involve more pointer chasing and memory latency that isn't reflected in instructions retired.

Pretty clear to me that this PR cannot land as-is. Might be worth some profiling to see if there is any low-hanging fruit, but if we can't just engineer our way out of this would-be regression, then it is probably best to stop pursuing the instruction-set cost function any further :-/

@fitzgen fitzgen force-pushed the inst-set-cost-function branch from 7c125c7 to 711cc55 Compare January 5, 2026 20:06
@cfallin
Copy link
Member

cfallin commented Jan 6, 2026

Ah, that's too bad! Very worthwhile exploration regardless...

@fitzgen
Copy link
Member Author

fitzgen commented Jan 6, 2026

Idea: We could potentially try to do the current scalar cost first, but bail out if we ever saturate to infinity, and only then use the instruction-set cost function. Kind of heavy-handed, but might avoid compile-time regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cranelift: improve egraph cost function / extraction to "understand" sharing / reusing sub-expressions

2 participants