@basetunnel
Collaborator

This allows CSE to consider all subexpressions. Currently it excludes work-free subexpressions, such as partially applied built-ins (https://github.com/IntersectMBO/plutus-private/issues/1761).

  • factored out isWorkFree' from the cse logic, which makes it easier to understand

  • instead of Bool, I've added a CseWhichSubterms type to avoid boolean blindness. I found that it makes the code more understandable. If Bool is preferred for consistency with other simplifier flags, I'm happy to change it.
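
To illustrate the boolean-blindness point, here is a minimal sketch of what such a flag type could look like. The type name comes from this PR; the constructor names are inferred from the test group names, and isCseCandidate and its Bool argument are hypothetical stand-ins, not the actual implementation.

```haskell
-- Hypothetical sketch: only the type name is taken from the PR description.
-- Constructor names are inferred from the test group names.
data CseWhichSubterms
  = AllSubterms      -- consider every subexpression as a CSE candidate
  | ExcludeWorkFree  -- skip work-free subexpressions (the previous behaviour)
  deriving (Eq, Show)

-- | Decide whether a subterm is a CSE candidate.
-- The Bool argument stands in for the result of a work-freeness check
-- (such as isWorkFree'); the real check operates on terms, not on a Bool.
isCseCandidate :: CseWhichSubterms -> Bool -> Bool
isCseCandidate AllSubterms     _        = True
isCseCandidate ExcludeWorkFree workFree = not workFree
```

A call site then reads `isCseCandidate ExcludeWorkFree wf` rather than `isCseCandidate False wf`, so the intent is visible without looking up what the flag means.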

@basetunnel basetunnel requested a review from zliu41 January 7, 2026 13:48
@zliu41 zliu41 requested a review from a team January 7, 2026 22:14
Member

@zliu41 zliu41 left a comment

Looks good, but we should look at how turning it on affects the costs and sizes in the existing tests. That will help decide whether or not it should be on by default.

@basetunnel
Collaborator Author

basetunnel commented Jan 8, 2026

I've taken a quick look at the six test cases that are there for CSE.

The cseExpensive test becomes 10x slower using AllSubterms. The other tests are too small to observe a difference:

  simplify
    cse
      AllSubterms
        cse1_AllSubterms:                    OK (0.00s)
        cse2_AllSubterms:                    OK (0.00s)
        cse3_AllSubterms:                    OK (0.00s)
        cseExpensive_AllSubterms:            OK (27.91s)
        csePlusTree_AllSubterms:             OK (0.00s)
        cseRepeatPlus_AllSubterms:           OK (0.00s)
      ExcludeWorkFree
        cse1_ExcludeWorkFree:                OK (0.00s)
        cse2_ExcludeWorkFree:                OK (0.00s)
        cse3_ExcludeWorkFree:                OK (0.00s)
        cseExpensive_ExcludeWorkFree:        OK (2.85s)
        csePlusTree_ExcludeWorkFree:         OK (0.01s)
        cseRepeatPlus_ExcludeWorkFree:       OK (0.00s)

Output size on disk (in bytes) is slightly better in some cases, but equal in others:

259 cse1_AllSubterms.golden.uplc
278 cse1_ExcludeWorkFree.golden.uplc
160 cse2_AllSubterms.golden.uplc
160 cse2_ExcludeWorkFree.golden.uplc
119 cse3_AllSubterms.golden.uplc
119 cse3_ExcludeWorkFree.golden.uplc
617455 cseExpensive_AllSubterms.golden.uplc
617455 cseExpensive_ExcludeWorkFree.golden.uplc
162 csePlusTree_AllSubterms.golden.uplc
162 csePlusTree_ExcludeWorkFree.golden.uplc
54 cseRepeatPlus_AllSubterms.golden.uplc
74 cseRepeatPlus_ExcludeWorkFree.golden.uplc

Size on disk is probably not a great measure, but I don't know of a simple way to measure AST size of the outputs.
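
As a rough idea of what an AST-size measure could look like, here is a sketch that counts nodes on a toy term type. The Term type below is a simplified stand-in written for this example, not the real UPLC term type, and astSize is a hypothetical helper.

```haskell
-- Toy untyped term type for illustration only; NOT the real UPLC Term.
data Term
  = Var String
  | LamAbs String Term
  | Apply Term Term
  | Builtin String
  | Constant Integer
  | Delay Term
  | Force Term
  deriving (Eq, Show)

-- | Number of AST nodes in a term (every constructor counts as one node).
astSize :: Term -> Int
astSize term = case term of
  Var _      -> 1
  LamAbs _ b -> 1 + astSize b
  Apply f a  -> 1 + astSize f + astSize a
  Builtin _  -> 1
  Constant _ -> 1
  Delay t    -> 1 + astSize t
  Force t    -> 1 + astSize t

-- A partially applied built-in, i.e. a work-free subterm.
partialAdd :: Term
partialAdd = Apply (Builtin "addInteger") (Constant 1)

-- The subterm occurs twice.
before :: Term
before = Apply (Apply (Var "f") partialAdd) partialAdd

-- The same program with the repeated subterm bound once to x.
after :: Term
after =
  Apply
    (LamAbs "x" (Apply (Apply (Var "f") (Var "x")) (Var "x")))
    partialAdd
```

In this toy example `astSize before` is 9 and `astSize after` is 10: sharing a three-node subterm that occurs only twice costs one extra node, because the binding itself adds a lambda and an application.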

I'll check why the expensive test gets so much slower, perhaps it's just considering way more sub-expressions.

@basetunnel
Collaborator Author

The tests from plutus-tx-plugin:plutus-tx-plugin-tests, plutus-benchmark and cardano-constitution resulted in 54 changed golden.eval files. On average, there is a slight improvement in CPU and Mem, although size goes up, somewhat surprisingly.

--- CPU ---
Mean pct:        -0.11%
Std deviation:   3.9%
Min / Max pct:   -11.11% / +22.50%
Improvements:    32
Regressions:     18
Unchanged:       4

--- MEM ---
Mean pct:        -0.22%
Std deviation:   3.8%
Min / Max pct:   -10.71% / +18.36%
Improvements:    32
Regressions:     18
Unchanged:       4

--- AST ---
Mean pct:        +0.18%
Std deviation:   5.0%
Min / Max pct:   -5.59% / +26.88%
Improvements:    34
Regressions:     13
Unchanged:       7

--- FLAT ---
Mean pct:        +0.75%
Std deviation:   4.1%
Min / Max pct:   -4.85% / +26.15%
Improvements:    11
Regressions:     36
Unchanged:       7
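
For reference, summary figures of this shape can be computed from per-test percentage changes roughly as follows. This is a sketch under assumptions: the names are hypothetical, a negative change counts as an improvement, the standard deviation is the population one, and the script actually used for the numbers above may differ on any of these points.

```haskell
-- Hypothetical summary record mirroring the fields reported above.
data Summary = Summary
  { meanPct      :: Double
  , stdDevPct    :: Double
  , minPct       :: Double
  , maxPct       :: Double
  , improvements :: Int  -- tests whose metric went down
  , regressions  :: Int  -- tests whose metric went up
  , unchanged    :: Int
  } deriving (Show)

-- | Percentage change from an old to a new measurement.
-- Assumes the old value is non-zero.
pctChange :: Double -> Double -> Double
pctChange old new = (new - old) / old * 100

-- | Summarise a list of percentage changes; Nothing for an empty list.
summarise :: [Double] -> Maybe Summary
summarise []   = Nothing
summarise pcts = Just Summary
  { meanPct      = mean
  , stdDevPct    = sqrt variance
  , minPct       = minimum pcts
  , maxPct       = maximum pcts
  , improvements = count (< 0)
  , regressions  = count (> 0)
  , unchanged    = count (== 0)
  }
  where
    n        = fromIntegral (length pcts)
    mean     = sum pcts / n
    -- Population variance (divide by n); an assumption of this sketch.
    variance = sum [(p - mean) * (p - mean) | p <- pcts] / n
    count f  = length (filter f pcts)
```

Because the mean is taken over percentages, a few large outliers can push it in the opposite direction from the improvement and regression counts.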

@basetunnel
Collaborator Author

[Charts: CPU changes, Mem changes, AST changes, FLAT changes]

@basetunnel
Collaborator Author

Looking at the UPLC diffs for 8 queens and Ed25519, I find it hard to see what happened and why. Perhaps a bad interaction between inlining and CSE?

@zliu41
Member

zliu41 commented Jan 17, 2026

Thanks for the data. I'm surprised that there are CPU improvements when allowing work-free terms. As I noted before,

      -- We don't consider work-free terms for CSE, because doing so may or may not
      -- have a size benefit, but certainly doesn't have any cost benefit (the cost
      -- will in fact be slightly higher due to the additional application).

Do you know why we are seeing CPU improvements? If not, it's worth looking into the or test case, which is fairly small.

We should also update Note [CSE]. There are a few mentions of "workfree" there.

@zliu41
Member

zliu41 commented Jan 17, 2026

> If not, it's worth looking into the or test case, which is fairly small.

Also, does the or test case really happen to have the same AST sizes before and after? It's hard to count by eye but that would be some coincidence.
