@recmo (Contributor) commented Sep 5, 2025

This PR adds two new structs:

  • DagExpr. This generalizes RecExpr by allowing multiple roots. It also maintains strong invariants on minimality and canonical order, so that it is essentially a unique representation of the DAG for a given set of roots.
  • BeamExtract. A DAG extractor based on DagExpr that uses beam search to approximate the optimal extraction.

One cool thing I realized while building this is that if we add a minor restriction on Language (the ordering of nodes is preserved under a monotonically increasing map of node Ids), then merging two DagExprs can be done using a linear-time merge sort variant.
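To make the idea concrete, here is a minimal sketch of such a merge. It is not the code in this PR: the `Node` type, the `height` field, and the choice of canonical order (sort by height, then operator, then child ids) are all assumptions made for illustration, and the PR's actual invariants may differ. Both inputs are assumed to be sorted and duplicate-free, with children always referring to earlier indices.

```rust
use std::cmp::Ordering;

// Hypothetical node type, not egg's. The derived `Ord` compares fields in
// declaration order: height, then op, then children (lexicographically).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Node {
    height: u32,          // assumed: 0 for leaves, 1 + max child height otherwise
    op: &'static str,
    children: Vec<usize>, // indices of earlier nodes in the same list
}

// Rewrite a node's children from input ids to output ids.
fn remap(node: &Node, map: &[usize]) -> Node {
    Node {
        height: node.height,
        op: node.op,
        children: node.children.iter().map(|&c| map[c]).collect(),
    }
}

// Two-pointer merge of two sorted, duplicate-free node lists.
// Returns the merged list plus the id maps from `a` and `b` into it.
// Output ids are handed out in increasing order, so both maps are strictly
// increasing; because `Ord` is preserved under such maps, the remapped
// inputs stay sorted and a single pass is enough.
fn merge(a: &[Node], b: &[Node]) -> (Vec<Node>, Vec<usize>, Vec<usize>) {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let mut map_a = Vec::with_capacity(a.len());
    let mut map_b = Vec::with_capacity(b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() || j < b.len() {
        // Children always point at earlier nodes, so their maps are known.
        let na = a.get(i).map(|n| remap(n, &map_a));
        let nb = b.get(j).map(|n| remap(n, &map_b));
        let ord = match (&na, &nb) {
            (Some(x), Some(y)) => x.cmp(y),
            (Some(_), None) => Ordering::Less,
            (None, _) => Ordering::Greater,
        };
        let id = out.len();
        match ord {
            Ordering::Less => {
                out.push(na.unwrap());
                map_a.push(id);
                i += 1;
            }
            Ordering::Greater => {
                out.push(nb.unwrap());
                map_b.push(id);
                j += 1;
            }
            Ordering::Equal => {
                // Same node in both inputs: emit once, share the id.
                out.push(na.unwrap());
                map_a.push(id);
                map_b.push(id);
                i += 1;
                j += 1;
            }
        }
    }
    (out, map_a, map_b)
}
```

For example, merging the lists for `x, y, (+ x y)` and `y, z, (* y z)` under this order gives five nodes, with `y` stored once and both id maps strictly increasing. Roots are left out of the sketch; they would simply be translated through the returned maps.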

I'm wondering whether this minor restriction should be added to the docs of Language as a requirement, or expressed as an additional marker trait that languages need to implement.
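For the marker-trait option, a rough sketch of what it might look like follows. Everything here is hypothetical: `ToyLanguage` is a small stand-in and not egg's `Language` trait, and `MonotoneOrd`, `Arith`, and `order_preserved` are names invented for this example. The sketch also assumes the map is strictly increasing, since a map that sends two ids to the same value cannot preserve a strict order.

```rust
// Stand-in for a language trait (NOT egg's `Language`): nodes are ordered
// and can have their child ids rewritten.
trait ToyLanguage: Ord + Clone {
    fn map_children(&self, f: &dyn Fn(usize) -> usize) -> Self;
}

// Hypothetical marker trait. Implementors promise that for every strictly
// increasing `f`, `a.cmp(b)` equals the comparison of the mapped nodes.
trait MonotoneOrd: ToyLanguage {}

// The derived `Ord` compares the variant first, then the child ids
// lexicographically, which satisfies the promise above.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Arith {
    Num(i64),
    Add([usize; 2]),
    Mul([usize; 2]),
}

impl ToyLanguage for Arith {
    fn map_children(&self, f: &dyn Fn(usize) -> usize) -> Self {
        match self {
            Arith::Num(n) => Arith::Num(*n),
            Arith::Add([a, b]) => Arith::Add([f(*a), f(*b)]),
            Arith::Mul([a, b]) => Arith::Mul([f(*a), f(*b)]),
        }
    }
}

impl MonotoneOrd for Arith {}

// Helper for tests: checks the promise for one pair of nodes and one map.
fn order_preserved<L: MonotoneOrd>(a: &L, b: &L, f: &dyn Fn(usize) -> usize) -> bool {
    a.cmp(b) == a.map_children(f).cmp(&b.map_children(f))
}
```

A marker trait keeps the requirement visible in type signatures, at the cost of one extra `impl` line per language. A helper like `order_preserved` could back a property test for implementors, because the compiler cannot check the promise itself.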

It may make sense to use DagExpr for the LpExtract result as well, but that would break the public API.

@oflatt (Member) commented Sep 6, 2025

Hi, thanks for your PR!
The egg team has switched focus to egglog and can't keep expanding egg.

Could this PR be a separate crate?

This algorithm could be a great contribution to the extraction gym! It works over a serialized egraph format that both egg and egglog can export to.

https://github.com/egraphs-good/extraction-gym
