-
Notifications
You must be signed in to change notification settings - Fork 45
Redefine a minimal bundle to have only one LiveRange. #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redefine a minimal bundle to have only one LiveRange. #214
Conversation
When exerting additional pressure on regalloc with bytecodealliance/wasmtime#10502, which can lead to call instructions that have significantly more (`any`-constrained) defs, we hit panics in RA2 where (i) a bundle merged several LiveRanges, (ii) one of these LiveRanges had a fixed-reg constraint on an early use, (iii) this fixed-reg constraint conflicted with a clobber (which always happens at the late-point), (iv) the bundle merged in another LiveRange of some arbitrary def at the late point. This would make a bundle (which is the atomic unit of allocation) that covers the whole inst, including the late point; and is required to be in the fixed reg; this is unallocatable because the clobber is also at the late point in that reg. Our allocate-or-split-and-retry logic does not split if a bundle is "minimal". This is meant to give a base case to the retries: when bundles break down into their minimal pieces, any solvable set of constraints should result in allocations. Unfortunately the "is minimal" predicate definition did not account for multiple LiveRanges, but rather only tested whether the total program-point range of the bundle was over one instruction. If there are multiple LiveRanges, we can still split them off, and the resulting split bundles may cover only half the instruction, avoiding the clobbers.
|
This fixes one of the fuzzbugs that came in (https://issues.oss-fuzz.com/issues/408523840) -- likely the others too, but I haven't checked yet. I'm fuzzing locally with |
|
I fuzzed this (with the |
|
Would it make sense to beef up the fuzzer in this repo to catch this as well? (assuming it isn't possible for the preexisting fuzzer to eventually catch this) |
|
Yes definitely, I need to look into that -- it won't be super-simple though, as there isn't really a separate oracle for "is this set of constraints allocatable", so I need to do some thinking about how to generate more interesting sets of constraints by construction that work. In other words I don't want to block merging this fix on that as it will be a bit of a deeper task. |
fitzgen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on trying to cover these cases in the fuzz target when possible
This pulls in a fix for a fuzzbug found after bytecodealliance#10502 started generating more challenging constraints for regalloc. The fix in bytecodealliance/regalloc2#214 updates bundle-splitting logic to properly handle bundles with multiple live-ranges all covering one instruction.
* Cranelift: update to regalloc2 0.11.3. This pulls in a fix for a fuzzbug found after #10502 started generating more challenging constraints for regalloc. The fix in bytecodealliance/regalloc2#214 updates bundle-splitting logic to properly handle bundles with multiple live-ranges all covering one instruction. * Update test expectations after regalloc perturbation.
This is a followup from bytecodealliance#214. We have use-cases in Cranelift now where we generate constraints on call instructions that have: - A number of fixed-reg uses at "early"; - A number of fixed-reg defs at "late"; - A number of register clobbers, potentially overlapping with uses; - A number of "any"-constrained defs at "late", to add register pressure. The existing fuzzing function generator would create at most 3 operands, and would add *either* fixed-reg constraints *or* clobbers, not both. This PR adds a "callsite-ish constraints" branch as a subcase of the fixed-reg choice that also adds the "any" defs to add register pressure, and adds new clobbers that don't interfere with existing constraints. It also loosens the logic around constructing fixed-reg constraints: if a conflict is found with some arbitrarily-chosen constraint, it keeps going to the next operand and tries to make it fixed, rather than giving up on further operands.
This is a followup from bytecodealliance#214. We have use-cases in Cranelift now where we generate constraints on call instructions that have: - A number of fixed-reg uses at "early"; - A number of fixed-reg defs at "late"; - A number of register clobbers, potentially overlapping with uses; - A number of "any"-constrained defs at "late", to add register pressure. The existing fuzzing function generator would create at most 3 operands, and would add *either* fixed-reg constraints *or* clobbers, not both. This PR adds a "callsite-ish constraints" branch as a subcase of the fixed-reg choice that also adds the "any" defs to add register pressure, and adds new clobbers that don't interfere with existing constraints. It also loosens the logic around constructing fixed-reg constraints: if a conflict is found with some arbitrarily-chosen constraint, it keeps going to the next operand and tries to make it fixed, rather than giving up on further operands.
This is a followup from bytecodealliance#214. We have use-cases in Cranelift now where we generate constraints on call instructions that have: - A number of fixed-reg uses at "early"; - A number of fixed-reg defs at "late"; - A number of register clobbers, potentially overlapping with uses; - A number of "any"-constrained defs at "late", to add register pressure. The existing fuzzing function generator would create at most 3 operands, and would add *either* fixed-reg constraints *or* clobbers, not both. This PR adds a "callsite-ish constraints" branch as a subcase of the fixed-reg choice that also adds the "any" defs to add register pressure, and adds new clobbers that don't interfere with existing constraints. It also loosens the logic around constructing fixed-reg constraints: if a conflict is found with some arbitrarily-chosen constraint, it keeps going to the next operand and tries to make it fixed, rather than giving up on further operands.
This is a followup from #214. We have use-cases in Cranelift now where we generate constraints on call instructions that have: - A number of fixed-reg uses at "early"; - A number of fixed-reg defs at "late"; - A number of register clobbers, potentially overlapping with uses; - A number of "any"-constrained defs at "late", to add register pressure. The existing fuzzing function generator would create at most 3 operands, and would add *either* fixed-reg constraints *or* clobbers, not both. This PR adds a "callsite-ish constraints" branch as a subcase of the fixed-reg choice that also adds the "any" defs to add register pressure, and adds new clobbers that don't interfere with existing constraints. It also loosens the logic around constructing fixed-reg constraints: if a conflict is found with some arbitrarily-chosen constraint, it keeps going to the next operand and tries to make it fixed, rather than giving up on further operands.
This PR rewords minimal-bundle logic to be consistent between property computation and splitting, and overall simpler. To recap: a "minimal bundle" is an allocation bundle that is as small as possible. Because RA2 does backtracking, i.e., can kick a bundle out of an allocation to make way for another, and because it can split, i.e. create more work for itself, we need to ensure the algorithm "bottoms out" somewhere to ensure termination. The way we do this is by defining a "minimal bundle": this is a bundle that cannot be split further. A minimal bundle is never evicted, and any allocatable input (set of uses/defs with constraints), when split into minimal bundles, should result in no conflicts. We thus want to define minimal bundles as *minimally* as possible so that we have the maximal solving capacity. For a long time, we defined minimal bundles as those spanning a single instruction (before- and after- progpoints) -- because we cannot insert moves in the middle of an instruction. In bytecodealliance#214, we updated minimal-bundle splitting to avoid putting two different unrelated uses in a single-instruction bundle, beacuse doing so and calling it "minimal" (unsplittable) can artificially extend the liverange of a use with a fixed-reg constraint at the before-point into the after-point, causing an unsolveable conflict. This was triggered by new and tighter constraints on callsites in Cranelift after bytecodealliance/wasmtime#10502 (merging retval defs into calls) landed. Unfortunately this also resulted in an infinite allocation loop, because the definition of "minimal bundle" did not agree between the split-into-minimal-bundles fallback/last-ditch code, and the bundle property computation. The splitter was splitting as far as it was willing to go, but our predicate didn't consider those bundles minimal, so we continued to re-attempt splitting indefinitely. While investigating this, I found that the minimal-bundle concept had accumulated significant cruft ("the detritus of dead fuzzbugs") and this tech-debt was making things more confusing than not -- so I started by clearly defining what a minimal bundle *is*. Precisely: - A single use, within a single LiveRange; - With that LiveRange having a program-point span consistent with the use: - Early def: whole instruction (must live past Late point so it can reach its uses; moves not possible within inst); - Late def: Late point only; - Early use: Early point only; - Late use: whole instruction (must be live starting at Early so the value can reach this use; moves not possible within inst). This is made easier and simpler than what we have before largely because the minimal-bundle splitter aggressively puts spans of LiveRange without uses into the spill bundle, and because we support overlapping LiveRanges for a vreg now (thanks Trevor!), so we can rely on having *some* connectivity between the def and its uses even if we aggressively trim LiveRanges in the minimal bundles down to just their defs/uses. Fixes bytecodealliance#218, bytecodealliance#219.
This PR rewords minimal-bundle logic to be consistent between property computation and splitting, and overall simpler. To recap: a "minimal bundle" is an allocation bundle that is as small as possible. Because RA2 does backtracking, i.e., can kick a bundle out of an allocation to make way for another, and because it can split, i.e. create more work for itself, we need to ensure the algorithm "bottoms out" somewhere to ensure termination. The way we do this is by defining a "minimal bundle": this is a bundle that cannot be split further. A minimal bundle is never evicted, and any allocatable input (set of uses/defs with constraints), when split into minimal bundles, should result in no conflicts. We thus want to define minimal bundles as *minimally* as possible so that we have the maximal solving capacity. For a long time, we defined minimal bundles as those spanning a single instruction (before- and after- progpoints) -- because we cannot insert moves in the middle of an instruction. In bytecodealliance#214, we updated minimal-bundle splitting to avoid putting two different unrelated uses in a single-instruction bundle, beacuse doing so and calling it "minimal" (unsplittable) can artificially extend the liverange of a use with a fixed-reg constraint at the before-point into the after-point, causing an unsolveable conflict. This was triggered by new and tighter constraints on callsites in Cranelift after bytecodealliance/wasmtime#10502 (merging retval defs into calls) landed. Unfortunately this also resulted in an infinite allocation loop, because the definition of "minimal bundle" did not agree between the split-into-minimal-bundles fallback/last-ditch code, and the bundle property computation. The splitter was splitting as far as it was willing to go, but our predicate didn't consider those bundles minimal, so we continued to re-attempt splitting indefinitely. While investigating this, I found that the minimal-bundle concept had accumulated significant cruft ("the detritus of dead fuzzbugs") and this tech-debt was making things more confusing than not -- so I started by clearly defining what a minimal bundle *is*. Precisely: - A single use, within a single LiveRange; - With that LiveRange having a program-point span consistent with the use: - Early def: whole instruction (must live past Late point so it can reach its uses; moves not possible within inst); - Late def: Late point only; - Early use: Early point only; - Late use: whole instruction (must be live starting at Early so the value can reach this use; moves not possible within inst). This is made easier and simpler than what we have before largely because the minimal-bundle splitter aggressively puts spans of LiveRange without uses into the spill bundle, and because we support overlapping LiveRanges for a vreg now (thanks Trevor!), so we can rely on having *some* connectivity between the def and its uses even if we aggressively trim LiveRanges in the minimal bundles down to just their defs/uses. The split-at-program-point splitter (i.e., not the fallback split-into-minimal-bundles splitter) also got a small fix related to this: it has a mode that was intended to "split off one use" if we enter with a split-point at the start of the bundle, but this was really splitting off all uses at the program point (if there are multiple of the same vreg at the same program point). In the case that we still need to split these apart, this just falls back to the minimal-bundle splitter now. Fixes bytecodealliance#218, bytecodealliance#219.
This PR reworks minimal-bundle logic to be consistent between property computation and splitting, and overall simpler. To recap: a "minimal bundle" is an allocation bundle that is as small as possible. Because RA2 does backtracking, i.e., can kick a bundle out of an allocation to make way for another, and because it can split, i.e. create more work for itself, we need to ensure the algorithm "bottoms out" somewhere to ensure termination. The way we do this is by defining a "minimal bundle": this is a bundle that cannot be split further. A minimal bundle is never evicted, and any allocatable input (set of uses/defs with constraints), when split into minimal bundles, should result in no conflicts. We thus want to define minimal bundles as *minimally* as possible so that we have the maximal solving capacity. For a long time, we defined minimal bundles as those spanning a single instruction (before- and after- progpoints) -- because we cannot insert moves in the middle of an instruction. In bytecodealliance#214, we updated minimal-bundle splitting to avoid putting two different unrelated uses in a single-instruction bundle, beacuse doing so and calling it "minimal" (unsplittable) can artificially extend the liverange of a use with a fixed-reg constraint at the before-point into the after-point, causing an unsolveable conflict. This was triggered by new and tighter constraints on callsites in Cranelift after bytecodealliance/wasmtime#10502 (merging retval defs into calls) landed. Unfortunately this also resulted in an infinite allocation loop, because the definition of "minimal bundle" did not agree between the split-into-minimal-bundles fallback/last-ditch code, and the bundle property computation. The splitter was splitting as far as it was willing to go, but our predicate didn't consider those bundles minimal, so we continued to re-attempt splitting indefinitely. While investigating this, I found that the minimal-bundle concept had accumulated significant cruft ("the detritus of dead fuzzbugs") and this tech-debt was making things more confusing than not -- so I started by clearly defining what a minimal bundle *is*. Precisely: - A single use, within a single LiveRange; - With that LiveRange having a program-point span consistent with the use: - Early def: whole instruction (must live past Late point so it can reach its uses; moves not possible within inst); - Late def: Late point only; - Early use: Early point only; - Late use: whole instruction (must be live starting at Early so the value can reach this use; moves not possible within inst). This is made easier and simpler than what we have before largely because the minimal-bundle splitter aggressively puts spans of LiveRange without uses into the spill bundle, and because we support overlapping LiveRanges for a vreg now (thanks Trevor!), so we can rely on having *some* connectivity between the def and its uses even if we aggressively trim LiveRanges in the minimal bundles down to just their defs/uses. The split-at-program-point splitter (i.e., not the fallback split-into-minimal-bundles splitter) also got a small fix related to this: it has a mode that was intended to "split off one use" if we enter with a split-point at the start of the bundle, but this was really splitting off all uses at the program point (if there are multiple of the same vreg at the same program point). In the case that we still need to split these apart, this just falls back to the minimal-bundle splitter now. Fixes bytecodealliance#218, bytecodealliance#219.
This PR reworks minimal-bundle logic to be consistent between property computation and splitting, and overall simpler. To recap: a "minimal bundle" is an allocation bundle that is as small as possible. Because RA2 does backtracking, i.e., can kick a bundle out of an allocation to make way for another, and because it can split, i.e. create more work for itself, we need to ensure the algorithm "bottoms out" somewhere to ensure termination. The way we do this is by defining a "minimal bundle": this is a bundle that cannot be split further. A minimal bundle is never evicted, and any allocatable input (set of uses/defs with constraints), when split into minimal bundles, should result in no conflicts. We thus want to define minimal bundles as *minimally* as possible so that we have the maximal solving capacity. For a long time, we defined minimal bundles as those spanning a single instruction (before- and after- progpoints) -- because we cannot insert moves in the middle of an instruction. In #214, we updated minimal-bundle splitting to avoid putting two different unrelated uses in a single-instruction bundle, beacuse doing so and calling it "minimal" (unsplittable) can artificially extend the liverange of a use with a fixed-reg constraint at the before-point into the after-point, causing an unsolveable conflict. This was triggered by new and tighter constraints on callsites in Cranelift after bytecodealliance/wasmtime#10502 (merging retval defs into calls) landed. Unfortunately this also resulted in an infinite allocation loop, because the definition of "minimal bundle" did not agree between the split-into-minimal-bundles fallback/last-ditch code, and the bundle property computation. The splitter was splitting as far as it was willing to go, but our predicate didn't consider those bundles minimal, so we continued to re-attempt splitting indefinitely. While investigating this, I found that the minimal-bundle concept had accumulated significant cruft ("the detritus of dead fuzzbugs") and this tech-debt was making things more confusing than not -- so I started by clearly defining what a minimal bundle *is*. Precisely: - A single use, within a single LiveRange; - With that LiveRange having a program-point span consistent with the use: - Early def: whole instruction (must live past Late point so it can reach its uses; moves not possible within inst); - Late def: Late point only; - Early use: Early point only; - Late use: whole instruction (must be live starting at Early so the value can reach this use; moves not possible within inst). This is made easier and simpler than what we have before largely because the minimal-bundle splitter aggressively puts spans of LiveRange without uses into the spill bundle, and because we support overlapping LiveRanges for a vreg now (thanks Trevor!), so we can rely on having *some* connectivity between the def and its uses even if we aggressively trim LiveRanges in the minimal bundles down to just their defs/uses. The split-at-program-point splitter (i.e., not the fallback split-into-minimal-bundles splitter) also got a small fix related to this: it has a mode that was intended to "split off one use" if we enter with a split-point at the start of the bundle, but this was really splitting off all uses at the program point (if there are multiple of the same vreg at the same program point). In the case that we still need to split these apart, this just falls back to the minimal-bundle splitter now. Fixes #218, #219.
When exerting additional pressure on regalloc with bytecodealliance/wasmtime#10502, which can lead to call instructions that have significantly more (
any-constrained) defs, we hit panics in RA2 where (i) a bundle merged several LiveRanges, (ii) one of these LiveRanges had a fixed-reg constraint on an early use, (iii) this fixed-reg constraint conflicted with a clobber (which always happens at the late-point), (iv) the bundle merged in another LiveRange of some arbitrary def at the late point.This would make a bundle (which is the atomic unit of allocation) that covers the whole inst, including the late point; and is required to be in the fixed reg; this is unallocatable because the clobber is also at the late point in that reg.
Our allocate-or-split-and-retry logic does not split if a bundle is "minimal". This is meant to give a base case to the retries: when bundles break down into their minimal pieces, any solvable set of constraints should result in allocations.
Unfortunately the "is minimal" predicate definition did not account for multiple LiveRanges, but rather only tested whether the total program-point range of the bundle was over one instruction. If there are multiple LiveRanges, we can still split them off, and the resulting split bundles may cover only half the instruction, avoiding the clobbers.