This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Hide hoisted constants behind G_BITCAST to prevent folding.
AbandonedPublic

Authored by aemerson on May 9 2023, 12:32 AM.

Details

Summary

The constant hoisting pass tries to hoist large constants into predecessors and also
generates remat instructions in terms of the hoisted constants. These aim to prevent
codegen from rematerializing expensive constants multiple times. So we can re-use
this optimization, we can preserve the no-op bitcasts that are used to anchor
constants to the predecessor blocks.

SelectionDAG achieves this by having the OpaqueConstant node, which is just a
normal constant with an opaque flag set. I've opted to avoid introducing a new
constant generic instruction here.

However, once we've run all the combiners we can eliminate these G_BITCASTS
to allow selection to see them. For AArch64 it makes sense to do it at post-legalizer
combining, but *after* the actual combiner loop runs since we don't want it eliminated
before any constant folding runs.

This change by itself has very minor improvements in -Os CTMark overall. What this
does allow is better optimizations when future combines are added that rely on having
expensive constants remain unfolded.

Diff Detail

Event Timeline

aemerson created this revision.May 9 2023, 12:32 AM
aemerson requested review of this revision.May 9 2023, 12:32 AM
arsenm added a comment.May 9 2023, 1:10 AM

Seems like this is another one of those DAG workaround passes we ideally wouldn't need to keep running

llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-hoisted-constants.ll
30

Also add some vector cases?

Seems like this is another one of those DAG workaround passes we ideally wouldn't need to keep running

I thought so too, but reading its documentation more I think the problem it's solving is actually agnostic to selectors. That is, the core issue isn't that DAG can only see basic blocks, but one of balancing CSE & register pressure. We could stop running this IR pass (which does exist as IR due to SDAG's nature), but we'd just need to implement something similar in the GISel pipeline anyway.

llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-hoisted-constants.ll
30

Good point, constant hoisting doesn't touch vector constants so we can avoid this bitcast trickery for those.

aemerson updated this revision to Diff 520750.May 9 2023, 10:37 AM

Didn't notice the test failures before. One of them is the verifier complaining about the types being the same for G_BITCAST, so I've relaxed that restriction.

Just wondering: if we have to add some "special" bitcast that needs to be removed before ISel anyway (+ relax G_BITCAST check rules on top of it), couldn't we just add a dedicated opcode instead?
I'm thinking of something like a G_OPAQUE opcode that serves as an optimization hint that an instruction should be considered opaque and no combiner should attempt to see through it. It could just be removed right before ISel.

I'm wondering if we'll see more use cases for this "hack" later and eventually end up with a use-case that cannot use G_BITCAST (e.g. pointer type?), so we end up doing a similar hack but on another opcode for a different type, etc

For instance, a G_OPAQUE instruction like that could be used to prevent infinite combine loop between the target-independent and target combine rules. If the target combine wants to create a pattern that isn't considered optimal by the generic combiner, it could use that opaque instruction to prevent further optimizations.

Just wondering: if we have to add some "special" bitcast that needs to be removed before ISel anyway (+ relax G_BITCAST check rules on top of it), couldn't we just add a dedicated opcode instead?
I'm thinking of something like a G_OPAQUE opcode that serves as an optimization hint that an instruction should be considered opaque and no combiner should attempt to see through it. It could just be removed right before ISel.

I'm wondering if we'll see more use cases for this "hack" later and eventually end up with a use-case that cannot use G_BITCAST (e.g. pointer type?), so we end up doing a similar hack but on another opcode for a different type, etc

For instance, a G_OPAQUE instruction like that could be used to prevent infinite combine loop between the target-independent and target combine rules. If the target combine wants to create a pattern that isn't considered optimal by the generic combiner, it could use that opaque instruction to prevent further optimizations.

Yeah, I think that's ok too.

For instance, a G_OPAQUE instruction like that could be used to prevent infinite combine loop between the target-independent and target combine rules. If the target combine wants to create a pattern that isn't considered optimal by the generic combiner, it could use that opaque instruction to prevent further optimizations.

Yeah, I think that's ok too.

Isn't this what the DAG does? There are opaque constants

For instance, a G_OPAQUE instruction like that could be used to prevent infinite combine loop between the target-independent and target combine rules. If the target combine wants to create a pattern that isn't considered optimal by the generic combiner, it could use that opaque instruction to prevent further optimizations.

Yeah, I think that's ok too.

Isn't this what the DAG does? There are opaque constants

Pretty much, except Pierre’s suggestion is more of a dedicated barrier instruction (sort of like freeze), whereas OpaqueConstants are just normal constants with a flag bit set. I think a dedicated barrier is better since it’s messy to have two different constant representations in the MIR.