⚙ D60413 [BDCE] SExt -> ZExt when no sign bits is used and instruction has multiple uses

dnsampaio created this revision.Apr 8 2019, 10:23 AM

Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2019, 10:23 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

InstCombine knows how to do this, but only when there is a single use of the sext. Multiple uses is much more difficult to do. This would probably fit best in AggressiveInstCombine.

Added transformation for comparing.

Moved to agressiveInstCombine.

dnsampaio updated this revision to Diff 194704.Apr 11 2019, 8:59 AM

dmgreen added a subscriber: dmgreen.Apr 11 2019, 10:12 AM

Moved conversion to the existing folding loop

Harbormaster completed remote builds in B31002: Diff 196670.Apr 25 2019, 10:12 AM

Rebase

Herald added a subscriber: hiraditya. · View Herald TranscriptJul 7 2020, 5:25 AM

dnsampaio edited the summary of this revision. (Show Details)Jul 7 2020, 5:27 AM

dnsampaio added reviewers: dmgreen, eli.friedman.

lebedev.ri retitled this revision from SExt -> ZExt when no sign bits is used with multiple uses to [AggressiveInstCombine] SExt -> ZExt when no sign bits is used with multiple uses.Jul 7 2020, 5:29 AM

lebedev.ri added reviewers: spatel, lebedev.ri, nikic.

Why doesn't InstCombiner::SimplifyDemandedUseBits() handle this?
I would have expected this to already deal with it.

In D60413#2135831, @lebedev.ri wrote:

Why doesn't InstCombiner::SimplifyDemandedUseBits() handle this?
I would have expected this to already deal with it.

SimplifyDemandedUseBits() can only handle single-use cases, because demanded bits are only computed for a specific use. There is SimplifyMultipleUseDemandedBits(), but it can only simplify a specific use-site, not replace a whole instruction.

This patch has the right general idea, in that DemandedBits is the analysis that can determine this for the multi-use case. However, I'm not very comfortable with performing a full demanded bits calculation (on the whole function) in AggressiveInstCombine, just for this purpose. I think it would be better to repurpose BDCE (which already computes DemandedBits) to be a bit more general and also perform some demanded-bits based folds there, rather than only DCE. (This is similar to how SCCP has recently started replacing sext with zext if possible, even though that is not the primary purpose of that pass.)

nikic added inline comments.Jul 7 2020, 8:58 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
356 ↗	(On Diff #276016)	It's quite likely that this analysis may get invalidated by some of the performed transforms. AssumptionCache should also be made a pass dependency, not constructed inline. But as mentioned, I would recommend moving this into BDCE instead.

Moved to BDCE.

Harbormaster completed remote builds in B63346: Diff 276291.Jul 7 2020, 8:10 PM

Thanks, this looks about right to me, but please wait for @nikic.

This revision is now accepted and ready to land.Jul 8 2020, 1:20 AM

spatel added inline comments.Jul 8 2020, 5:02 AM

llvm/test/Transforms/AggressiveInstCombine/sext_multi_uses.ll
2	Run line should be more like this: opt -S -bdce < %s The test file should be moved to this folder: https://github.com/llvm/llvm-project/tree/master/llvm/test/Transforms/BDCE (please also update the title of this review / commit message)

Fixed test location and command

dnsampaio retitled this revision from [AggressiveInstCombine] SExt -> ZExt when no sign bits is used with multiple uses to [BDCE] SExt -> ZExt when no sign bits is used with multiple uses.Jul 8 2020, 7:13 AM

dnsampaio retitled this revision from [BDCE] SExt -> ZExt when no sign bits is used with multiple uses to [BDCE] SExt -> ZExt when no sign bits is used and instruction has multiple uses.

Harbormaster completed remote builds in B63414: Diff 276423.Jul 8 2020, 8:10 AM

nikic requested changes to this revision.Jul 8 2020, 9:25 AM

nikic added inline comments.

llvm/lib/Transforms/Scalar/BDCE.cpp
127	You probably need to `clearAssumptionsOfUsers()` here. Please check this test case: https://alive2.llvm.org/ce/z/caMis2

This revision now requires changes to proceed.Jul 8 2020, 9:25 AM

Clear users assumptions (and test it)

dnsampaio added inline comments.Jul 9 2020, 2:40 AM

llvm/lib/Transforms/Scalar/BDCE.cpp
127	Indeed, many thanks.

Harbormaster completed remote builds in B63557: Diff 276674.Jul 9 2020, 3:09 AM

In D60413#2139023, @dnsampaio wrote:

Fixed test location and command

The file is not moved here in the review, so something may be out-of-sync. The best thing would be to commit that file first with the current CHECK lines, then update it after applying this code patch. That way, we will highlight the test diffs.

In D60413#2141645, @spatel wrote:

In D60413#2139023, @dnsampaio wrote:

Fixed test location and command

The file is not moved here in the review, so something may be out-of-sync. The best thing would be to commit that file first with the current CHECK lines, then update it after applying this code patch. That way, we will highlight the test diffs.

@spatel Ups, sorry about that, I got two working environments now wfh, one was not fixed. Will do as you recommend.

Re-fixed test file, now showing only differences

Harbormaster completed remote builds in B63590: Diff 276741.Jul 9 2020, 8:56 AM

LG

llvm/lib/Transforms/Scalar/BDCE.cpp
125	Please pass `SE->getName()` here to preserve the instruction name.

This revision is now accepted and ready to land.Jul 9 2020, 9:46 AM

dnsampaio marked an inline comment as done.Jul 10 2020, 12:01 AM

Preserve name

Closed by commit rG7bf168390fd0: [BDCE] SExt -> ZExt when no sign bits is used and instruction has multiple uses (authored by Diogo Sampaio <diogo.sampaio@arm.com>). · Explain WhyJul 10 2020, 12:35 AM

This revision was automatically updated to reflect the committed changes.

Harbormaster completed remote builds in B63697: Diff 276926.Jul 10 2020, 12:38 AM

Having some problems with this patch downstream:

Patch does not update any dbg.value that might refer to the value that now is zext:ed instead of sext:ed. So debugging experience might become weird. It might be possible to rewrite debug uses, using some nifty debug expression to express a trunc+sext of the value. But in my case it was an <128 x i40> vector, and we got no dwarf expressions to express a vector sext afaik. Instead I guess any debug uses should be invalidated.

My target really prefers sext before zext when it comes to extending from <128 x i32> to <128 x i40>. Well, at the moment the zext isn't even legal. So now I get compilation failures. Pretty hard to redo the (global) demanded bits analysis during ISel to perform the reverse transform. Not sure really how such problems are dealt with in general. For my use case it had been much better to hoist the trunc that implies that the upper bits is irrelevant, and that way avoiding an <128 x i40> vector with irrelevant upper bits altogether.

At least (1) is something that I think we need to do something about.
When it comes to (2) I'd welcome any suggestions on how I could avoid this transform for my target (or ideas about how to add a reverse transform closer to codegen).

In D60413#2223866, @bjope wrote:

Having some problems with this patch downstream:

Patch does not update any dbg.value that might refer to the value that now is zext:ed instead of sext:ed. So debugging experience might become weird. It might be possible to rewrite debug uses, using some nifty debug expression to express a trunc+sext of the value. But in my case it was an <128 x i40> vector, and we got no dwarf expressions to express a vector sext afaik. Instead I guess any debug uses should be invalidated.

Isn't that an issue for any of the other places that do such a transform, among other things?
This very transform also exists in InstCombine and CVP.

My target really prefers sext before zext when it comes to extending from <128 x i32> to <128 x i40>. Well, at the moment the zext isn't even legal. So now I get compilation failures. Pretty hard to redo the (global) demanded bits analysis during ISel to perform the reverse transform. Not sure really how such problems are dealt with in general. For my use case it had been much better to hoist the trunc that implies that the upper bits is irrelevant, and that way avoiding an <128 x i40> vector with irrelevant upper bits altogether.

You can always just do sext and then mask unneeded high bits away.

At least (1) is something that I think we need to do something about.
When it comes to (2) I'd welcome any suggestions on how I could avoid this transform for my target (or ideas about how to add a reverse transform closer to codegen).

In D60413#2223875, @lebedev.ri wrote:

In D60413#2223866, @bjope wrote:

Having some problems with this patch downstream:

Patch does not update any dbg.value that might refer to the value that now is zext:ed instead of sext:ed. So debugging experience might become weird. It might be possible to rewrite debug uses, using some nifty debug expression to express a trunc+sext of the value. But in my case it was an <128 x i40> vector, and we got no dwarf expressions to express a vector sext afaik. Instead I guess any debug uses should be invalidated.

Isn't that an issue for any of the other places that do such a transform, among other things?
This very transform also exists in InstCombine and CVP.

That might be a bigger problem ofcourse. But if a transform doing RAUW isn't replacing with the same thing, then I guess it need to salvage/undef the debug uses and only replace the non-debug uses. Otherwise I think the debug values will be incorrect.

My target really prefers sext before zext when it comes to extending from <128 x i32> to <128 x i40>. Well, at the moment the zext isn't even legal. So now I get compilation failures. Pretty hard to redo the (global) demanded bits analysis during ISel to perform the reverse transform. Not sure really how such problems are dealt with in general. For my use case it had been much better to hoist the trunc that implies that the upper bits is irrelevant, and that way avoiding an <128 x i40> vector with irrelevant upper bits altogether.

You can always just do sext and then mask unneeded high bits away.

At least (1) is something that I think we need to do something about.
When it comes to (2) I'd welcome any suggestions on how I could avoid this transform for my target (or ideas about how to add a reverse transform closer to codegen).

Oh, I wish I had a vector-AND. I can probably implement the zext using SHL+LSHR, but that is two operations instead of one (SEXT+AND would also be two ops instead of one single SEXT). The target also support sextload but not zextload, so I might miss out on that as well. In general the target got better support for sext than zext (also in some scalar cases). Sext is often for free, while zext isn't. Maybe I need to teach CodegenPrepare to turn zext into sext if the upper bits are irrelevant. Undoing all these transforms changing zext into sext.

One minor problem with implementing a reverse transform for targets that prefer sext over zext, e.g. in CodeGenPrepare, is that it will be hard to restore the debug info (considering that we do the right thing here an invalidate metadata uses).

Maybe we want to have anyext already in IR (to allow targets to use either zext or sext)?

I'm out in vacations, just have my phone here. I'm surprised I forgot to
preserve de debug information, sorry about that.

Regarding use of anyext on IR, I don't know how well that would play with
all other optimizations.

The sext to zext for example already exists in instruction combine, but
just for those with a single use. Perhaps these transformations could just
check the backend if it prefers sext or zext and decide to act based on
that?

I've submitted a PR, https://bugs.llvm.org/show_bug.cgi?id=47296, regarding the debuginfo problem.

For know I simply hardcoded a condition to avoid the transform downstream for out target (when vectors are involved), as a workaround to the problem with now having any simple way to lower a vector zext.

This is an archive of the discontinued LLVM Phabricator instance.

[BDCE] SExt -> ZExt when no sign bits is used and instruction has multiple uses
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 276291

llvm/lib/Transforms/Scalar/BDCE.cpp

llvm/test/Transforms/AggressiveInstCombine/sext_multi_uses.ll