Page MenuHomePhabricator

[SelDag] Add FREEZE
ClosedPublic

Authored by aqjune on Jan 23 2017, 3:56 AM.

Details

Summary
  • Add FREEZE node to SelDag
  • Lower FreezeInst (in IR) to FREEZE node
  • Add Legalization for FREEZE node

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Do you also want a MIR test that shows IR freeze -> MIR freeze lowering?

aqjune updated this revision to Diff 229230.Nov 13 2019, 10:45 PM
aqjune edited the summary of this revision. (Show Details)
  • Add freeze IR -> MIR test
aqjune marked an inline comment as done.Nov 13 2019, 10:46 PM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-mir.ll
1 ↗(On Diff #229230)

I made a separate freeze-mir.ll file for mir test because update_mir_test_checks.py did not work after freeze.ll is processed with update_llc_test_checks.py

Hmm, thanks, i think this now looks good to me.
@craig.topper / @qcolombet / @arsenm ?

llvm/test/CodeGen/X86/freeze-call.ll
7–12 ↗(On Diff #229230)

You can avoid cfi noise via define i32 @foo() nounwind {

aqjune updated this revision to Diff 229443.Nov 14 2019, 8:47 PM
  • Add nounwind to tests
arsenm added inline comments.Nov 14 2019, 8:51 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

Why isNotDuplicable? I don't think this is the best maintained instruction property

aqjune marked an inline comment as done.Nov 14 2019, 9:36 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

It is because different freeze defs can yield different values. For example, consider following transformation:

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
use(%x) // these two uses should see the same freezed value
->
%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x') // It is possible that %x and %x' are assigned differently freezed values.

This transformation is incorrect, because use()s after the transformation can see different values.
To prevent this class of optimizations, isNotDuplicable is set to 1.

arsenm added inline comments.Nov 14 2019, 9:47 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

But these both are using the same undef vreg? The second freeze is still using the original undef, so this should be fine?

arsenm added inline comments.Nov 14 2019, 9:53 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

To clarify, I think the property you are really looking for is not the property given to you by isNotDuplicable. What you really care about is the input operand not using a new instance of undef. The "nonduplicability" is a function of the input value and not the instruction itself

aqjune marked an inline comment as done.Nov 14 2019, 9:55 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

The second use of the undef register can be assigned a different physical register at (perhaps) register allocation, after seeing that its definition is IMPLICIT_DEF.

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x')
=>
%x = FREEZE $eax
use(%x)
%x' = FREEZE $ebx
use(%x')

This transformation can happen if there is a function call between them. test/CodeGen/X86/freeze-call.ll checks that it never happens.

arsenm added inline comments.Nov 14 2019, 9:59 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

I think handling this correctly requires changing to how IMPLICIT_DEF is handled in register allocation. I don't think noduplicate is really going to model this constraint sufficiently

aqjune marked an inline comment as not done.Nov 14 2019, 10:13 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

It would make complexity of this patch (lowering of freeze IR instruction) easier, but would the change in register allocation need performance tests?
For example, if all uses of IMPLICIT_DEF should see a same caller-save register, the register should be spilled whenever its liveness crosses a function call.
Currently IMPLICIT_DEF is commented as MachineInstr-level equivalent of undef, and undef`may evaluate to different values as well, so I wonder whether there are passes other than regalloc that use that semantics too.

isNotDuplicable's comment says

class MCInstrDesc {
...
  /// Return true if this instruction cannot be safely
  /// duplicated.  For example, if the instruction has a unique labels attached
  /// to it, duplicating it would cause multiple definition errors.
  bool isNotDuplicable() const { return Flags & (1ULL << MCID::NotDuplicable); }

so I thought non-duplicability was a property of the freeze instruction itself.

arsenm added inline comments.Nov 14 2019, 10:40 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

I mean the definition of IMPLICIT_DEF needs to change to support this. I think we may actually need two different flavors of IMPLICIT_DEF. noduplicate is a stronger barrier to useful transforms, and also doesn't model the real problem here.

If you started out with two separate freeze instructions reading the same IMPLICIT_DEF register, you would still see the same problem in your example.

aqjune added inline comments.Nov 14 2019, 11:36 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

I mean the definition of IMPLICIT_DEF needs to change to support this. I think we may actually need two different flavors of IMPLICIT_DEF.

Understood. So we can have two kinds of IMPLICIT_DEFs; the first one is an instruction that can be optimized to different values whenever used. The second one has the same value whenever it is used.

I think the second IMPLICIT_DEF is equivalent to FREEZE (IMPLICIT_DEF of the first kind), because freeze is an instruction that guarantees all uses of the value see the same value even though the operand is an undef-like value.

If you started out with two separate freeze instructions reading the same IMPLICIT_DEF register, you would still see the same problem in your example.

If the program originally contained two separated freezes, then it is okay, because the program is originally written in that way. The programmer must have written down two freezes in LLVM IR bitcode, and they are lowered into MIR.
The problematic case is when a program with a single freeze is optimized to a program with several freezes. Different freezes can return different values.

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
use(%x) // these two uses see the same value with help of freeze.
->
%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x') // Now %x and %x' can be assigned different values, so this optimization is wrong.
arsenm added inline comments.Nov 14 2019, 11:50 PM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

I think it's important to define the rules here clearly for MIR, and not rely on expectations from the original IR. Some legalization for example may end up inserting multiple freezes of the same input value, and that should work correctly. Multiple freezes of the same input register should produce the same value.

aqjune added inline comments.Nov 15 2019, 12:49 AM
llvm/include/llvm/Target/Target.td
1067 ↗(On Diff #229443)

I agree, I'll write the rule for freeze.
Regarding legalization, in which case can it be copied? The case that I was aware of was splitting a register of illegal size, which copies freeze but does not increase # of uses per definition:

%x = ... // invalid type, say i50
%y = freeze %x
  =>
%x1 = ... // i32
%x2 = ... // i18
%y1 = freeze %x1
%y2 = freeze %x2
liuz added a subscriber: liuz.Nov 15 2019, 8:39 PM
lebedev.ri requested changes to this revision.Nov 22 2019, 3:19 AM

Looks like this is waiting on an update.
Really happy to see this progressing.

This revision now requires changes to proceed.Nov 22 2019, 3:19 AM
aqjune updated this revision to Diff 230899.Nov 25 2019, 7:27 AM
aqjune edited the summary of this revision. (Show Details)

I added description about FREEZE and IMPLICIT_DEF to TargetOpcodes.def.
To address the legalization issue, updated the semantics so FREEZEs return the same value if placed consecutively. Would this properly address the issue?

I'm sorry, it seems the track of this patch was lost.
We really need to get this last bit of freeze support landed.
Can anyone spot any remaining issues with the patch?

craig.topper added inline comments.Jan 8 2020, 3:36 PM
llvm/include/llvm/Support/TargetOpcodes.def
61 ↗(On Diff #230899)

constraint -> constrain

nlopes added a comment.Jan 9 2020, 6:26 AM

The patch & semantics look good to me, but I'm not a backend expert. I'll leave the final LGTM to someone else.
It would be awesome if we could get this in for 10.0 so that we have complete support for freeze.

aqjune updated this revision to Diff 237090.Jan 9 2020, 8:09 AM
aqjune retitled this revision from [SelDag][MIR] Add FREEZE to [SelDag][MIR] Add FREEZE.
  • Rebase
  • Address a comment
aqjune added a comment.Jan 9 2020, 8:20 AM

To synchronize - the remaining issue was about duplicability of the freeze instruction in MIR.

In IR, a duplicated freeze may result in a different value due to the nature of undef. However, in MIR, this may be problematic in some cases.

My suggestion is to allow freeze in MIR to yield the same value when they are consecutively arranged.
This involved clarificaton of when the value of IMPLICIT_DEF could change during execution, and the updated comments of TargetOpcodes.def reflects that.

Currently, isNotDuplicable is still set to 1 to leave it as conservative & copiers deal with freeze specially in the future.

aqjune added a comment.Jan 9 2020, 4:16 PM

BTW, there was a discussion about IMPLICIT_DEF as well, about whether it should return consistent value for every use vs. it can return different values per use (as IR's undef).

Currently IMPLICIT_DEF can yieid different values, and there are a few cases where that happens (such as using IMPLICIT_DEF across function calls).

Making it return consistent value will allow FREEZE to be duplicable, but it would require fixing several transformations that deal with IMPLICIT_DEF.

Before discussion about IMPLICIT_DEF is further made and it is changed to return concrete value, I'd like to suggest leaving FREEZE as semi-duplicable (as suggested by this patch), but not fully duplicable.

efriedma added inline comments.Jan 17 2020, 4:21 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1135 ↗(On Diff #237090)

If we expand FREEZE in ExpandPostRAPseudos, how can we reach this code?

aqjune updated this revision to Diff 239013.Jan 19 2020, 4:00 PM
  • Rebase
  • Remove AsmPrinter::emitFreeze
aqjune marked 2 inline comments as done.Jan 19 2020, 4:05 PM
aqjune added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1135 ↗(On Diff #237090)

It was a legacy code to see the freeze comment from assembly before updating ExpandPostRAPseudos. I removed it.

aqjune marked an inline comment as done.Jan 19 2020, 4:40 PM

For the next step, what can I do?

If duplicability is the main concern, I can split this patch so (1) FREEZE is defined to be duplicable first, and (2) when there happens a practical case where duplication matters, the next one is created.

It will reduce diffs in TargetOpcodes.def as well because the comment update at IMPLICIT_DEF is not needed anymore.

aqjune marked 3 inline comments as done.Jan 19 2020, 4:40 PM

ping

If adding FREEZE to MachineIR causes maintenance cost, another possible solution is to lower SelDag's FREEZE(UNDEF) to IMPLICIT_DEF directly.
In this option, FREEZE only exist in SelDag, not in MIR.
Instead, now IMPLICIT_DEF in MIR should always yield the same value. This requires a few transformations regarding IMPLICIT_DEF in MIR to be fixed.
This may cause suboptimal codegen in a few cases, but I believe we have a solution for the cases, because a single UNDEF from SelDag can be always lowered into multiple IMPLICIT_DEFs (per each use) in MIR.

lebedev.ri accepted this revision.Feb 26 2020, 2:13 AM
lebedev.ri added a reviewer: arsenm.

This seems good to me, but you want to wait for someone else more knowledgeable..
ping @arsenm @qcolombet @craig.topper

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
65–68 ↗(On Diff #241401)

This is unrelated, please feel free to commit this right away.

This revision is now accepted and ready to land.Feb 26 2020, 2:13 AM
aqjune updated this revision to Diff 246778.Feb 26 2020, 9:58 AM
  • Rebase
  • Make the update in getOperationCost as a separate commit (4f71252)
aqjune marked an inline comment as done.Feb 26 2020, 10:02 AM
aqjune updated this revision to Diff 251955.Mar 23 2020, 12:19 AM

Leave the SelDag patch only, MachineIR part will be made as a separate patch (as discussed at D76483)

aqjune edited the summary of this revision. (Show Details)Mar 23 2020, 12:19 AM
aqjune retitled this revision from [SelDag][MIR] Add FREEZE to [SelDag] Add FREEZE.
spatel added inline comments.Mar 23 2020, 1:29 PM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Do we need to add basic simplify / constant folding for SDAG ?
freeze ( Constant ) --> Constant

aqjune marked an inline comment as done.Mar 23 2020, 5:10 PM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Yes, I agree it will be great. Do you want to make this patch contain the change as well?

aqjune marked an inline comment as done.Mar 24 2020, 5:45 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Or I can land this first, and add the simplify / constant folding for SDAG.
I prefer incrementally making things because this patch itself is a big change.

spatel added inline comments.Mar 24 2020, 6:10 AM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

I prefer smaller patches too. Let's make that a follow-up.

Is it correct that there is very little chance that this patch will create a visible performance regression (because there should be almost no freeze instruction creation in IR yet)?

aqjune marked an inline comment as done.Mar 24 2020, 6:56 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Yes, it is.
Currently there is only one place where freeze is introduced - https://reviews.llvm.org/D76179
I checked that from assembly outputs of LLVM test-suite , only 3 / 5239 files are affected by this patch.

This revision was automatically updated to reflect the committed changes.
spatel added inline comments.Mar 24 2020, 7:37 AM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

That sounds good then. But can we avoid those 3 regressions cases before or within this patch? Ideally, we don't want to knowingly regress anything.

aqjune marked an inline comment as done.Mar 24 2020, 8:22 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Among 3 files, two were simple regressions that had a bit more verbose assembly:

...
sete  %al    
orb %bpl, %al
jne .LBB9_1  
=>
...
sete  %al    
orb %bpl, %al
testb $1, %al
jne .LBB9_1
...
testb $1, %al
je  .LBB0_2  
=>
...
movl  %eax, %ecx 
andl  $1, %ecx   
je  .LBB0_2

Case 3's assembly diff was bigger, so needs inspection. I can visit it after the simpler two cases are resolved.

@aqjune

@bkramer committed a fix for a crash in LegalizeFloatTypes where this operator appeared in SoftPromoteHalfResult

Our downstream ARM compiler is crashing in regressions in a similar manner in SoftenFloatResult in the same file. Is it not conceivable that ISD::FREEZE might make it to this function? If that's the case, we can look into it and start up a new review if necessary. The test causing this crash is select-cc.ll.

@aqjune

@bkramer committed a fix for a crash in LegalizeFloatTypes where this operator appeared in SoftPromoteHalfResult

Our downstream ARM compiler is crashing in regressions in a similar manner in SoftenFloatResult in the same file. Is it not conceivable that ISD::FREEZE might make it to this function? If that's the case, we can look into it and start up a new review if necessary. The test causing this crash is select-cc.ll.

Hi, I saw @bkramer 's fix too (commit id 0019c2f194a5e1f4cd65c5284e204328cc40ab3d ), and I think it is a valid fix. Similar thing can be applied to the SoftenFloatResult function. I'll make a patch for this.