This is an archive of the discontinued LLVM Phabricator instance.

[SelDag] Add FREEZE
ClosedPublic

Authored by aqjune on Jan 23 2017, 3:56 AM.

Details

Summary
  • Add FREEZE node to SelDag
  • Lower FreezeInst (in IR) to FREEZE node
  • Add Legalization for FREEZE node

Event Timeline

aqjune created this revision.Jan 23 2017, 3:56 AM
mkuper added a subscriber: mkuper.Jan 24 2017, 11:41 AM
mkuper added inline comments.
include/llvm/CodeGen/ISDOpcodes.h
180 ↗(On Diff #85346)

Please add more documentation here.
In particular, a very short form of the explanation from SelectionDAGISel::Select_FREEZE() about the difference in semantics between FREEZE and UNDEF would be appropriate.

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
2238 ↗(On Diff #85346)

I'm a bit confused about this. The function above selects UNDEF to IMPLICIT_DEF.
Where do we do the duplication? If we duplicate UNDEFs on the DAG level, then everything's already fine, and you can just select a FREEZE directly to IMPLICIT_DEF.
If we're duplicating IMPLICIT_DEFs, then this is a problem regardless of what you do here, if the freeze ends up being an IMPLICIT_DEF. Or are you saying we're duplicating IMPLICIT_DEFs in the DAG, but not in MI? I'd find that surprising.

(Note that the documentation for IMPLICIT_DEF just says that "this is the MachineInstr-level equivalent of undef." If we're not allowed to duplicate IMPLICIT_DEFs and they have a single defined value, that should be corrected.)

Also, I didn't see a corresponding patch for FastISel. Did I miss one?

nlopes added a subscriber: nlopes.Jan 25 2017, 4:17 AM
nlopes added inline comments.
lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
2238 ↗(On Diff #85346)

If you have e.g.:
mul undef, undef

this eventually may get translated into:

%v1 = IMPLICIT_DEF
%v2 = IMPLICIT_DEF
%v3 = mul %v1, %v2

Even if it doesn't get translated to this explicitly, the semantics of UNDEF/IMPLICIT_DEF are that each read may see a different value. For example, IMPLICIT_DEF can be sank into a loop and each iteration may see a different value, and known-bits analyses will also take advantage of the fact that each bit can take any value it wants.
Therefore we use this "hack" of making a copy of the vreg where IMPLICIT_DEF is assigned to. This seems to guarantee that all uses see the same value, since there isn't a pass propagating IMPLICIT_DEF to copies.

aqjune updated this revision to Diff 85743.Jan 25 2017, 6:32 AM
  • FastISel for FREEZE added.
  • Updated description of FREEZE.
aqjune marked an inline comment as done.Jan 26 2017, 1:29 AM
mkuper added inline comments.Jan 26 2017, 2:36 PM
include/llvm/CodeGen/ISDOpcodes.h
180 ↗(On Diff #85743)

random -> arbitrary (sorry, it's a pet peeve :-) )

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
2238 ↗(On Diff #85346)

If I understand correctly, the freeze.ll test shows you do end up with just an IMPLICIT_DEF eventually.
So I'm not sure where the benefit is? You rely on backend phase ordering to elide the copy only after the last time IMPLICIT_DEF may be duplicated? This seems a bit fishy.

Anyway, this is just a drive-by, I defer to Quentin and the other if they think this is reasonable.

aqjune updated this revision to Diff 86018.Jan 26 2017, 11:23 PM
aqjune marked an inline comment as done.
regehr added a subscriber: regehr.Jan 28 2017, 10:03 PM
aqjune updated this revision to Diff 106874.Jul 17 2017, 8:05 AM

Rebased to svn commit 308173

aqjune updated this revision to Diff 145379.May 5 2018, 12:11 PM

Rebased to svn commit 331585

All these new llc tests - have you considered using utils/update_llc_test_checks.py ?

aqjune updated this revision to Diff 217190.Aug 26 2019, 10:21 AM

Rebase to trunk@369887 (August 26th, 2019)

I guess this is the next patch in the queue?
This needs rebasing, and more tests to match the langref wording.

aqjune updated this revision to Diff 224861.Oct 14 2019, 8:04 AM
  • Support floating points, pointers, aggregate types, add tests for those types

I believe vector legalization tests are missing.

Cost model handling might be missing (i would guess they should be treated as free)

This looks reasonable-ish, but i don't have much knowledge on this part, sadly.
I'm wondering if @RKSimon / @spatel / @craig.topper might want to help review this?

include/llvm/CodeGen/ISDOpcodes.h
179 ↗(On Diff #224861)

s/integer/value/

craig.topper added inline comments.Oct 14 2019, 10:21 AM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
385 ↗(On Diff #224861)

Just use ISD::FREEZE instead of N->getOpcode()

4021 ↗(On Diff #224861)

I think this should be with the other PromoteIntOp functions? I think that's how this file is sorted.

4021 ↗(On Diff #224861)

Does this function even get called? The input type and the output type are the same right? So PromoteIntRes should have been called first.

4025 ↗(On Diff #224861)

Same with this. We should expand the result and then we'll never have to expand the input separately.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3105 ↗(On Diff #224861)

Would we better off just handling freeze on its own instead of adding all this aggregate code to the generic visitUnary?

lebedev.ri requested changes to this revision.Oct 20 2019, 4:40 AM

(as per inline comments)

This revision now requires changes to proceed.Oct 20 2019, 4:40 AM
aqjune updated this revision to Diff 225827.Oct 21 2019, 1:04 AM
  • Remove DAGTypeLegalizer::PromoteIntOp_FREEZE, DAGTypeLegalizer::ExpandIntOp_FREEZE as they are never called
  • Add cost model of freeze
  • Add vector legalization test to freeze-legalize.ll
aqjune marked 6 inline comments as done.Oct 21 2019, 1:06 AM
aqjune added inline comments.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3105 ↗(On Diff #224861)

I put it here as visitBinary() was also handling opcodes, but I have no special preference.

@qcolombet - ping as per inline comment?
I'm also not really fully sold that the SelectionDAGISel::Select_FREEZE()
approach is fully future-proof/correct.

Other than that, nothing major stands out to me here.

include/llvm/Analysis/TargetTransformInfoImpl.h
853 ↗(On Diff #225827)

Also here
And in TargetTransformInfo::getInstructionThroughput()

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4055 ↗(On Diff #225827)

Spurious newline change

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3106 ↗(On Diff #225827)

assert(Opcode == ISD::FREEZE && "Can only get here for freeze` nodes");`

3110 ↗(On Diff #225827)
for (unsigned i = 0, NumOps = Op.getNumOperands(); i < NumOps; ++i)
lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
2238 ↗(On Diff #85346)

@qcolombet ping, please?

I was looking through MachineInstr's optimizations to recheck validity of using IMPLICIT_DEF, and found that it caused problems at two cases.

  1. When a function call is involved, its value is not preserved.
define i32 @foo()
  %y1 = freeze i32 undef
  %k = add i32 %y1, 10
  call i32 @g()
  %k2 = add i32 %y1, 20
  ..

->

_foo:
  addl  $10, %ebx
  callq _g
  addl  $20, %eax <- this is wrong because %eax and %ebx may differ
  ..
  1. When PHI node is involved, it is folded into an incorrect value.
  br i1 %cond, label %BB1, label %BB2
BB1:
  %y1 = freeze i32 undef
  %k1 = sub i32 %y1, 1
  br label %END
BB2:
  %y2 = freeze i32 undef
  br label %END
END:
  %p = phi i32 [%y1, %BB1], [%y2, %BB2]
  %p2 = phi i32 [%k1, %BB1], [0, %BB2]
  store i32 %p, i32* @x
  store i32 %p2, i32* @y

->

LBB0_3:                                 ## %END
  movl  %eax, _x(%rip)
  movl  %eax, _y(%rip) <- should be "%eax - 1" if %cond is true

But, other than these two (function call and phi), other operations seemed okay.
Arithmetic operations including add / mul didn't fold IMPLICIT_DEF in a undef-like way.

What would be a good way to address the two cases?
The latter seemed to be done by ProcessImplicitDefs.cpp / PHIElimination.cpp. The former one seems more complicated; it seems to require preservation of information about the original IMPLICIT_DEF before it is removed by later pipeline.

arsenm added a subscriber: arsenm.Oct 27 2019, 2:33 PM
arsenm added inline comments.
lib/CodeGen/SelectionDAG/FastISel.cpp
1573 ↗(On Diff #225827)

s/unsigned/Register

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3109 ↗(On Diff #225827)

1 seems a bit small for the size

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
2238 ↗(On Diff #85346)

I see one potential problem with using a COPY of IMPLICIT_DEF. Suppose an instruction has two registers operands with different register classes. A codegen pass may reasonably try to rematerialize one of the implicit_defs to the natural operand register class to avoid the copy to satisfy operand constraints

2302 ↗(On Diff #225827)

s/unsigned/Register/

test/CodeGen/X86/freeze.ll
89 ↗(On Diff #225827)

I'm not sure where Phabricator gets it syntax highlighting from, but it should add the freeze keyword

aqjune updated this revision to Diff 228705.Nov 11 2019, 8:56 AM
  • Add FREEZE pseudoinstruction for MachineIR
  • Let ExpandPostRAPseudos expand FREEZE
  • Address reviewers' comments
  • Add tests for function call / phi
  • Add a comment to TargetPassConfig stating that after ExpandPostRAPseudos IMPLICIT_DEF should not be optimized to different registers/constants
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2019, 8:56 AM

I added FREEZE pseudoinstruction to MachineIR, as it seemed to be the succinct way to make it correct.
I made ExpandPostRAPseudos expand the FREEZE to register-copy assembly instruction. ExpandPostRAPseudos pass is done after register allocation (which, or at least a pass relevant to it, seems to replace uses of IMPLICIT_DEF with different registers) as well as ProcessImplicitDef (which is the culprit of the incorrect PHI optimization example) and PHIElimination. I left a commit that explicitly states that IMPLICIT_DEF cannot be used in an undef-y way after the pass.

craig.topper added inline comments.Nov 11 2019, 7:50 PM
llvm/include/llvm/Support/TargetOpcodes.def
59

Comment says IMPLICIT_DEF

llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
1571

Why doesn't this TargetOpcode::Freeze?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3106

Separate this into just visitFreeze.

aqjune updated this revision to Diff 228816.Nov 11 2019, 10:58 PM
aqjune edited the summary of this revision. (Show Details)
  • Let FastISel emit FREEZE
  • Let SelectionDAGBuilder::visitFreeze lower freeze IR instructions
  • Minor fixes
aqjune marked 3 inline comments as done.Nov 11 2019, 10:59 PM
aqjune retitled this revision from [SelDag] Implement FREEZE node to [SelDag][MIR] Add FREEZE .Nov 11 2019, 11:07 PM
craig.topper added inline comments.Nov 11 2019, 11:42 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3104

Drop this change?

lkail added a subscriber: lkail.Nov 12 2019, 12:02 AM

Do you also want a MIR test that shows IR freeze -> MIR freeze lowering?

aqjune updated this revision to Diff 229230.Nov 13 2019, 10:45 PM
aqjune edited the summary of this revision. (Show Details)
  • Add freeze IR -> MIR test
aqjune marked an inline comment as done.Nov 13 2019, 10:46 PM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-mir.ll
2

I made a separate freeze-mir.ll file for mir test because update_mir_test_checks.py did not work after freeze.ll is processed with update_llc_test_checks.py

Hmm, thanks, i think this now looks good to me.
@craig.topper / @qcolombet / @arsenm ?

llvm/test/CodeGen/X86/freeze-call.ll
8–13

You can avoid cfi noise via define i32 @foo() nounwind {

aqjune updated this revision to Diff 229443.Nov 14 2019, 8:47 PM
  • Add nounwind to tests
arsenm added inline comments.Nov 14 2019, 8:51 PM
llvm/include/llvm/Target/Target.td
1067

Why isNotDuplicable? I don't think this is the best maintained instruction property

aqjune marked an inline comment as done.Nov 14 2019, 9:36 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067

It is because different freeze defs can yield different values. For example, consider following transformation:

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
use(%x) // these two uses should see the same freezed value
->
%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x') // It is possible that %x and %x' are assigned differently freezed values.

This transformation is incorrect, because use()s after the transformation can see different values.
To prevent this class of optimizations, isNotDuplicable is set to 1.

arsenm added inline comments.Nov 14 2019, 9:47 PM
llvm/include/llvm/Target/Target.td
1067

But these both are using the same undef vreg? The second freeze is still using the original undef, so this should be fine?

arsenm added inline comments.Nov 14 2019, 9:53 PM
llvm/include/llvm/Target/Target.td
1067

To clarify, I think the property you are really looking for is not the property given to you by isNotDuplicable. What you really care about is the input operand not using a new instance of undef. The "nonduplicability" is a function of the input value and not the instruction itself

aqjune marked an inline comment as done.Nov 14 2019, 9:55 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067

The second use of the undef register can be assigned a different physical register at (perhaps) register allocation, after seeing that its definition is IMPLICIT_DEF.

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x')
=>
%x = FREEZE $eax
use(%x)
%x' = FREEZE $ebx
use(%x')

This transformation can happen if there is a function call between them. test/CodeGen/X86/freeze-call.ll checks that it never happens.

arsenm added inline comments.Nov 14 2019, 9:59 PM
llvm/include/llvm/Target/Target.td
1067

I think handling this correctly requires changing to how IMPLICIT_DEF is handled in register allocation. I don't think noduplicate is really going to model this constraint sufficiently

aqjune marked an inline comment as not done.Nov 14 2019, 10:13 PM
aqjune added inline comments.
llvm/include/llvm/Target/Target.td
1067

It would make complexity of this patch (lowering of freeze IR instruction) easier, but would the change in register allocation need performance tests?
For example, if all uses of IMPLICIT_DEF should see a same caller-save register, the register should be spilled whenever its liveness crosses a function call.
Currently IMPLICIT_DEF is commented as MachineInstr-level equivalent of undef, and undef`may evaluate to different values as well, so I wonder whether there are passes other than regalloc that use that semantics too.

isNotDuplicable's comment says

class MCInstrDesc {
...
  /// Return true if this instruction cannot be safely
  /// duplicated.  For example, if the instruction has a unique labels attached
  /// to it, duplicating it would cause multiple definition errors.
  bool isNotDuplicable() const { return Flags & (1ULL << MCID::NotDuplicable); }

so I thought non-duplicability was a property of the freeze instruction itself.

arsenm added inline comments.Nov 14 2019, 10:40 PM
llvm/include/llvm/Target/Target.td
1067

I mean the definition of IMPLICIT_DEF needs to change to support this. I think we may actually need two different flavors of IMPLICIT_DEF. noduplicate is a stronger barrier to useful transforms, and also doesn't model the real problem here.

If you started out with two separate freeze instructions reading the same IMPLICIT_DEF register, you would still see the same problem in your example.

aqjune added inline comments.Nov 14 2019, 11:36 PM
llvm/include/llvm/Target/Target.td
1067

I mean the definition of IMPLICIT_DEF needs to change to support this. I think we may actually need two different flavors of IMPLICIT_DEF.

Understood. So we can have two kinds of IMPLICIT_DEFs; the first one is an instruction that can be optimized to different values whenever used. The second one has the same value whenever it is used.

I think the second IMPLICIT_DEF is equivalent to FREEZE (IMPLICIT_DEF of the first kind), because freeze is an instruction that guarantees all uses of the value see the same value even though the operand is an undef-like value.

If you started out with two separate freeze instructions reading the same IMPLICIT_DEF register, you would still see the same problem in your example.

If the program originally contained two separated freezes, then it is okay, because the program is originally written in that way. The programmer must have written down two freezes in LLVM IR bitcode, and they are lowered into MIR.
The problematic case is when a program with a single freeze is optimized to a program with several freezes. Different freezes can return different values.

%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
use(%x) // these two uses see the same value with help of freeze.
->
%undef = IMPLICIT_DEF
%x = FREEZE %undef
use(%x)
%x' = FREEZE %undef
use(%x') // Now %x and %x' can be assigned different values, so this optimization is wrong.
arsenm added inline comments.Nov 14 2019, 11:50 PM
llvm/include/llvm/Target/Target.td
1067

I think it's important to define the rules here clearly for MIR, and not rely on expectations from the original IR. Some legalization for example may end up inserting multiple freezes of the same input value, and that should work correctly. Multiple freezes of the same input register should produce the same value.

aqjune added inline comments.Nov 15 2019, 12:49 AM
llvm/include/llvm/Target/Target.td
1067

I agree, I'll write the rule for freeze.
Regarding legalization, in which case can it be copied? The case that I was aware of was splitting a register of illegal size, which copies freeze but does not increase # of uses per definition:

%x = ... // invalid type, say i50
%y = freeze %x
  =>
%x1 = ... // i32
%x2 = ... // i18
%y1 = freeze %x1
%y2 = freeze %x2
liuz added a subscriber: liuz.Nov 15 2019, 8:39 PM
lebedev.ri requested changes to this revision.Nov 22 2019, 3:19 AM

Looks like this is waiting on an update.
Really happy to see this progressing.

This revision now requires changes to proceed.Nov 22 2019, 3:19 AM
aqjune updated this revision to Diff 230899.Nov 25 2019, 7:27 AM
aqjune edited the summary of this revision. (Show Details)

I added description about FREEZE and IMPLICIT_DEF to TargetOpcodes.def.
To address the legalization issue, updated the semantics so FREEZEs return the same value if placed consecutively. Would this properly address the issue?

I'm sorry, it seems the track of this patch was lost.
We really need to get this last bit of freeze support landed.
Can anyone spot any remaining issues with the patch?

craig.topper added inline comments.Jan 8 2020, 3:36 PM
llvm/include/llvm/Support/TargetOpcodes.def
61

constraint -> constrain

nlopes added a comment.Jan 9 2020, 6:26 AM

The patch & semantics look good to me, but I'm not a backend expert. I'll leave the final LGTM to someone else.
It would be awesome if we could get this in for 10.0 so that we have complete support for freeze.

aqjune updated this revision to Diff 237090.Jan 9 2020, 8:09 AM
aqjune retitled this revision from [SelDag][MIR] Add FREEZE to [SelDag][MIR] Add FREEZE.
  • Rebase
  • Address a comment
aqjune added a comment.Jan 9 2020, 8:20 AM

To synchronize - the remaining issue was about duplicability of the freeze instruction in MIR.

In IR, a duplicated freeze may result in a different value due to the nature of undef. However, in MIR, this may be problematic in some cases.

My suggestion is to allow freeze in MIR to yield the same value when they are consecutively arranged.
This involved clarificaton of when the value of IMPLICIT_DEF could change during execution, and the updated comments of TargetOpcodes.def reflects that.

Currently, isNotDuplicable is still set to 1 to leave it as conservative & copiers deal with freeze specially in the future.

aqjune added a comment.Jan 9 2020, 4:16 PM

BTW, there was a discussion about IMPLICIT_DEF as well, about whether it should return consistent value for every use vs. it can return different values per use (as IR's undef).

Currently IMPLICIT_DEF can yieid different values, and there are a few cases where that happens (such as using IMPLICIT_DEF across function calls).

Making it return consistent value will allow FREEZE to be duplicable, but it would require fixing several transformations that deal with IMPLICIT_DEF.

Before discussion about IMPLICIT_DEF is further made and it is changed to return concrete value, I'd like to suggest leaving FREEZE as semi-duplicable (as suggested by this patch), but not fully duplicable.

efriedma added inline comments.Jan 17 2020, 4:21 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1135 ↗(On Diff #237090)

If we expand FREEZE in ExpandPostRAPseudos, how can we reach this code?

aqjune updated this revision to Diff 239013.Jan 19 2020, 4:00 PM
  • Rebase
  • Remove AsmPrinter::emitFreeze
aqjune marked 2 inline comments as done.Jan 19 2020, 4:05 PM
aqjune added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1135 ↗(On Diff #237090)

It was a legacy code to see the freeze comment from assembly before updating ExpandPostRAPseudos. I removed it.

aqjune marked an inline comment as done.Jan 19 2020, 4:40 PM

For the next step, what can I do?

If duplicability is the main concern, I can split this patch so (1) FREEZE is defined to be duplicable first, and (2) when there happens a practical case where duplication matters, the next one is created.

It will reduce diffs in TargetOpcodes.def as well because the comment update at IMPLICIT_DEF is not needed anymore.

aqjune marked 3 inline comments as done.Jan 19 2020, 4:40 PM

ping

If adding FREEZE to MachineIR causes maintenance cost, another possible solution is to lower SelDag's FREEZE(UNDEF) to IMPLICIT_DEF directly.
In this option, FREEZE only exist in SelDag, not in MIR.
Instead, now IMPLICIT_DEF in MIR should always yield the same value. This requires a few transformations regarding IMPLICIT_DEF in MIR to be fixed.
This may cause suboptimal codegen in a few cases, but I believe we have a solution for the cases, because a single UNDEF from SelDag can be always lowered into multiple IMPLICIT_DEFs (per each use) in MIR.

lebedev.ri accepted this revision.Feb 26 2020, 2:13 AM
lebedev.ri added a reviewer: arsenm.

This seems good to me, but you want to wait for someone else more knowledgeable..
ping @arsenm @qcolombet @craig.topper

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
65–68

This is unrelated, please feel free to commit this right away.

This revision is now accepted and ready to land.Feb 26 2020, 2:13 AM
aqjune updated this revision to Diff 246778.Feb 26 2020, 9:58 AM
  • Rebase
  • Make the update in getOperationCost as a separate commit (4f71252)
aqjune marked an inline comment as done.Feb 26 2020, 10:02 AM
aqjune updated this revision to Diff 251955.Mar 23 2020, 12:19 AM

Leave the SelDag patch only, MachineIR part will be made as a separate patch (as discussed at D76483)

aqjune edited the summary of this revision. (Show Details)Mar 23 2020, 12:19 AM
aqjune retitled this revision from [SelDag][MIR] Add FREEZE to [SelDag] Add FREEZE.
spatel added inline comments.Mar 23 2020, 1:29 PM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Do we need to add basic simplify / constant folding for SDAG ?
freeze ( Constant ) --> Constant

aqjune marked an inline comment as done.Mar 23 2020, 5:10 PM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Yes, I agree it will be great. Do you want to make this patch contain the change as well?

aqjune marked an inline comment as done.Mar 24 2020, 5:45 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Or I can land this first, and add the simplify / constant folding for SDAG.
I prefer incrementally making things because this patch itself is a big change.

spatel added inline comments.Mar 24 2020, 6:10 AM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

I prefer smaller patches too. Let's make that a follow-up.

Is it correct that there is very little chance that this patch will create a visible performance regression (because there should be almost no freeze instruction creation in IR yet)?

aqjune marked an inline comment as done.Mar 24 2020, 6:56 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Yes, it is.
Currently there is only one place where freeze is introduced - https://reviews.llvm.org/D76179
I checked that from assembly outputs of LLVM test-suite , only 3 / 5239 files are affected by this patch.

This revision was automatically updated to reflect the committed changes.
spatel added inline comments.Mar 24 2020, 7:37 AM
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

That sounds good then. But can we avoid those 3 regressions cases before or within this patch? Ideally, we don't want to knowingly regress anything.

aqjune marked an inline comment as done.Mar 24 2020, 8:22 AM
aqjune added inline comments.
llvm/test/CodeGen/X86/freeze-legalize.ll
9–13

Among 3 files, two were simple regressions that had a bit more verbose assembly:

...
sete  %al    
orb %bpl, %al
jne .LBB9_1  
=>
...
sete  %al    
orb %bpl, %al
testb $1, %al
jne .LBB9_1
...
testb $1, %al
je  .LBB0_2  
=>
...
movl  %eax, %ecx 
andl  $1, %ecx   
je  .LBB0_2

Case 3's assembly diff was bigger, so needs inspection. I can visit it after the simpler two cases are resolved.

@aqjune

@bkramer committed a fix for a crash in LegalizeFloatTypes where this operator appeared in SoftPromoteHalfResult

Our downstream ARM compiler is crashing in regressions in a similar manner in SoftenFloatResult in the same file. Is it not conceivable that ISD::FREEZE might make it to this function? If that's the case, we can look into it and start up a new review if necessary. The test causing this crash is select-cc.ll.

@aqjune

@bkramer committed a fix for a crash in LegalizeFloatTypes where this operator appeared in SoftPromoteHalfResult

Our downstream ARM compiler is crashing in regressions in a similar manner in SoftenFloatResult in the same file. Is it not conceivable that ISD::FREEZE might make it to this function? If that's the case, we can look into it and start up a new review if necessary. The test causing this crash is select-cc.ll.

Hi, I saw @bkramer 's fix too (commit id 0019c2f194a5e1f4cd65c5284e204328cc40ab3d ), and I think it is a valid fix. Similar thing can be applied to the SoftenFloatResult function. I'll make a patch for this.