This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N)
ClosedPublic

Authored by vsk on Apr 23 2018, 3:20 PM.

Details

Summary

Setting the right SDLoc on a newly-created zextload fixes a line table
bug which resulted in non-linear stepping behavior.

Several backend tests contained CHECK lines which relied on the IROrder
inherited from the wrong SDLoc. This patch breaks that dependence where
feasbile and regenerates test cases where not.

In some cases, changing a node's IROrder may alter register allocation
and spill behavior. This can affect performance. I have chosen not to
prevent this by applying a "known good" IROrder to SDLocs, as this may
hide a more general bug in the scheduler, or cause regressions on other
test inputs.

rdar://33755881, llvm.org/PR37262

Diff Detail

Repository
rL LLVM

Event Timeline

vsk created this revision.Apr 23 2018, 3:20 PM
aprantl requested changes to this revision.Apr 23 2018, 3:47 PM

Wow that's a surprising amount of churn! The change itself looks good, I would just prefer a smaller testcase if that is possible.

test/CodeGen/AArch64/arm64-aapcs.ll
31 ↗(On Diff #143664)

So the order of these instructions is irrelevant?

test/CodeGen/X86/fold-zext-trunc-dbginfo.ll
1 ↗(On Diff #143664)

I'm not convinced if this huge test adds anything on top of all the other tests. Could you either remove it or replace it with a function that really only consists of the affected pattern?

This revision now requires changes to proceed.Apr 23 2018, 3:47 PM
aprantl added inline comments.Apr 23 2018, 3:49 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8032 ↗(On Diff #143664)

Out of curiosity, why doesn't this also get the SDLoc(N0)?

vsk updated this revision to Diff 143670.Apr 23 2018, 4:13 PM
vsk marked an inline comment as done.
  • Extend test/CodeGen/X86/fold-zext-trunc.ll to test the debug location change.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8032 ↗(On Diff #143664)

I'm not sure what SetCCUses are. It's possible that these need a different location. I can look into it as a follow-up.

test/CodeGen/AArch64/arm64-aapcs.ll
31 ↗(On Diff #143664)

It seems so, although it'd be nice if @ab could chime in about this.

test/CodeGen/X86/fold-zext-trunc-dbginfo.ll
1 ↗(On Diff #143664)

There's an existing test I can repurpose.

niravd added inline comments.Apr 23 2018, 8:26 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7483 ↗(On Diff #143670)

As per previous discussion, this should probably be SDLoad(ExtLoad).

7560 ↗(On Diff #143670)

Ditto.

7786 ↗(On Diff #143670)

We should be consistant and make sure we choose the load's location for every DAG.getExtLoad.

7817 ↗(On Diff #143670)

Ditto

8192 ↗(On Diff #143670)

Ditto

8342 ↗(On Diff #143670)

Ditto

8371 ↗(On Diff #143670)

Ditto

8733 ↗(On Diff #143670)

Ditto

8749 ↗(On Diff #143670)

Ditto

11190 ↗(On Diff #143670)

Ditto

test/CodeGen/AArch64/arm64-aapcs.ll
31 ↗(On Diff #143664)

They're definitely unordered.

test/CodeGen/X86/avg.ll
296 ↗(On Diff #143670)

Much of this test looks like if you ran the update llc test script before this patch they would be generalized and be equivalent. Since you're making revisions, could you land that before updating this patch?

test/CodeGen/X86/win-smallparams.ll
60 ↗(On Diff #143670)

Were these pushes here before this patch?

vsk planned changes to this revision.Apr 24 2018, 11:40 AM

Thanks @niravd and Adrian for your feedback. I'll take another shot at this with an eye towards fixing similar location bugs.

test/CodeGen/X86/avg.ll
296 ↗(On Diff #143670)

I'll do that and rebase.

test/CodeGen/X86/win-smallparams.ll
60 ↗(On Diff #143670)

No, the pushes were not here previously. I was surprised by this codegen changed and meant to follow-up with @rnk about it. It wasn't clear to me whether this indicates some bug related to IROrder. Should I file a PR?

Are those extra pushes spills? I could imagine that changing the order of instruction affects register allocation and sufficiently increases the register pressure on i386 (which doesn't have many general purpose registers) to push it over the edge.

vsk updated this revision to Diff 144242.Apr 26 2018, 4:59 PM
vsk marked 4 inline comments as done.
vsk retitled this revision from [DAGCombiner] Set the right SDLoc on a newly-created zextload to [DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N).
vsk edited the summary of this revision. (Show Details)

@niravd I'm still working through your comments. Because each change introduces some test case churn, I've split things up for easier review. I'll link the new reviews to this llvm.org/PR37262.

Are those extra pushes spills? I could imagine that changing the order of instruction affects register allocation and sufficiently increases the register pressure on i386 (which doesn't have many general purpose registers) to push it over the edge.

I think so. The affected registers are immediately used and subsequently popped.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8032 ↗(On Diff #143664)

Looking at this some more, it seems like applying SDLoc(ExtLoad) to the new SetCC nodes would make more sense.

vsk updated this revision to Diff 144426.Apr 27 2018, 6:13 PM
vsk marked 4 inline comments as done.
vsk edited the summary of this revision. (Show Details)
vsk added a reviewer: fedor.sergeev.
  • Update a SPARC test missed during earlier testing.
vsk added inline comments.Apr 27 2018, 6:13 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8032 ↗(On Diff #143664)

I've proposed a patch for this here: https://reviews.llvm.org/D46216

7483 ↗(On Diff #143670)
7560 ↗(On Diff #143670)

For posterity, the patch for this is: https://reviews.llvm.org/D46156

7786 ↗(On Diff #143670)

I'm planning on addressing this here: https://reviews.llvm.org/D46158

So if I understand correctly the new instruction ordering in the assembler output is actually closer to the instruction ordering in the IR, since we are now setting the order metadata correctly in DAGCombine?

vsk marked an inline comment as done.May 1 2018, 11:25 AM

Some of the follow-up patches for issues identified in this review depend on the test case changes here.

Are there any outstanding issues left to address? I understand that the test case churn might be a blocking concern: if so, should I file a follow-up bug report against the scheduler, or should we introduce a workaround by preserving the old IROrder on new nodes?

So if I understand correctly the new instruction ordering in the assembler output is actually closer to the instruction ordering in the IR, since we are now setting the order metadata correctly in DAGCombine?

Yes, that's my read of what's happening. For example, in test/CodeGen/ARM/vector-load.ll, I think the add.w instruction corresponds to a gep, which should logically occur right before the store.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7817 ↗(On Diff #143670)
aprantl accepted this revision.May 1 2018, 11:32 AM
This revision is now accepted and ready to land.May 1 2018, 11:32 AM
This revision was automatically updated to reflect the committed changes.