This is an archive of the discontinued LLVM Phabricator instance.

[PPC] Expand ISEL instruction into if-then-else sequence
ClosedPublic

Authored by jtony on Aug 17 2016, 3:17 PM.

Details

Summary

This patch will expand the isel instruction into an if-then-else sequence.
The expansion is done late, after instruction scheduling, as the isel instruction does terminate basic blocks, allowing for larger scopes for scheduling.

There are still some performance problems that I'm investigating. Specifically, there is a significant slowdown in SPEC2k6 hmmer (roughly 40%).
So, there will likely be another version of this patch necessary before committing. However, any feedback on the current patch is appreciated.

Diff Detail

Event Timeline

kbarton updated this revision to Diff 68433.Aug 17 2016, 3:17 PM
kbarton retitled this revision from to [PPC] Expand ISEL instruction into if-then-else sequence.
kbarton updated this object.
kbarton added reviewers: hfinkel, nemanjai, amehsan, Carrot.
kbarton added subscribers: llvm-commits, echristo.
nemanjai added inline comments.Aug 24 2016, 5:08 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
35

I think it's clearer if you explicitly initialize this to false (presumably that's the default).

130

The doxygen comment for this function states that we want to expand the isel instructions on Power7 and Power8 processors but this actually relies on different conditions. I am not entirely clear on whether the SVR4 ABI only applies on those processors, but the conditions seem to be mismatched.

144

Is there any concern that there may be some [pathological] cases where this would end up copying an unreasonably large vector (i.e. if the BB contains many isel instructions and has heap-allocated storage)?

149

Should this whole function be in an
#ifndef _NDEBUG
block? And of course the call.

165

Minor nit: The block numbers are dumped as "BB#<num>" above, but here they're dumped as "block <num>".

183

Not that this is likely to ever change, but perhaps for the sake of consistency this assert should be:

assert(isISEL(MI) && "Expecting an ISEL instruction");
188

Again, when the only statement[s] in the loop will be compiled out without asserts, I think the whole loop should be in a DEBUG() block. Mind you, I don't fully follow why we're dumping the operands separately after we've already dumped the MachineInstr - especially since they'll be dumped again below (Dest, TrueValue, False, ConditionRegister).

213

Does this cast work (i.e. converting an iterator to a pointer)?

222

I don't know the details of this so this may just be a ridiculous suggestion, but will this not assert if the isel is the very last instruction before returning from the function? Perhaps that would be impossible because the branch-to-link-register would be last. Except perhaps if this is a "noreturn" function (in which case, I'm not sure why there'd be an isel at the end).

223

Minor nit: I think that "succ_it" (although a cool variable name :)) breaks with variable naming conventions (camel vs. underscores).

267

Probably forgot to delete this line. Same applies to similar lines below.

268

I think for both cases where you add this instruction, a check should be performed whether the two registers are the same. If they are, you don't need the instruction. Also, just out of curiosity, is there any significance to the choice to use ADDI vs. OR (which I think is what we normally use for reg-to-reg copies)?

272

This is probably not needed since you're not saving a pointer to the instruction. Same applies below.

285

I wonder if there's anything available at this stage in the compilation that can be used for some heuristics to choose which block we should branch to and which should be a fall-through block?

jtony commandeered this revision.Nov 14 2016, 3:02 PM
jtony added a reviewer: kbarton.
jtony updated this revision to Diff 77896.Nov 14 2016, 3:34 PM
jtony added reviewers: sfertile, syzaara, lei.
jtony marked 12 inline comments as done.

Generally, the ISEL is expanded into if-then-else sequence, in some cases (like when the destination register is the same with the true or false value register), it may just be expanded into just the if or else sequence. This patches does these further optimizations compare to the previous patch by Kit Barton.

nemanjai added inline comments.Nov 14 2016, 5:48 PM
lib/Target/PowerPC/PPCExpandISEL.cpp
86

This is just not true. Please see the comment below at the definition of the function.

130

At this point, I would question the applicability of this function. We now have an option that is meant to tell the compiler to generate or not generate the instruction, but has unexpected behaviour on some platforms.
Namely, when the option is specified to be true, the compiler will generate the isel on every platform. On the other hand, when the option is specified to be false, the compiler will not generate the isel on 64-bit non-Darwin targets. This of course, may be the desired behaviour (i.e. always generate isel on 32-bit or Darwin targets), but it should be documented as such.

167

I am not in favour of naming this function the same as the other overload. The two overloads do different things - this one expands ALL isel's, the other overload expands only ONE isel. I know this is somewhat common, but I think giving this one a name that suggests this plurality aids readability of the code.

177

I assume you have tested this, but I am just curious if dumping this iterator without dereferencing it prints the actual MI.

196

These names don't conform to the naming convention (variables start with a capital letter).

200

This is a very long function. I wonder if it would be a little easier to follow the code if it was split into (for example):

PPCExpandISEL::expandISELBothBlocks()
PPCExpandISEL::expandISELNoTrueBlock()
PPCExpandISEL::expandISELNoFalseBlock()

Or if that would lead to code duplication, perhaps just move all the code that sets up the basic block layout as needed in the given situation into
PPCExpandISEL::reorganizeBlockLayout()

216

The indentation seems off here.

245

I think this should be moved up to where you select how to set Successor. Basically within that conditional statement, handle reordering the layout of the basic blocks and then move on to creating the true/false blocks as necessary.

273

I don't understand the need for this strange code layout. Namely, you have:

if (isTrueBlockRequired)
  // do true block stuff
if (isFalseBlockRequired)
  // do false block stuff
if (isTrueBlockRequired)
  // do more true block stuff
if (isFalseBlockRequired)
  // do more false block stuff
...

Why can't the code in these conditional blocks be collapsed together?

nemanjai added inline comments.Nov 15 2016, 5:55 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
34

Should probably be changed to something along the lines of
Disable generating the ISEL instruction on 64-bit non-Darwin PPC targets.

185

I'm not sure it makes sense for this function to have a bool return value since it can never return false. This should likely be a void function.

303

One of these may be redundant. I think if the code is reorganized to keep things together, it will be clearer what needs to be done. What I mean is that if we group together things we do if we need a true block and group together things we do if we need a false block, the logic will be easier to follow.

jtony updated this revision to Diff 78664.Nov 20 2016, 1:18 PM
jtony marked 8 inline comments as done.

Special case 1: If all the registers of ISEL instruction are the same, we simply remove the ISEL instruction.
Special case 2: If the input registers of the ISEL instruction are the same but different with the output register, we just replace ISEL with one unconditional copy.
Special case 3: If the true value register of the ISEL instruction is the same with the dest register, we skip the true block.
Special case 4: If the false value register of the ISEL instruction is the same with the dest register, we skip the false block.
Most general case 5: Otherwise, we expand ISEL to if-then-else.

jtony updated this revision to Diff 78749.Nov 21 2016, 11:52 AM
jtony marked 6 inline comments as done.
kbarton requested changes to this revision.Nov 21 2016, 1:32 PM
kbarton edited edge metadata.

There are some things that need to get cleaned up.

lib/Target/PowerPC/PPCExpandISEL.cpp
35

Good idea. I will change this.

130

These checks are really messy right now, in my opinion.
This will enable it on all non-Darwin systems, which should include PWR7 and PWR8, regardless of endianness. I'm happy to change this if there is something more specific to use.

130

Yes, I agree with this given the new direction of the patch (continue to generate ISEL, provide a switch to disable it).
If we want to disable ISEL generation by default, we can assess whether we should add similar logic to focus it to a subset of architectures.

144

You could certainly construct such a case, but I doubt it would happen in practice.

183

Good call, done.

200

Could you please add a statistic to track the number of ISEL instructions that are removed.

223

Fixed

226

This should be done earlier, especially before (possibly) creating the NewSuccessor basic block above.
I would move this as early as possible, once all the information needed is available.

268

OR treats R0 as special (the literal value 0), while ADDI respects the contents of R0. So, if the register allocator has decided to use R0 for one of the inputs to ISEL, and you simply do an OR, you can end up with incorrect results.

272

Good point, fixed.

285

Yes, the expansion can certainly be changed. We can discuss various heuristics, if we decide we want to actually expand the ISELs. For now I'd prefer to leave this as the default until that decision has been made.

303

I agree - I think we should be able to consolidate some of this code into just two tests.

test/CodeGen/PowerPC/expand-isel.ll
2

You need to include PowerPC triple

This revision now requires changes to proceed.Nov 21 2016, 1:32 PM
nemanjai added inline comments.Nov 22 2016, 7:49 PM
test/CodeGen/PowerPC/expand-isel.ll
2

Yes, please do not forget this because I'm pretty sure it would break every multi-target build bot other than the PPC ones.

nemanjai added inline comments.Nov 24 2016, 8:49 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
226

Yes, I'd handle the cases where you're getting rid of the ISEL without introducing new control flow first - i.e. before you start creating new control flow.

jtony marked 9 inline comments as done.Nov 30 2016, 3:06 PM
jtony added inline comments.
lib/Target/PowerPC/PPCExpandISEL.cpp
34

I reconsider this comment and realize that this ppc-gen-isel option is set to true, maybe it is better (I mean consistent with the default behavior) we use the "Enable generating the ISEL instruction"?

35

Shall we make this option visible to all users?

268

Nemanjaj already noticed the bug in the patch :-) (always use ADDI for both True and False branch, we should just use ADDI for True branch and ORI for false branch), if we have considered this comments more carefully, could avoid this bug.

303

Last time when I move around these code, it caused assertion, so I just combined a few blocks, leave some other blocks untouched. If people strongly disagree, I can try again to move these blocks around.

hfinkel edited edge metadata.Nov 30 2016, 3:33 PM

As a general note: I'd like to see this controlled with our existing isel toggle. In other words, take all of the existing places that call hasISEL() and hard-code them to true and then have this pass expand the isel instructions when hasISEL() is false.

There are a lot of loops here that could be range-based for loops. Please make them range-based for loops.

I've been thinking about how difficult it might be to handle the case where multiple isels use the same condition. I don't think it would be too difficult to support this: As you iterate through the isels in each block, keep track of the true block, false block and condition for the last expanded isel. If this isel has the same condition as the last one, then scan between them to make sure that nothing defines this isel's input registers in between the two isels and that this isel does not depend on the last one. If that's true, then insert the generated instructions into the blocks from the last isel instead of making new ones.

As a quick example, I'm thinking about something like this:

int foo(int a, int b, int d, int f, int g) {
  int q = g > 0 ? a : b;
  int r = g > 0 ? d : f;
  return q+r;
}

which generates code like this:

cmpwi 0, 7, 0
isel 3, 3, 4, 1
isel 12, 5, 6, 1
add 3, 3, 12
extsw 3, 3
blr

lib/Target/PowerPC/PPCExpandISEL.cpp
12

Please note here that this pass is intended to be run post-RA only.

jtony updated this revision to Diff 79855.Nov 30 2016, 7:30 PM
jtony edited edge metadata.
jtony marked an inline comment as done.

Fix the bootstrap ISEl bug (should use ORI instruction rather than ADDI for the false branch) .
Update all the existing test cases that used to generate ISEL to test -ppc-gen-isel=false situation.
Address the comments in phabricator.
Update the new expand-isel.ll file according to the bug fix.

I've been thinking about how difficult it might be to handle the case where multiple isels use the same condition. I don't think it would be too difficult to support this: As you iterate through the isels in each block, keep track of the true block, false block and condition for the last expanded isel. If this isel has the same condition as the last one, then scan between them to make sure that nothing defines this isel's input registers in between the two isels and that this isel does not depend on the last one. If that's true, then insert the generated instructions into the blocks from the last isel instead of making new ones.

I agree this can be done, and set up the original code to make this fairly straightforward. I had hoped to make a decision about whether we want ISEL generated or not before making further improvements to this patch, but it may be better to handle this case first, as it can effect performance and thus influence the final decision regarding what the defaults for ISEL generation will be.

lib/Target/PowerPC/PPCExpandISEL.cpp
89

We can remove all uses of \brief since doxygen autobrief is now enabled by default.

kbarton requested changes to this revision.Dec 1 2016, 8:16 AM
kbarton edited edge metadata.

Need to address Hal's comments and post a subsequent patch. Thanks.

This revision now requires changes to proceed.Dec 1 2016, 8:16 AM
jtony updated this revision to Diff 81095.Dec 12 2016, 9:18 AM
jtony edited edge metadata.
jtony marked an inline comment as done.

Support merging contiguous ISELs.

jtony added inline comments.Dec 12 2016, 9:21 AM
test/CodeGen/PowerPC/expand-contiguous-isel.ll
2

This test case is reduced from a bootstrap build bug, still very complex, I am not sure whether we want to deliver it or not.

jtony added a comment.EditedDec 12 2016, 9:42 AM

The total SPEC statistics:

Number Expanded 23601

For these ISEL with same CR (just compare operand(3), it is still possible CR can be modified in between the two ISELs, and the registers used or defined by the later ISEL could also be changed) :

Number Contiguous 825
Number Non-Contiguous 2618

Also, this patch doesn't address the option comment by Hal.

jtony updated this revision to Diff 81302.Dec 13 2016, 2:22 PM
jtony edited edge metadata.

Address the option comment from Hal. Now all the patch is complete, it address all the three comments from Hal.

Thanks for all of the updates! Some more things...

In general, our policy is that comments are complete sentences, starting with a capital letter and ending with a period (or other punctuation). Please audit all comments and adjust accordingly.

lib/Target/PowerPC/PPCExpandISEL.cpp
310

I think that, in rare circumstances, this assert would fire. It would happen if you had a block that was the last block in a function which ended with an "unreachable" and the instruction before that were an isel. I think you need to handle this case. I think that you should be able to create a .mir test case for this.

399

Don't compare with literial zero here either.

hfinkel added inline comments.Dec 13 2016, 6:37 PM
lib/Target/PowerPC/PPCExpandISEL.cpp
256

if -> If

265

You need to check against PPC::X0 and PPC::R0, not the literal 0. Please create some .mir test cases for these.

275

Do we need to reset IsTrueBlockRequired and IsFalseBlockRequired here?

281

Same here (don't compare to the literal 0).

lib/Target/PowerPC/PPCTargetMachine.cpp
420

Do you require the verifyAfter = false here? (I'm not sure why we have that for the others either). Can you test without it?

kbarton added inline comments.Dec 14 2016, 8:33 PM
lib/Target/PowerPC/PPCExpandISEL.cpp
56

You've got two different references to MachineFunction: MF here and Fn on line 60

63

Why do we need an LLVM_BB? I would have expected we're only working with MachineBasicBlocks at this point.

70

In general I don't like this approach of caching all of these intermediate values here. Given the algorithm iterates over lists of ISELs and does expansions somewhat different each time, you need to be very careful that all of these are set/reset between each iteration to ensure that the values are not stale. My previous experiences have taught me that this can be both confusing and error prone.

189

I don't see why these parameters need to references to MachineInstr *, why can't they just be references to MachineInstr?
I think the preference is to only use pointers if the value can be null for some reason. If it cannot be null, then it's better to use references.

282

insturction -> instruction

jtony marked 9 inline comments as done.Dec 15 2016, 6:58 AM
jtony added inline comments.
lib/Target/PowerPC/PPCExpandISEL.cpp
56

Good call, one of them is redundant, I removed all usage of Fn and use MF instead.

70

That's a good point, we should try to restrict the life of a variable as short as possible. Successor can be made a function scope variable while NewSuccessor can be passed around between functions.

275

We don't need to reset them, the removed ISEL isn't considered for calculating the value of IsTrueBlockRequired and IsFalseBlockRequired.

nemanjai added inline comments.Dec 21 2016, 7:23 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
12

This wasn't addressed. I think it's important to note that since we need to have physical registers here.

37

This comment is no longer accurate.

109

Why does this need to be public?

121

Fragment. Please revise this comment to complete sentences and to account for the new semantics (i.e. interaction with -mno-isel/-mattr=-isel).

133

I would move this down. We don't want to dump out an entire MachineFunction if we're not planning to do anything with it.

160

I don't really understand the purpose for this function any longer. Why do we want two different options to disable generating ISEL's?
With this we have -mno-isel/-mattr=-isel as well as -ppc-generate-isel=false.

248

I don't think the name of the variable is useful to the consumer of the debug output. Probably something like:
DEBUG(dbgs() << "ISEL: " << **MI << "\n");

274

I suppose you've considered whether the erase here invalidates the iterator...
Perhaps setting MI to the result of the erase keeps the iterator valid. If so, maybe a quick comment to that end.

281

Just out of curiosity, is the comparison even needed? Wouldn't an ISEL that looks like
isel 5, 0, 0, 1
actually be a MachineInstr like
ISEL %X5, %ZERO8, %X0, %CR0LT
or something along those lines?

345

I am wondering if this is safe to do.
If there's a KILL for one of the live-ins in this MBB prior to the ISEL being expanded (or the ISEL itself is a KILL), would we violate some kind of invariant? Basically adding a live-in to a block when it isn't live out of the now truncated MBB.

jtony marked 28 inline comments as done.Dec 23 2016, 11:10 AM
jtony added inline comments.
lib/Target/PowerPC/PPCExpandISEL.cpp
35

Discussed in the scrum, we want it to be hidden because the user can use the FE option -misel and -mno-isel to control the generation of ISEL instructions.

109

Good call, it doesn't need to be public, have made it class private.

160

Good question, we need the -ppc-generate-isel only for our convenient, without it we couldn't control the IR and MIR test cases to expand ISEL in the BE, the FE options misel and mno-isel is for the users, the ppc-gen-isel option is set to hidden are only used by us.

274

Exactly, a lot of people made errors about this, I fixed one such bug before in the FE. Good call, I will add a comment to that end.

281

The (TrueValue.getReg() != PPC::R0) parted is redundant, removed.

310

Good call! Fixed and tested with a MIR file: expand-isel-6.mir

345

You are right we don't need to copy all the LiveIn registers, but it's hard to tell which ones we should copy and which shouldn't, we simply called everything to the NewSuccessor. I think it is just some performance issue, won't cause any problem as far as I know. I also tested adding an redudant register to an MIR file, it doesn't cause issue.

jtony marked 16 inline comments as done.Dec 23 2016, 11:22 AM
jtony updated this revision to Diff 82636.Dec 28 2016, 5:24 PM

Add 6 MIR test cases for special conditions ( mainly about R0)
(2) Fix the bug that triggers the can-fall-through assertion (when the ISEL is the last non-debug instruction of the last BB). And create 1 MIR test case for this assertion.
(3) Address some other minor comments posted on phabricator.

nemanjai edited edge metadata.Dec 29 2016, 2:51 PM

None of the comments indicate a serious issue that requires a major revision. However, there are a number of comments to be addressed and I think it'd be better if you published another revision that addresses them.

However, if other reviewers feel this is ready to commit, I certainly won't object.

lib/Target/PowerPC/PPCExpandISEL.cpp
62

Minor nit: I'd just state in the comment what this actually is - "A map of MBB numbers to their lists of contained ISEL instructions."

71

I don't think the name clearly reflects what the function does. Perhaps expandAndMergeISELs.

86

"Are the two operands..."

87

Please make these references const, we don't want to modify them.

166

I wonder if we should maybe have an assert here that the insert actually inserts something (i.e. we don't somehow end up with duplicate MBB numbers). Of course, this is not likely to happen, but if it did, we'd have unpredictable behaviour.

182

I would simplify this function with early exits rather than initializing boolean variables you'll only query once:

if (!(isSameRegister(...))
  return false;
// more init - come to think of it, you may want to just get rid of them and cast to iterator in the if below

if (std::prev(MBBI) != PrevPushedMBBI) {
  NumNonContiguous++;
  return false;
}

NumContiguous++;
return true;
227

No need to evaluate BIL.end() on every iteration. Or is it needed because of the potential call to erase()?

294

Perhaps a quick comment above this line. Sure you know why we're creating the basic block now, but you (or anyone else) will have to think about this condition a fair bit to understand why.
Maybe something like:

// Another basic block is needed to move the instructions that follow this ISEL.
// If the ISEL is the last instruction in a block that can't fall through,
// we need another block to branch to.
305

Why is this not a range-based for loop? The MBB should have a way to get an iterator_range for its successors. Perhaps something like MBB->successors()

338

Again, there must be a way to make this a range-based for loop. Perhaps MBB->liveins()?

349

It seems that you only need to add Successor as a successor to MBB if either the TrueBlock or the FalseBlock are not required. If so, these don't need to be executed unconditionally here - they can be sunk into the respective if statement.

387

Perhaps I've lost track here since the patch is fairly large, but don't you already have this available in IsTrueBlockRequired? Similarly with IsFalseBlockRequired below.

405

Why? I don't see the need for a register you're about to define to be added as live-in.

412

This seems a bit suspicious. If the ISEL you're actually expanding was a KILL for the True/False Register[s], it seems wrong to add it as live-in to the successor.

I suppose it doesn't necessarily hurt since you're not actually setting kill flags on the instructions you're adding. The only way I could see this causing a problem is if something that runs after this is looking for free registers. Of course, this isn't impossible. I think it's quite possible that prologue/epilogue insertion happens after this and it may need to find an unused register. Since you've [possibly] eliminated the kill flags, the register scavenger will not find these and we may emit sub-optimal code if we're shrink-wrapping.

lib/Target/PowerPC/PPCSubtarget.h
301–303

I didn't check, but presumably the base class returns false for this? If not, just get rid of the override. If so, put a comment as to why we always want to run EarlyIfConversion.

test/CodeGen/PowerPC/expand-contiguous-isel.ll
124

I don't think this particular register allocation is guaranteed. Perhaps change all of these to something like:

; CHECK-GEN-ISEL-TRUE: isel [[SAME:r[0-9]+]], [[SAME]], [[SAME]]

test/CodeGen/PowerPC/expand-isel-2.mir
36

All the %bb without a number is confusing. Can you change these to something like
; CHECK: BC %cr0gt, [[TRUE:%bb[0-9]+]]
and so on?

jtony marked 13 inline comments as done.Jan 1 2017, 5:29 PM
jtony added inline comments.
lib/Target/PowerPC/PPCExpandISEL.cpp
166

Sorry , I am not clear what kind of assertion we should insert here, can you be more specific?

227

I think the BIL.end() is needed here, one reason is we may call erase, also even we don't call erase(), we need it to tell us when to finish the loop.

349

I can merge these two ternary expression with the following two if guards by adding two more else, but I prefer to leave it as it is unless for some other solid reason.

387

I agree this patch is really large and kind of hard to follow the whole logic. IsTrueBlockRequired is used to guard whether generate the True Block, while IsADDIInstRequired only guards whether we should generate one ADDI instruction. IsTrueBlockRequired = IsADDIInstRequired(1) OR IsADDIInstRequired(2) ... OR IsADDIInstRequired(N). Similarly for IsFalseBlockRequired and IsORIInstRequired.

jtony marked 3 inline comments as done.Jan 1 2017, 5:29 PM
jtony marked 3 inline comments as done.
jtony marked 3 inline comments as done.
jtony marked 5 inline comments as done.
jtony updated this revision to Diff 82822.Jan 2 2017, 4:50 PM
jtony edited edge metadata.

Address more comments from Nemanja.

jtony updated this revision to Diff 82886.Jan 3 2017, 7:32 AM
jtony marked an inline comment as done.Jan 3 2017, 7:35 AM
jtony marked 2 inline comments as done.Jan 3 2017, 1:24 PM
jtony updated this revision to Diff 83049.Jan 4 2017, 6:42 AM

There are still a few things that need to get cleaned up.

lib/Target/PowerPC/PPCExpandISEL.cpp
84

Technically, I think this should be called useSameRegister

113

insturction -> instruction

255

I don't understand why this is only valid when there is one ISEL in the list.

255

You also need to check for register 0.
This isn't valid if TrueValue or FalseValue is r0, since it has different semantics on the True/False paths.

lib/Target/PowerPC/PPCTargetMachine.cpp
420

Did this get done?
I agree it would be better to not use verifyAfter=false here unless it is needed for some reason.

nemanjai added inline comments.Jan 4 2017, 2:59 PM
lib/Target/PowerPC/PPCExpandISEL.cpp
68

I meant to rename both expandCanMergeISELs with my previous comment. The name expandCanMergeISELs sounds to me like something called expand has the ability to merge ISELs - it's quite a confusing name.
The function takes a list of ISEL instructions, expands them into control flow AND it merges them (since they're already demonstrably mergeable).

74

This is just a minor nit that you can choose to address or ignore, but this does not need any non-static members so I would make these queries static.

255

I think Kit raises valid points here and I think we should add comments to address them as any reader would reasonably have the same questions when looking at the code.

// Note 1: We favour merging ISEL expansions over folding a single one. If the
//         passed list has multiple merge-able ISEL's, we won't fold any.
// Note 2: There is no need to test for PPC::R0/PPC::X0 because the first input
//         for ISEL uses register class 'gprc_nor0' which explicitly excludes
//         PPC::R0 (and similarly for the 64-bit variants). So:
//         TrueVal.getReg() == FalseVal.getReg() => FalseVal.getReg() != PPC::R0
lib/Target/PowerPC/PPCTargetMachine.cpp
420

I don't know the characteristics of the verification, but I assume that it takes some non-zero compile time and that it will run verification in all build modes.
If this is the case, I am personally in favour of adding a call to MachineFunction::verify() at the end of RunOnMachineFunction() wrapped in an #ifndef NDEBUG so that we get the verification with asserts and in debug mode, but save the associated compile-time in release mode.

jtony updated this revision to Diff 83562.Jan 8 2017, 8:24 AM
jtony marked 8 inline comments as done.

Address the comments from Kit and Nemanja.
Main changes are :
(1)setting verifyAfter=true when adding pass.
(2) fix some undefined physical register errors during verification, which is caused by when splitting one MBB into two, Regs defined in the first half is no longer visible to the second half. Need to add Regs defined in the first half as LiveIns of the second half. See lines 341-352 of file: lib/Target/PowerPC/PPCExpandISEL.cpp for details.

jtony added inline comments.Jan 8 2017, 8:29 AM
lib/Target/PowerPC/PPCTargetMachine.cpp
420

verifyAfter=false is removed now, by default we will use verifyAfter=true, so the verification is done always.

jtony marked an inline comment as done.Jan 8 2017, 8:30 AM
jtony updated this revision to Diff 83831.Jan 10 2017, 11:14 AM

Set the LiveIns of NewSuccessor Block properly.

Main changes are around lines: 338-349 of file lib/Target/PowerPC/PPCExpandISEL.cpp.

Similar logic can be found in function HexagonFrameLowering::expandStoreVec2() of file lib/Target/Hexagon/HexagonFrameLowering.cpp

kbarton added inline comments.Jan 11 2017, 11:26 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
261

There is currently a bug filed against the PPC backend for this (https://llvm.org/bugs/show_bug.cgi?id=31065). Once this bug is fixed, the register class will need to include r0.
Please include a check for R0 here as well, to ensure correctness.

348

According to the documentation for stepForward, it is not the preferred method for calculating liveness. Instead, stepBackward should be used instead.
Did you try using stepBackward? If not, please change this to use stepBackward.

You'll need to rebase this to the current TopOfTrunk and make changes due to https://reviews.llvm.org/D28556.

nemanjai added inline comments.Jan 13 2017, 4:34 AM
lib/Target/PowerPC/PPCExpandISEL.cpp
261

The register class can't include PPC::R0/PPC::X0. It must include PPC::ZERO/PPC::ZERO8 instead. The bug mentioned isn't valid (I've added a clarifying comment to it). We certainly do allocate PPC::ZERO/PPC::ZERO8 for the ISEL instruction when the true operand comes from LI 0 (i.e. it's an immediate zero). This is done in PPCInstrInfo::FoldImmediate().

kbarton accepted this revision.Jan 13 2017, 9:51 AM
kbarton edited edge metadata.

Aside from updating two comments, this LGTM.

lib/Target/PowerPC/PPCExpandISEL.cpp
261

OK, I understand.
In that case, please change the comment to:

Note 2: There is no need to test for PPC::R0/PPC::X0 because PPC::ZERO/PPC::ZERO8 will be used for the first operand if the value is meant to be zero. In this case, the useSameRegister method will return false, thereby preventing this ISEL from being folded.

348

Based on an off-line conversanion, stepBackwark doesn't work.
Please add a comment here saying why stepForward is being used.

This revision is now accepted and ready to land.Jan 13 2017, 9:51 AM
jtony closed this revision.Jan 18 2017, 9:39 AM