The code checks when an integer load is followed by one or more direct move. If there is no other user for the integer load we can replace two instructions with one floating point load. Essentially what we do here is disabling aggressive exploitation of direct move instruction. Otherwise the code for replacing the sequence already exists and works properly.
Diff Detail
Event Timeline
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6302 | I think this only makes sense if the direct-move path wouldn't be able to use a D-Form load. In that case, you're simply eliminating one direct move instruction. However, in cases where the direct-move path would be able to use a D-Form load, we've just added another load (for the X-Form lxsiwax). For [a contrived] example: float test (int *arr) {
  return arr[2];
}That being said, I don't know that this would be a show-stopper for this approach as the benefits may outweigh this issue. 
 I believe that all of these can be checked without making the code overly complicated. | |
| 6309 | Just a minor nit - I think we generally prefer complete sentences for comments (capitalization, punctuation). | |
| test/CodeGen/PowerPC/direct-move-profit.ll | ||
| 5 | You should probably remove the metadata here and (some) below. The triple on the command line and here conflict and I don't believe that all the metadata below is necessary. | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6302 | We discussed this over email. This is a summary of our conclusion. The current patch for deciding whether or not we should use direct move is fine. The code generated for the example above (after this patch) is: 0: 08 00 80 38 li r4,8 4: 98 20 03 7c lxsiwax vs0,r3,r4 8: e0 04 20 f0 xscvsxdsp vs1,vs0 c: 20 00 80 4e blr Ideally it should be like what gcc does: 0: 08 00 63 38 addi r3,r3,8 4: ae 1e 20 7c lfiwax f1,0,r3 8: 9c 0e 20 ec fcfids f1,f1 c: 20 00 80 4e blr The second code pattern has lower register pressure and that is why it is better. The code that generates first code pattern above was previously used only for Power7. After this patch this code is now used for Power8. The example exposes an opportunity within the new code path. That will be addressed separately from this patch. | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6302 | OK, I'm confused by these comments. With this patch, what code sequence is going to be generated? | |
| 6389–6390 | I have a very slight preference for checking whether DirectMove is available before checking whether it is profitable. | |
| test/CodeGen/PowerPC/direct-move-profit.ll | ||
| 5 | I agree. Please remove this, and the target triple line immediately below, and specify the correct triple on the RUN step. | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6302 | The first sequence is the one generated after this patch. The second sequence is the one currently gcc generates. The one that is generated before this patch is this: 0: 08 00 63 80 lwz r3,8(r3) 4: a6 01 03 7c mtfprwa f0,r3 8: e0 04 20 f0 xscvsxdsp vs1,vs0 c: 20 00 80 4e blr | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6302 | I discussed this with Kit offline. I will address the requested change in the test case (making sure meta data is correct) and two nits suggested below. After final approval I will commit this change. The bug exposed by this patch (which was discussed in the comments above) will be addressed separately under the following bugzilla item. | |
Removed data layout and redundant triple definition from the test case.
Capitalized comment in the code.
Minor change in the if statement which has the new condition. (Changed the order of two conditions)
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6309 | Please add a space between the "//" and the comments. | |
| 6389 | The calls to the Subtarget.* are likely very cheap, while the call to directMoveIsProfitable is less so. I'd suggest checking the directMoveIsProfitable() last. | |
| test/CodeGen/PowerPC/direct-move-profit.ll | ||
| 6 | Please add a space after the semicolon. | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6389 | Actually, calls to the Subtarget.* are going to be always successful unless someone is compiling for Power 7 or below (which is not very likely) or is doing some work with the testcases. So for almost all practical purposes, we will always call directMoeIsProfitable and make the decision based on that. | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6389–6391 | Fair enough. :) | |
| lib/Target/PowerPC/PPCISelLowering.cpp | ||
|---|---|---|
| 6389–6391 | :) | |
I think this only makes sense if the direct-move path wouldn't be able to use a D-Form load. In that case, you're simply eliminating one direct move instruction. However, in cases where the direct-move path would be able to use a D-Form load, we've just added another load (for the X-Form lxsiwax). For [a contrived] example:
float test (int *arr) { return arr[2]; }That being said, I don't know that this would be a show-stopper for this approach as the benefits may outweigh this issue.
Also, I think you can probably predict whether the ISD::LOAD will be lowered to a D-Form load based on the inputs.
I think the complete condition when we don't want the direct move would be:
I believe that all of these can be checked without making the code overly complicated.