This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Clear the sideeffect bit for those instructions that didn't have the match pattern
ClosedPublic

Authored by steven.zhang on Oct 20 2019, 7:57 PM.

Details

Summary

If the instruction have match pattern, llvm-tblgen will infer the sideeffect bit from the match pattern and it works well. If not, the tblgen will set it as true that hurt the scheduling.

// The mayLoad and mayStore flags default to false.
// Conservatively assume hasSideEffects if it wasn't explicit.
if (InstInfo->hasSideEffects_Unset)
  InstInfo->hasSideEffects = true;

PowerPC has some instructions that didn't specify the match pattern(i.e. LXSD etc), which is manually selected post-ra according to the register pressure. We need to clear the sideeffect flag for these instructions.

Diff Detail

Event Timeline

steven.zhang created this revision.Oct 20 2019, 7:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 20 2019, 7:57 PM

Gentle ping ...

hfinkel accepted this revision as: hfinkel.Oct 28 2019, 8:43 AM

LGTM

This revision is now accepted and ready to land.Oct 28 2019, 8:43 AM
jsji accepted this revision as: jsji.Oct 28 2019, 2:30 PM

I don't see any connection of this bit to match pattern.
Can you explain why we want to limit that didn't have the match pattern -- We actually didn't care about the match pattern in your patch either.

Overall LGTM as well, but maybe we can improve the testcase further. Thanks.

llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

Global memory objects can be hasOrderedMemoryRef as well, so having Global memory object might NOT always due to hasUnmodeledSideEffects.
And it is *NOT* easy to come up the testcase in scheduling that can test the other affected opcode above.
so mi-scheduling might not be a great place to check hasUnmodeledSideEffects.

Maybe we can check it in peephole-opt with MIR. eg:

$ cat tm.mir
name:            sideeffects
alignment:       16
liveins:
  - { reg: '$x3', virtual-reg: '' }
body:             |
  bb.0.entry:
    liveins: $x3
   renamable $vf1 = LXSD 136, renamable $x3 ::

...

$ llc -run-pass peephole-opt tm.mir -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -debug-only=peephole-opt
********** PEEPHOLE OPTIMIZER **********
********** Function: sideeffects
NAPhysCopy: blowing away all info due to renamable $vf1 = LXSD 136, renamable $x3
Encountered load fold barrier on renamable $vf1 = LXSD 136, renamable $x3

This way, we can easily add more affected opcodes in the MIR tests. What do you think?

steven.zhang marked an inline comment as done.Oct 28 2019, 7:35 PM

I don't see any connection of this bit to match pattern.
Can you explain why we want to limit that didn't have the match pattern -- We actually didn't care about the match pattern in your patch either.

Overall LGTM as well, but maybe we can improve the testcase further. Thanks.

If we didn't have the match pattern, llvm-tblgen will set the sideeffects as true if we didn't specify it. Otherwise, it will infer this flag from match pattern. And if we didn't set the sideeffect for match pattern(usually case), it will set it as false by default. So, what we need to clear is about the case that didn't have the match pattern.

llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

Your suggestion will definitely work. However, as I am checking NOT having "Global memory object" and it will cover the hasUnmodeledSideEffects and other conditions you mentioned. Personally, I want to check that, LXSD SU didn't have memory dependency with others, not just the sideeffects flag.

jsji added inline comments.Oct 28 2019, 7:58 PM
llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

Fair enough, but in that case, you shouldn't call the function: sideeffects, and the comments should be updated to reflect your test points.

Also this IR can be further reduced to include ONLY LXSD and fewer fadd, store is unnecessary.

And what is more, we should add more testing for other affected op? Eg: opcode with match patterns like 'LXSIBZX'? Make sure nothing changed?

hfinkel added inline comments.Oct 28 2019, 8:43 PM
llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

It's probably best if we just add a utility that dumps all of the instruction properties, and then we can directly test the output of that utility. I'd certainly welcome some follow-up patch which did that.

steven.zhang marked 2 inline comments as done.Oct 28 2019, 10:09 PM
steven.zhang added inline comments.
llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

This is a good idea to verify the instruction properties as it is really needed. However, need some effort. I will create a follow up issue for this. Thank you.

36 ↗(On Diff #225814)

re jsji: Good point. I will update the comments, rename the function name, and remove the store. For other opcode, I think, we need some follow up as hfinkel said to verify them together.

Update the test and move it into the scheduling-mem-dependency.ll. And update the test in scheduling-mem-dependency.ll. to make it more robust together with this patch, as it was added by me before ...

jsji added inline comments.Oct 29 2019, 8:53 AM
llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

It's probably best if we just add a utility that dumps all of the instruction properties, and then we can directly test the output of that utility. I'd certainly welcome some follow-up patch which did that.

Yes, good idea for such utility, we should be able to catch quite some hidden bugs with it.

hfinkel added inline comments.Oct 29 2019, 9:40 AM
llvm/test/CodeGen/PowerPC/mi-scheduling.ll
36 ↗(On Diff #225814)

Yes, good idea for such utility, we should be able to catch quite some hidden bugs with it.

Exactly; I used to do this every once in a while my manually searching through the TableGen-generated files; I suppose that we might be able to do this by literally running TableGen and matching the output, but a separate utility might be better from a build-process-management standpoint and because it won't require rerunning TableGen (which could be slow in Debug mode).

This revision was automatically updated to reflect the committed changes.