This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] set 'guessInstructionProperties = 0' and add flags as needed.
ClosedPublic

Authored by jonpa on Nov 24 2017, 6:16 AM.

Details

Reviewers
uweigand
Summary

After setting 'guessInstructionProperties = 0', mayLoad, mayStore and hasSideEffects have been set to 0 per default in InstSystemZ.

Then those flags have been set on individual instructions so that there is no diff on the tablegen output compared to trunk.

I suspect that null_fag may have set this flag earlier in places where we could now remove it.

Not sure why, but SRAK seems to have the SideEffects flag...

hasSideEffects = 0 could also be removed in a few places.

Diff Detail

Event Timeline

jonpa created this revision.Nov 24 2017, 6:16 AM
uweigand edited edge metadata.Nov 24 2017, 10:57 AM

Thanks for looking at and reviewing those flags!

It's a good first step to ensure that flags remain identical to what they currently are. However, this seems to have exposed a couple of bugs in the current flags. Let's see what the best way to get to a correct solution is, without introducing too many intermediate changes ... I'm now wondering whether we should first fix wrong flags and then switch to guessInstructionProperties = 0 or the other way around.

First of all, I'm not sure it makes much sense to introduce "generic" settings of hasSideEffects in instruction format patterns. You've added them to all SideEffect* patterns -- however, while all those instructions have "side effects" in the sense that they not only operate on their operands, the many (most?) cases those side effect are simply memory accesses (loads or stores), which should already be handled via mayLoad / mayStore. Those should *not* also have hasSideEffects set; that flag is only needed if there are any *other* "unmodeled" side effects. I think I added the correct mayLoad/mayStore/hasSideEffects setting to all (I hope!) SideEffect* pattersn, but I didn't realize I had to actively reset hasSideEffects to zero. So I guess it is OK to add the flag to the SideEffect* patterns for now to ensure we get NFC, but then later on remove them all again, since they're actually incorrect. (Possibly doing another review of all users first.)

A similar issue are the "generic" vector instructions (i.e. the *Generic patterns). Those are all inline-assembler-only patterns, and therefore have no DAG pattern (which I guess is why they were classified as hasSideEffects by the default logic). You seem to have added the flag to some but not all *Generic patterns, but for those patterns where you haven't added it you instead added it to the instructions that use it ... Similar to above, I think it is OK to add the flag for now, but then you should just add it to *all* Generic patterns, to avoid unnecessary changes to the vector instruction patterns.

Yet again similar are the *K patterns. There the hasSideEffects flags is pretty much always wrong, it's just added incorrectly because the patterns use null_frag. Again, I'm fine with temporarily adding and later removing it. However, the duplication of the BinaryRSAndK pattern seems weird. Is this because some users of BinaryRSAndK manually set hasSideEffects = 0? This was probably already an attempt to work around. They should just *all* use that ...

The final major issue seem to be the HFP (hex floating-point) and DFP (decimal floating-point) settings. I guess they all have hasSideEffects because they are never used for code generation and therefore all use null_frag. Now I thought that floating-point operations usually have side effects anyway since they may trap. But that doesn't seem to be even modeled for the normal (binary) floating-point instructions, where it would be most important since those are actually used for codegen ... Not sure if this hides a bug somewhere. On the other hand, other platforms don't appear to set this for FP instructions either, so I guess we should leave it for now. But then we should probably treat HFP and DFP the same.

I've gone through the changes in detail. Note that my comments really refer to the state we should have in the end, i.e. in most cases *different* from the current setting. Given that we'd have to introduce many changes to those files just to revert them again, maybe in would be better to do the changes in a single step after all ...

lib/Target/SystemZ/SystemZInstrDFP.td
20

Given the discussion above, the comment should probably be different.

lib/Target/SystemZ/SystemZInstrFP.td
136

This shouldn't have hasSideEffects = 1, but rather mayLoad = 1. This would fix a current bug.

Note that the whole block of load instructions really should have mayLoad = 1.

157

Again, probably best if the whole block is marked mayStore = 1.

172

These aren't really different from other FP instructions, so if the rest don't have hasSideEffects = 1, this shouldn't either ... The same comment actually applies to the rest of these changes throughout this file.

541

These are actually wrong: EFPC and SFPC do not touch memory at all, while STFPC is only a store and LFPC is only a load. I think this is wrongly inferred because the definition of the int_s390_efpc / int_s390_sfpc intrinsics in include/llvm/IR/IntrinsicsSystemZ.td is wrong: they should be marked as IntrNoMem. (This specific problem can just be fixed before, I think.)

lib/Target/SystemZ/SystemZInstrFormats.td
4592

This is certainly also wrong, those compares have no special side effects ...

lib/Target/SystemZ/SystemZInstrInfo.td
14

Since those are fully no-op on SystemZ, they have no side effects.

37

I believe as long as branches are correctly marked as isBranch, they don't need to *also* carry a hasSideEffects flag.

204

But traps certainly should, so this one is fine.

376

This has no side effects.

381

This is a no-op and has no side effects.

417

Probably best to mark the whole block as mayLoad = 1.

461

Same here, mark the whole block as mayStore = 1.

520

This doesn't have side effects, just a mayStore ...

525

These should actually match the mayLoad / mayStore settings of the underlying instructions.

593

Doesn't actually have any side effects.

698

Doesn't have side effects.

710

Likewise.

746

Should just be mayStore.

756

Should be mayLoad / mayStore as appropriate, has no other side effects.

865

Should just be mayLoad.

921

No side effects with any of the adds. The to-memory variants should be mayLoad and mayStore.

1054

No side effects.

1063

Neither here. Why are SLFI and SLGFI different in the first place, that's weird ...

1280

No side effects on any of the multiplies ...

1362

As discussed above, none of the shifts or rotates has side effects.

1434

No side effects.

1488

Likewise.

1569

Just mayLoad

1583

This is actually fine, I guess.

1608

Should just have mayLoad/mayStore, like the others. Maybe for the whole block?

1744

Hmm. Not sure if atomics should have the SideEffect marker ... but either all of them or none.

1793

Just mayLoad is enough.

1854

mayLoad should be enough for these. (They may trap, but so can any load ...)

1868

Just mayLoad or mayStore

1906

Just mayLoad

1918

No particular side effects for the access register routines (except mayLoad / mayStore as appropriate)

1954

As above, no side effects for branches.

1988

This really should only be a mayStore. Maybe another bug in the intrinsics definition.

1998

A TABORT is probably best genuinely marked as hasSideEffects = 1. It shouldn't be mayLoad or mayStore.

2002

Should be store only.

2029

No side effects.

2075

This is OK.

lib/Target/SystemZ/SystemZInstrSystem.td
63

This as well.

186

Not really. None of the remaining changes in this file look good.

lib/Target/SystemZ/SystemZInstrVector.td
16

Nothing in this file really should have hasSideEffects = 1.

225

No, this should just be mayStore. Another intrinsics issue?

260

Likewise.

jonpa updated this revision to Diff 124915.Nov 30 2017, 4:54 AM
jonpa marked 42 inline comments as done.

Still NFC.

Patch updated to minimize changes in SystemZInstrVector.td, by moving the flag settings to the *Generic classes instead.

jonpa updated this revision to Diff 124916.Nov 30 2017, 4:59 AM

Patch updated per review, with functional changes.

Three SystemZ tests have been updated, where two of them now have an extra COPY (instructions may now be more freely rescheduled). Hopefully, this should go away with other ongoing patches. One SystemZ test is crashing (asm-18.ll), see llvm-dev for explanation of the TwoAddress problem.

Hopefully, it should be possible to observe the changes by diffing to the previous diff, which was NFC.

lib/Target/SystemZ/SystemZInstrDFP.td
20

I removed the hasSideEffects flag in both DFP/HFP to be consistent , and instead made a TODO comment about perhaps adding this flag to all relevant FP instructions.

lib/Target/SystemZ/SystemZInstrFP.td
541

Not sure exactly how to do this - there is only one intrinsic (for read/set), but two instructions, where only one of them loads/stores.

It does not compile if just setting mayLoad=0, mayStore=0 on the non-mem instructions.

Would it work to add duplicate intrinsics for this in InstrinsicsSystemZ.td, like

 def int_s390_sfpc : GCCBuiltin<"__builtin_s390_sfpc">,
                      Intrinsic<[], [llvm_i32_ty], [IntrNoMem]>;

 def int_s390_sfpc_mem : GCCBuiltin<"__builtin_s390_sfpc">,
                          Intrinsic<[], [llvm_i32_ty], []>;
...

Or is this so rare that it could be ignored and left alone?

lib/Target/SystemZ/SystemZInstrInfo.td
14

Tried to remove, but did not work due to the callseq_... nodes in the pattern.

1063

yes... I don't know really

1744

Since we are unsure, I put all inside a new block with this flag set, and added a TODO comment regarding this. This is now *all* instructions in the atomic section, not just the ATOMIC instructions. Is that what you meant?

1954

OK - I am assuming this should also work the same for isCall / isReturn

1988

Yes. I tried to experiment with the intrinsics declarations like adding WriteOnly<0> and so on, but to no other effect than very long rebuild times...

I am leaving these flags as they are now as they are forced by the intrinsics definitions...

lib/Target/SystemZ/SystemZInstrVector.td
16

I suppose the vector FP instructions still follow the same discussion as for the FP files?

225

most likely -- leaving as for the others above.

jonpa added a comment.Nov 30 2017, 5:00 AM

This file is the output of a simple script I made, to perhaps get a better overview of the functional changes for the instructions.

Thanks for providing the diff. I noticed two problems, also annotated in the detailed comment:

  • You added mayLoad to a number of load-conditional instructions that actually don't access memory (reg-reg or reg-immed instructions)
  • The guarded-storage instructions should really have the side effects flag (I overlooked this in my previous review).

The rest of the changes looks OK to me.

As to your question:

I suppose the vector FP instructions still follow the same discussion as for the FP files?

Yes, this is true.

lib/Target/SystemZ/SystemZInstrInfo.td
93–95

This should better be in the FixedCondBranchRXY pattern -- this already hard-codes the "load" DAG node, so it can also hard-code mayLoad. B.t.w. the other CondBranchRXY pattern variants really also should have the mayLoad flag ...

525–538

Most of these are load immediate or load register instructions, they do not access memory.

533

Only this one should have mayLoad.

555–557

Again, the first two are load register instructions, only the second two should have mayLoad.

947

Do we need this? Should be correctly autodetected from the DAG patterns. The same applied to various patterns below.

951

This is already in BinarySIY. Same applies to other BinarySIY patterns.

1853–1854

Thinking about this, the whole set of FeatureGuardedStorage instructions should have the HasSideEffects flags after all, since they use and/or modify the guarded storage control registers, which we do not otherwise model, so the must not be reordered with respect to each other.

jonpa updated this revision to Diff 125112.Dec 1 2017, 5:08 AM
jonpa marked 7 inline comments as done.

Updated per review.

The bugfix in TwoAddress is under review separately. Only the test case here will be committed by this patch, as it exposes the bug.

With the bugfix I could build benchmarks and I saw just some minor differences in instruction counts:

tmll           :                15798                15987     +189
lr             :                25915                25769     -146
wfsdb          :                 6655                 6795     +140
sdbr           :                 3134                 2994     -140
j              :                98539                98678     +139
nr             :                 2845                 2709     -136
n              :                 8918                 9052     +134
tmhl           :                  933                  799     -134
risbhg         :                 1996                 1880     -116
ldr            :                22280                22172     -108
stg            :               124184               124093      -91
lg             :               317058               316968      -90
lt             :                 7152                 7072      -80
cgrjlh         :                11594                11673      +79
risblg         :                 8448                 8527      +79
l              :                65752                65830      +78
st             :                51556                51628      +72
mdbr           :                 6686                 6615      -71
risbgn         :                34903                34837      -66

                 spec-llvm_A_clang-master/            spec-llvm
Spill|Reload   :               164860               164772      -88

Reduced spilling should be a good indicator.

lib/Target/SystemZ/SystemZInstrInfo.td
93–95

I added 'mayLoad = 1' to the three *CondBranchRXY patterns. This added the mayLoad flag on BIAsm* and BIC[Asm] instructions.

533

removed the mayLoad setting - FixedCondUnaryRSY already hard codes it.

555–557

Oops. Again removed the mayLoad = 1 flag, as it is already hard coded. I assume this to be the policy, so I also removed mayStore from STOC* instructions.

951

OK - I am guessing then that if the instruction class sets the mayLoad / mayStore flag, it should preferrably not also be set on the instruction. I tried to clean away the cases I could find in this file. I suppose the other files could also be cleaned up?

These changes (at the code level) look reasonable to me now. (There is one mayStore on a STOC instruction that is probably still superfluous now.)

I've looked again at the atomic instructions, and I now think they should not have the "unmodelled side effects" flags: they do have side effects, but those are modeled (e.g. via the ordering flags). Common code should (and as far as I can tell, does) look at those properties to decide whether and how an atomic instruction may be scheduled. This also matches the fact that most of the atomics didn't have hasSideEffects before your patch, and they don't on other platforms either. So I think they should simply not have the flag.

Otherwise, there's still two things that should be fixed before this can go in: the TwoAddress fix, and the intrinsics flags.

Once this is all ready, could you please re-create another diff for the attribute settings, just so we can verify that nothing surprising has crept in? Thanks!

lib/Target/SystemZ/SystemZInstrInfo.td
525–538

Shouldn't this be unnecessary as well?

jonpa updated this revision to Diff 125306.Dec 4 2017, 3:34 AM
jonpa marked an inline comment as done.

Path updated.

Functional change:

UnmodeledSideEffects REMOVED: ATOMIC_CMP_SWAPW, ATOMIC_LOADW_AFI, ATOMIC_LOADW_AR, ATOMIC_LOADW_MAX, ATOMIC_LOADW_MIN, ATOMIC_LOADW_NILH, ATOMIC_LOADW_NILHi, ATOMIC_LOADW_NR, ATOMIC_LOADW_NRi, ATOMIC_LOADW_OILH, ATOMIC_LOADW_OR, ATOMIC_LOADW_SR, ATOMIC_LOADW_UMAX, ATOMIC_LOADW_UMIN, ATOMIC_LOADW_XILF, ATOMIC_LOADW_XR, ATOMIC_LOAD_AFI, ATOMIC_LOAD_AGFI, ATOMIC_LOAD_AGHI, ATOMIC_LOAD_AGR, ATOMIC_LOAD_AHI, ATOMIC_LOAD_AR, ATOMIC_LOAD_MAX_32, ATOMIC_LOAD_MAX_64, ATOMIC_LOAD_MIN_32, ATOMIC_LOAD_MIN_64, ATOMIC_LOAD_NGR, ATOMIC_LOAD_NGRi, ATOMIC_LOAD_NIHF64, ATOMIC_LOAD_NIHF64i, ATOMIC_LOAD_NIHH64, ATOMIC_LOAD_NIHH64i, ATOMIC_LOAD_NIHL64, ATOMIC_LOAD_NIHL64i, ATOMIC_LOAD_NILF, ATOMIC_LOAD_NILF64, ATOMIC_LOAD_NILF64i, ATOMIC_LOAD_NILFi, ATOMIC_LOAD_NILH, ATOMIC_LOAD_NILH64, ATOMIC_LOAD_NILH64i, ATOMIC_LOAD_NILHi, ATOMIC_LOAD_NILL, ATOMIC_LOAD_NILL64, ATOMIC_LOAD_NILL64i, ATOMIC_LOAD_NILLi, ATOMIC_LOAD_NR, ATOMIC_LOAD_NRi, ATOMIC_LOAD_OGR, ATOMIC_LOAD_OIHF64, ATOMIC_LOAD_OIHH64, ATOMIC_LOAD_OIHL64, ATOMIC_LOAD_OILF, ATOMIC_LOAD_OILF64, ATOMIC_LOAD_OILH, ATOMIC_LOAD_OILH64, ATOMIC_LOAD_OILL, ATOMIC_LOAD_OILL64, ATOMIC_LOAD_OR, ATOMIC_LOAD_SGR, ATOMIC_LOAD_SR, ATOMIC_LOAD_UMAX_32, ATOMIC_LOAD_UMAX_64, ATOMIC_LOAD_UMIN_32, ATOMIC_LOAD_UMIN_64, ATOMIC_LOAD_XGR, ATOMIC_LOAD_XIHF64, ATOMIC_LOAD_XILF, ATOMIC_LOAD_XILF64, ATOMIC_LOAD_XR, ATOMIC_SWAPW, ATOMIC_SWAP_32, ATOMIC_SWAP_64, CDS, CDSG, CDSY, CS, CSG, CSST, CSY, LAA, LAAG, LAAL, LAALG, LAN, LANG, LAO, LAOG, LAX, LAXG, LPD, LPDG, LPQ, PLO, STPQ, TS.

Otherwise, there's still two things that should be fixed before this can go in: the TwoAddress fix, and the intrinsics flags.

The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?

Once this is all ready, could you please re-create another diff for the attribute settings, just so we can verify that nothing surprising has crept in?

Sure, this should be an updated diff of the patch effects:

jonpa added inline comments.Dec 4 2017, 3:37 AM
lib/Target/SystemZ/SystemZInstrInfo.td
525–538

Yes - removed this one and a few more in SystemZInstrSystem.td with NFC.

The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?

The problems with the intrinsics flags were related to IntrNoMem, right? This only happens for a few of the changes, the majority were about IntrWriteMem IIRC. It should be possible to add at least those changes right away without problem. And that would already remove the need for many of the wrongly-added mayLoad = 1 settings in this patch ...

Otherwise, this all looks good to me now.

jonpa updated this revision to Diff 125352.Dec 4 2017, 9:11 AM

The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?

The problems with the intrinsics flags were related to IntrNoMem, right? This only happens for a few of the changes, the majority were about IntrWriteMem IIRC. It should be possible to add at least those changes right away without problem. And that would already remove the need for many of the wrongly-added mayLoad = 1 settings in this patch ...

Yes, the IntrHasSideEffects would be needed for IntrNoMem (tabort, sfpc, efpc).

Patch updated for the other cases by adding IntrWriteMem and removing mayLoad flags, with functional change:

MayLoad REMOVED: NTSTG, TBEGINC, VSTL, VSTRL, VSTRLR

uweigand accepted this revision.Dec 4 2017, 9:38 AM

For consistency, the other variants of the tbegin intrinsic (int_s390_tbegin and int_s390_tbegin_nofloat) should likewise get the IntrWriteMem flag.

Otherwise, this LGTM now. Thanks!

This revision is now accepted and ready to land.Dec 4 2017, 9:38 AM
jonpa closed this revision.Dec 5 2017, 3:26 AM

r319756