This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Replace subreg_r with subreg_h
ClosedPublic

Authored by kparzysz on Aug 14 2018, 10:25 AM.

Details

Summary

subreg_r32 -> subreg_h32
subreg_r64 -> subreg_h64
subreg_hr32 -> subreg_hh32

Companion patch for D50648.

Diff Detail

Repository
rL LLVM

Event Timeline

This is just a demonstration of what I meant in the comments for llvm.org/PR38544: getting rid of explicit overlapping subregister indices solves the problem encountered in tc_subregliveness_noliveseg.ll.

jonpa added a comment.Aug 15 2018, 5:03 AM

I tried this and found that everything builds, SPEC is NFC and all tests pass, so this looks fine to me.

I suspect that the "reinterpret" subregindexes are just better names when used with the floating point registers, but we should wait for Ulis comment on this before we commit it.

Well, the original rationale for using different subreg indices for float/vector registers is given here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150504/274358.html

The main underlying issue is that the architecture states that writing to a floating-point register (which overlays the upper half of a vector register) may clobber the lower half of that vector register. I believe the current processor implementations do not actually do that, but it still would be preferable to model this correctly.

On the other hand, just using the different subreg index names doesn't actually do that, it's just a cosmetic marker. So I don't necessarily object to reverting that difference (as this patch would do). Just wondering if there's a better way to actually express the underlying issue.

This situation is the same as the case of RAX vs EAX on X86. EAX is the lower half of RAX, but modifying EAX does not preserve the upper bits of RAX. On the other hand, modifying AX (lower half of EAX) does preserve the upper half of EAX. Originally, the former behavior was modeled for both cases, i.e. overwriting a lone subregister would be considered as overwriting the entire register. I added phony registers to X86 (e.g. HAX covering the upper half of EAX) to model the latter behavior for EAX, EBX, etc.

In case of VF128, if a register from that class has only one subregister, then both, the register and the subregister will share the same register unit(s), which means that from the point of view of register aliasing, they are assumed to clobber each other without preserving any parts.

Running tblgen with -debug shows

RC FP64Bit Units:
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S
[...]
RC VF128Bit Units: 
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S
[...]
RC FP128Bit Units:
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S

These are unit sets for the entire class, but the fact that they are identical is a sign that the individual registers share units with their subregisters.

uweigand accepted this revision.Aug 15 2018, 7:58 AM

Ah, that's good to know! So if I understand this correctly, accessing even the 32-bit part (F0S) would be considered to clobber F0D. This is not really necessary, but probably doesn't hurt at this point. We could do the same thing as the HAX you mention to get this modeled exactly.

In any case, then I agree that this patch is fine.

This revision is now accepted and ready to land.Aug 15 2018, 7:58 AM
This revision was automatically updated to reflect the committed changes.

@kparzysz @uweigand

We're seeing tlbgen warnings after this patch:

warning: SubRegIndex SystemZ::subreg_h64 and SystemZ::subreg_h32 compose ambiguously as SystemZ::subreg_hh32 or SystemZ::subreg_h32

Will investigate.

The problem is in how tblgen calculates sub-registers. For a given register R, subreg indices of its sub-registers will be added to R, in other words a super-register "inherits" subreg indices from its subregisters. For example, V0 (from class VR128) has an explicit subreg index subreg_h64. It also has a sub-register F0D, which has a subreg index subreg_h32. As a result, V0 will have two subreg indices: subreg_h64 and subreg_h32. From the register structure (V0 vs F0D vs F0S) it is inferred that the composition V0.subreg_h64.subreg_h32 is same as V0.subreg_h32, a in general that the composition subreg_h64.subreg_h32 is equivalent to subreg_h32. At the same time, repeating this logic for a register F0Q from class FPR128 leads to adding subreg_h32 twice, which tblgen tries to resolve by referring to user-defined compositions. This is how subreg_hh32 shows up. However, now there is another result for composing subreg_h64.subreg_h32, hence the warning.

This warning didn't exist back when the subreg_r was introduced, but that's because VR128 didn't exist back then. If subreg_r was not added, adding VR128 would cause the warning to appear.

That warning can be eliminated by resolving such conflicts in favor of user-defined compositions.

That warning can be eliminated by resolving such conflicts in favor of user-defined compositions.

Are you or someone else working on such a patch?

I just don't want to have warnings sit and persist here.

Yes, I do have one. I'll post it on Monday.

plotfi added a subscriber: plotfi.Nov 3 2018, 10:25 PM

Yes, I do have one. I'll post it on Monday.

I'm still seeing this warning as of 11/3/2018

warning: SubRegIndex SystemZ::subreg_h64 and SystemZ::subreg_h32 compose ambiguously as SystemZ::subreg_hh32 or SystemZ::subreg_h32

When can we expect a resolution?

PL

We're all still waiting on a fix here....

If Krzysztof can't address this, maybe the code owner for SystemZ (CC-ed I
believe) can take a look?

We're all still waiting on a fix here....

If Krzysztof can't address this, maybe the code owner for SystemZ (CC-ed I
believe) can take a look?

Here's the scoop:

  1. Without this patch (causing the warning to appear), TableGen generated incorrect lane masks, with this patch it doesn't, but we get a warning.
  2. The situation indicated by the warning is harmless. D50977 was an attempt to silence this case, but the patch caused some concerns.
  3. The best solution would be to fix TableGen, but there root of the problem seems to be in some mismatch between the way that TableGen generates subregister indices and the assumptions it makes when it calculates lane masks. IIRC, in this case there two ways to address a subregister (that maps to the same physical register in the end) with two corresponding (distinct) lane masks.

I went for fixing (2) because I don't have a clear understanding on how to fix (3). I think the problem has to do with how TableGen generates subregister indexes, and any change there could affect a lot of target and codegen code.

At the moment I'm stuck between a simple silencing of the warning and investigating the root of the problem for which I don't really have enough time at the moment.