This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix for branch offset hardware workaround
ClosedPublic

Authored by rtaylor on Jun 18 2019, 7:19 AM.

Details

Summary

This fixes a hardware bug that makes a branch offset of 0x3f unsafe.
This replaces the 32 bit branch with offset 0x3f to a 64 bit
instruction that includes the same 32 bit branch and the encoding
for a s_nop 0 to follow. The relaxer than modifies the offsets
accordingly.

Change-Id: I10b7aed99d651f8159401b01bb421f105fa6288e

Diff Detail

Event Timeline

rtaylor created this revision.Jun 18 2019, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 18 2019, 7:19 AM
arsenm added inline comments.Jun 18 2019, 7:26 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
56–57

I would much rather avoid the proliferation of junk opcodes for this. Can you use a bundle or another way to add an independent nop instruction?

test/MC/AMDGPU/offsetbug.s
18–20

Can you use trivial instructions for sizes? I don't trust the size of most instructions to stay the same

82–83

Formatting

arsenm added inline comments.Jun 18 2019, 7:28 AM
lib/Target/AMDGPU/SOPInstructions.td
1011–1018

If the opcodes end up unavoidable, the class should be fixed to generate the branch and the dummy at the same time, rather than requiring repeating each definition with the opcode value

rtaylor marked 4 inline comments as done.Jun 18 2019, 8:16 AM
rtaylor added inline comments.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
56–57

I'm not sure that's possible, relaxInstruction is return the MCInst Res but I can't say for sure as I've never worked with bundles and I'm not sure how the compiler treats them. I agree though that this would be more ideal than having new 64 bit instructions.

lib/Target/AMDGPU/SOPInstructions.td
1011–1018

I can have a look at doing this.

test/MC/AMDGPU/offsetbug.s
18–20

Yes, I can. I can make them nops.

82–83

Ah, missed this, thanks.

rtaylor marked an inline comment as done.Jun 18 2019, 8:40 AM
rtaylor added inline comments.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
56–57

This would require a bundle to have a specific opcode, which from what I can see they do not.

arsenm added inline comments.Jun 18 2019, 8:47 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
56–57

I'm not sure what you mean. There's one BUNDLE opcode, and then instructions following I are marked as in the bundle

Can we have a disasm test? I want to see that we do not mess with decoding.

Can we have a disasm test? I want to see that we do not mess with decoding.

Something other than the -disassemble checks?

Can we have a disasm test? I want to see that we do not mess with decoding.

Something other than the -disassemble checks?

Ah, missed it. Thanks, this is sufficient.

Can we have a disasm test? I want to see that we do not mess with decoding.

Something other than the -disassemble checks?

Ah, missed it. Thanks, this is sufficient.

Np. Ok, thanks.

arsenm added inline comments.Jun 18 2019, 4:31 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
64

These names aren't good. I would prefer a sufficient like _pad_s_nop or something. 64 could mean many different things

rampitec added inline comments.Jun 18 2019, 4:39 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
113

I'd suggest using a multiclass to define and InstrMapping instead of switches.

rtaylor updated this revision to Diff 205638.Jun 19 2019, 10:29 AM
rtaylor marked 2 inline comments as done.

Changes per suggestions

rtaylor marked an inline comment as done.Jun 19 2019, 11:16 AM
rtaylor added inline comments.
lib/Target/AMDGPU/SOPInstructions.td
1011–1018

I don't think _Real is the best naming since there isn't a corresponding Pseudo but I had no clue what to call this multiclass so if anyone has a suggestion, I'm all ears.

rampitec added inline comments.Jun 19 2019, 11:21 AM
lib/Target/AMDGPU/SOPInstructions.td
960

SOPP_With_Relaxation maybe?

962

Can yo use InstrMapping mapping here to avoid switches?

arsenm added inline comments.Jun 19 2019, 11:49 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
61

I would lowercase the suffix. The general naming convention its HARDWARE_NAME_compiler_defined_variant

arsenm added inline comments.Jun 19 2019, 11:51 AM
test/MC/AMDGPU/offsetbug.s
2–8

Can you split the test into separate files? I think there should be one function that needs 1 relaxation, another where the relaxation of one triggers the relaxation of the other, and another that triggers this twice

rtaylor marked an inline comment as done.Jun 20 2019, 5:48 AM
rtaylor added inline comments.
lib/Target/AMDGPU/SOPInstructions.td
962

Do you feel that this is worth it in this case? Seems overly complicated for a very specific use case but if it's the standard to go in this direction than ok.

rampitec added inline comments.Jun 20 2019, 7:29 AM
lib/Target/AMDGPU/SOPInstructions.td
962

You just get rid of one switch, so it is not that big deal now.

1011–1018

_With_Relaxation?

rtaylor marked an inline comment as done.Jun 20 2019, 11:17 AM
rtaylor added inline comments.
lib/Target/AMDGPU/SOPInstructions.td
962

That's ok, I've pretty much done it.

rtaylor updated this revision to Diff 205887.Jun 20 2019, 1:10 PM

Suggested changes: Added InstrMapping. Three test files. Lower-cased suffixes for MI defs. Changed multiclass suffix.

rampitec added inline comments.Jun 20 2019, 1:35 PM
lib/Target/AMDGPU/SIInstrInfo.td
2316 ↗(On Diff #205887)

I must be missing something, but how does it work? You are passing the same asm string into the both versions. What do you map to what then?

test/MC/AMDGPU/offsetbug_once.s
9 ↗(On Diff #205887)

Indent.

test/MC/AMDGPU/offsetbug_one_and_one.s
78 ↗(On Diff #205887)

Indent.

test/MC/AMDGPU/offsetbug_twice.s
112 ↗(On Diff #205887)

And indent ;)

rtaylor updated this revision to Diff 205911.Jun 20 2019, 3:03 PM
rtaylor marked an inline comment as done.

Fixed idents.

rtaylor added inline comments.Jun 20 2019, 3:03 PM
lib/Target/AMDGPU/SIInstrInfo.td
2316 ↗(On Diff #205887)

The asm string is the same for both, the s_nop is added in at encoding. I'm using the Size field, 4 maps to 8.

This is quite similar to how VOP is doing in during getVOPe32 and get VOPe64, the difference is that they also use a VOP3 flag but it's not needed here. Here, if you notice I am just adding the s_nop in class SOPP64e.

LGTM, except one indent.

lib/Target/AMDGPU/SIInstrInfo.td
2316 ↗(On Diff #205887)

Got it, thanks!

test/MC/AMDGPU/offsetbug_once.s
9 ↗(On Diff #205911)

Another indent.

arsenm requested changes to this revision.Jun 20 2019, 3:10 PM
arsenm added inline comments.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
10

This is a layering violation

72–75

return test();

This revision now requires changes to proceed.Jun 20 2019, 3:10 PM
arsenm added inline comments.Jun 20 2019, 3:15 PM
lib/Target/AMDGPU/SOPInstructions.td
927–929

I don't understand how this gets you a nop?

932

SOPP_w_nop or something? The current naming sounds like an actually different encoding (same with SOPPe64)

rampitec added inline comments.Jun 20 2019, 3:20 PM
lib/Target/AMDGPU/SOPInstructions.td
927–929

I guess s_nop opcode is 0, simm is 0 and SOPP encoding is 0x17f.
Maybe it would be more readable to write something like:

let Inst{63-55} = S_NOP.Inst{31-23}; // encoding
let Inst{54-48} = S_NOP.Inst{22-16}; // opcode
rtaylor marked an inline comment as done.Jun 20 2019, 4:50 PM
rtaylor added inline comments.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
10

Where would you like the InstrMapping to go? The others are here.

rtaylor marked 5 inline comments as done.Jun 20 2019, 5:16 PM
rtaylor added inline comments.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
10

I suppose I could put it into Utils/AMDGPUBaseInfo.h which is already included in AMDGPUAsmBackend.cpp.... though that seems at first glance to also be a layering violation. Maybe it would be better if this was a mapping table, it would fit into AMDGPUBaseInfo.h but I think InstrMapping is more fitting.

72–75

Not sure what you mean here though the else is useless, so I'll remove that and just return false.

lib/Target/AMDGPU/SOPInstructions.td
927–929

Yes, the opcode for s_nop is 0. The value I'm using is 0 and the encoding I had is the same as everywhere else but I can see how this makes it more understandable... maybe a simple comment here would help? I fixed this but it just makes tablegen less readable honestly, since now the S_NOP def has to go after the SOPP class but before the SOPP_w_nop encoding, so it's all alone among classes. Am I missing something? Prototyping would work but again, make everything less readable.

932

Sure, that's fine.

test/MC/AMDGPU/offsetbug_once.s
9 ↗(On Diff #205911)

Thanks.

rtaylor updated this revision to Diff 206117.Jun 22 2019, 6:30 AM

Fixed suggestions

arsenm accepted this revision.Jun 26 2019, 9:53 AM

LGTM

This revision is now accepted and ready to land.Jun 26 2019, 9:53 AM
This revision was automatically updated to reflect the committed changes.