This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix SMEM WAR hazard for gfx10 readlane
ClosedPublic

Authored by kerbowa on Oct 18 2019, 8:43 AM.

Event Timeline

kerbowa created this revision.Oct 18 2019, 8:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2019, 8:43 AM

It is OK as a w/a, but real opcode should not appear that early.
@arsenm What is the reason to use MC opcode in the SIFrameLowering::emitEpilogue()?

BuildMI(MBB, MBBI, DL, TII->getMCOpcodeFromPseudo(AMDGPU::V_READLANE_B32),
        FuncInfo->getFrameOffsetReg())

It is OK as a w/a, but real opcode should not appear that early.
@arsenm What is the reason to use MC opcode in the SIFrameLowering::emitEpilogue()?

BuildMI(MBB, MBBI, DL, TII->getMCOpcodeFromPseudo(AMDGPU::V_READLANE_B32),
        FuncInfo->getFrameOffsetReg())

There is a reason for it, but I don't remember what it is. I've tried to fix this multiple times in the past, and then rediscovered why. I think it had something to do with an operand constraint/encoding change in VI

It is OK as a w/a, but real opcode should not appear that early.
@arsenm What is the reason to use MC opcode in the SIFrameLowering::emitEpilogue()?

BuildMI(MBB, MBBI, DL, TII->getMCOpcodeFromPseudo(AMDGPU::V_READLANE_B32),
        FuncInfo->getFrameOffsetReg())

There is a reason for it, but I don't remember what it is. I've tried to fix this multiple times in the past, and then rediscovered why. I think it had something to do with an operand constraint/encoding change in VI

The same is also done in spillSGPR/restoreSGPR, the PEI version was just reproducing the same

rampitec accepted this revision.Oct 18 2019, 10:45 AM

LGTM as a w/a.

This revision is now accepted and ready to land.Oct 18 2019, 10:45 AM
kerbowa closed this revision.Oct 18 2019, 11:19 AM

r375265

foad added a subscriber: foad.Mar 27 2020, 9:12 AM

Hi @kerbowa, why is this needed for gfx10? The public RDNA documentation says:

4.5. Manually Inserted Wait States (NOPs)
Inserting S_NOP is never required to achieve correct operation.

Is this not the full story?

Hi @kerbowa, why is this needed for gfx10? The public RDNA documentation says:

4.5. Manually Inserted Wait States (NOPs)
Inserting S_NOP is never required to achieve correct operation.

Is this not the full story?

It does not have data hazards, but then it has some bugs. This one is SMEMtoVectorWriteHazard.