This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] SIOptimizeExecMaskingPreRA should check constant bus constraint when folds EXEC copy
ClosedPublic

Authored by alex-t on Mar 18 2021, 12:31 PM.

Details

Summary

Folding EXEC copy into it's single use may lead to constant bus constraint violation as it adds one more SGPR operand.

This change makes it validate the user instruction with the new SGPR operand and only fold it if it is legal.

Diff Detail

Event Timeline

alex-t created this revision.Mar 18 2021, 12:31 PM
alex-t requested review of this revision.Mar 18 2021, 12:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 18 2021, 12:31 PM
Herald added a subscriber: wdng. · View Herald Transcript

Needs test.

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
421

This could be an implicit use by VALU, which shall be ignored then.

arsenm requested changes to this revision.Mar 18 2021, 1:31 PM
This revision now requires changes to proceed.Mar 18 2021, 1:31 PM
alex-t updated this revision to Diff 331977.Mar 19 2021, 12:16 PM

Added MIR test. Implicit operand check added.

alex-t marked an inline comment as done.Mar 19 2021, 12:16 PM

With the change test checks for the COPY but w/o change, it reports "Bad machine code" error because of the "-verify-machineinstrs".
It may be changed just to report "expected string not found..." by removing "-verify-machineinstrs".

arsenm added inline comments.Mar 19 2021, 12:48 PM
llvm/test/CodeGen/AMDGPU/opt_exec_copy_fold.mir
16 ↗(On Diff #331977)

Is exec + implicit exec + other SGPR really a constant bus violation?

rampitec requested changes to this revision.Mar 19 2021, 1:42 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
424

It is OK for exec operand to be implicit, but you do not need to check for legality in this case.

This revision now requires changes to proceed.Mar 19 2021, 1:42 PM
alex-t added inline comments.Mar 19 2021, 3:07 PM
llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
424

The EXEC register may be an implicit operand but the "Idx" is the index of the Sreg64 register that holds the EXEC copy.

rampitec accepted this revision.Mar 19 2021, 3:22 PM

LGTM

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
424

Yep, you are right.

llvm/test/CodeGen/AMDGPU/opt_exec_copy_fold.mir
16 ↗(On Diff #331977)

I think SGPR + EXEC is a violation, implicit exec is just an internal glue. There is difference if you read exec as exec or as data operand.

This revision now requires review to proceed.Mar 19 2021, 3:22 PM

Matt, do you still have any objections? Could you please unlock the review if no?

arsenm accepted this revision.Mar 23 2021, 12:22 PM
This revision is now accepted and ready to land.Mar 23 2021, 12:22 PM