This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] SDWA: make pass global
ClosedPublic

Authored by SamWot on Apr 11 2017, 8:00 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

SamWot created this revision. Apr 11 2017, 8:00 AM
rampitec edited edge metadata. Apr 11 2017, 10:05 AM

A virtual register can have multiple definitions. I think you need to make sure you are using the right one. Can you please write a test like:

if (...) vregx = some_sdwa_cand;
else vregx = other_sdwa_cand;
use vregx;

and see what is folded at use?

You can conservatively use getUniqueVRegDef to make sure there is only one def, or otherwise determine which def is really used. Note that getUniqueVRegDef may narrow even the current optimization scope, since in many cases the candidates are subreg defs, and those registers have multiple defs.
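
The conservative check described above can be sketched with a small standalone model. This is a toy illustration only: `ToyInstr`, `ToyRegInfo`, `uniqueDef`, and `buildExample` are hypothetical names and are not LLVM's `MachineRegisterInfo` API. The sketch only mirrors the contract of returning a definition when exactly one exists and a null pointer otherwise.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a machine instruction; only the name matters here.
struct ToyInstr {
  std::string Name;
};

// Hypothetical table mapping a virtual register number to every instruction
// that defines it. This is NOT LLVM's MachineRegisterInfo.
class ToyRegInfo {
  std::unordered_map<unsigned, std::vector<const ToyInstr *>> Defs;

public:
  void addDef(unsigned Reg, const ToyInstr *MI) { Defs[Reg].push_back(MI); }

  // Return the defining instruction only when there is exactly one def;
  // return nullptr for zero defs or for several defs.
  const ToyInstr *uniqueDef(unsigned Reg) const {
    auto It = Defs.find(Reg);
    if (It == Defs.end() || It->second.size() != 1)
      return nullptr;
    return It->second.front();
  }
};

// Builds the if/else shape from the comment above: vreg 1 is written on both
// paths (by A and by B), while vreg 2 is written once (by C).
inline ToyRegInfo buildExample(const ToyInstr &A, const ToyInstr &B,
                               const ToyInstr &C) {
  ToyRegInfo RI;
  RI.addDef(1, &A);
  RI.addDef(1, &B);
  RI.addDef(2, &C);
  return RI;
}
```

With this model, a lookup for the register written on both paths yields no candidate, so nothing would be folded at the use, while the singly defined register still resolves to its one def.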

arsenm edited edge metadata. Apr 11 2017, 10:35 AM

> A virtual register can have multiple definitions. I think you need to make sure you are using the right one. Can you please write a test like:
>
> if (...) vregx = some_sdwa_cand;
> else vregx = other_sdwa_cand;
> use vregx;
>
> and see what is folded at use?
>
> You can conservatively use getUniqueVRegDef to make sure there is only one def, or otherwise determine which def is really used. Note that getUniqueVRegDef may narrow even the current optimization scope, since in many cases the candidates are subreg defs, and those registers have multiple defs.

This is an SSA pass, so there is only one def per register. Using getUniqueVRegDef is not necessary.
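
To illustrate why the if/else case cannot arise in SSA form, here is a minimal standalone sketch. The names `ToyOp`, `isSingleDef`, `nonSSAExample`, and `ssaExample` are hypothetical and do not come from LLVM. In the SSA version, each branch writes its own register and a PHI merges them, so every register has exactly one def.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical instruction: an opcode name, the one vreg it defines, and the
// vregs it reads.
struct ToyOp {
  std::string Opcode;
  unsigned Def;
  std::vector<unsigned> Uses;
};

// True when no virtual register is defined more than once in the sequence.
inline bool isSingleDef(const std::vector<ToyOp> &Code) {
  std::unordered_set<unsigned> Seen;
  for (const ToyOp &Op : Code)
    if (!Seen.insert(Op.Def).second)
      return false; // this register already had a def
  return true;
}

// Non-SSA shape: both branches write vreg 1, then vreg 1 is used.
inline std::vector<ToyOp> nonSSAExample() {
  return {{"some_sdwa_cand", 1, {}},
          {"other_sdwa_cand", 1, {}},
          {"use", 9, {1}}};
}

// SSA shape: the branches write vregs 1 and 2, a PHI merges them into
// vreg 3, and the use reads vreg 3.
inline std::vector<ToyOp> ssaExample() {
  return {{"some_sdwa_cand", 1, {}},
          {"other_sdwa_cand", 2, {}},
          {"PHI", 3, {1, 2}},
          {"use", 9, {3}}};
}
```

In the SSA shape, the use reads the PHI result rather than either candidate directly, so looking up the single def of the used register never has to choose between the two branches.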

> This is an SSA pass, so there is only one def per register. Using getUniqueVRegDef is not necessary.

Missed that, thanks.

This revision is now accepted and ready to land. Apr 11 2017, 10:52 AM
This revision was automatically updated to reflect the committed changes.