This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Enable output modifiers for double precision instructions
Closed, Public

Authored by bcahoon on Mar 29 2021, 7:04 AM.

Details

Summary

Update SIFoldOperands pass to recognize v_add_f64 and v_mul_f64 instructions for folding output modifiers.
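
As a rough illustration of what folding an output modifier means here: the hardware output modifier scales an instruction's result by 2, 4, or 0.5, so a multiply by one of those constants (or an add of a value to itself) can be absorbed into the instruction that produced the value. The sketch below is a standalone model, not code from the patch. The names `OutMod`, `omodForF64Multiplier`, and `omodForF64Add` are hypothetical, and the real pass performs additional checks (source modifiers, use counts) that are omitted.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the backend's output-modifier encoding.
enum class OutMod { None, Mul2, Mul4, Div2 };

// Reinterpret a double as its raw IEEE-754 bit pattern.
static std::uint64_t bitsOf(double Value) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Map a constant f64 multiplier to the output modifier that could replace it.
// Only exactly 0.5, 2.0, and 4.0 qualify; any other value yields None.
OutMod omodForF64Multiplier(double Multiplier) {
  switch (bitsOf(Multiplier)) {
  case 0x3FE0000000000000ULL: // 0.5
    return OutMod::Div2;
  case 0x4000000000000000ULL: // 2.0
    return OutMod::Mul2;
  case 0x4010000000000000ULL: // 4.0
    return OutMod::Mul4;
  default:
    return OutMod::None;
  }
}

// x + x equals x * 2.0, so an add whose two sources are the same register
// can be treated as a multiply by 2. Register numbers are plain integers
// in this model.
OutMod omodForF64Add(unsigned Src0Reg, unsigned Src1Reg) {
  return Src0Reg == Src1Reg ? OutMod::Mul2 : OutMod::None;
}
```

Comparing bit patterns rather than floating-point values keeps -2.0 and other near misses from matching by accident.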

Event Timeline

bcahoon created this revision. Mar 29 2021, 7:04 AM
bcahoon requested review of this revision. Mar 29 2021, 7:04 AM
Herald added a project: Restricted Project. Mar 29 2021, 7:04 AM

Should also have some negative tests where the denormal mode, IEEE mode, or signed zeros don't match.
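
The three conditions named in this comment can be read as a legality check that the negative tests would exercise. The sketch below is a simplified model under stated assumptions, not the pass's actual code: the struct `OModFoldContext`, its field names, and `mayFoldOutputModifier` are hypothetical. It assumes the output modifier is ignored by hardware when IEEE mode is on or when output denormals are enabled, and that it does not handle signed zeros correctly.

```cpp
// Hypothetical summary of the state that decides whether folding is safe.
struct OModFoldContext {
  bool IEEEMode;              // function runs with the IEEE mode bit set
  bool F64OutputDenormals;    // f64 output denormals are enabled (not flushed)
  bool FunctionNoSignedZeros; // function-wide no-signed-zeros attribute
  bool InstrHasNsz;           // the instruction itself carries the nsz flag
};

// Return true only when none of the three blocking conditions applies.
bool mayFoldOutputModifier(const OModFoldContext &Ctx) {
  if (Ctx.IEEEMode)
    return false; // assumed: omod is ignored in IEEE mode
  if (Ctx.F64OutputDenormals)
    return false; // assumed: omod is ignored when denormals are enabled
  // Signed zeros must be ignorable, either function-wide or per instruction.
  return Ctx.FunctionNoSignedZeros || Ctx.InstrHasNsz;
}
```

Each negative test would flip exactly one of these inputs and expect the multiply to remain unfolded.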

bcahoon updated this revision to Diff 334167. Mar 30 2021, 7:51 AM

Update with negative test cases.

arsenm added inline comments. Mar 30 2021, 11:24 AM
llvm/test/CodeGen/AMDGPU/omod.ll
102–103

Should just use the minimum set of fast flags

120–121

Should just use the minimum set of fast flags

138–139

Should just use the minimum set of fast flags

376

You're using the fast math flags, so you don't need the global no-signed-zeros-fp-math attribute.

bcahoon updated this revision to Diff 334301. Mar 30 2021, 4:35 PM

Use nsz and nnan flags. Ignore IEEE mode if nnan flag is set.

arsenm added inline comments. Mar 30 2021, 5:18 PM
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1777

This is a separate change

arsenm added inline comments. Mar 30 2021, 5:19 PM
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1777

Also would need a comment explaining that IEEE mode only changes sNaN behavior, so nnan lets us ignore it.

bcahoon updated this revision to Diff 334432. Mar 31 2021, 7:11 AM

Remove IEEE mode/nnan change. Will add separately.

arsenm accepted this revision. Mar 31 2021, 9:15 AM
This revision is now accepted and ready to land. Mar 31 2021, 9:15 AM