This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: do not fold clamp instructions when sources are different
ClosedPublic

Authored by hakzsam on Sep 22 2017, 5:04 AM.

Details

Reviewers
arsenm
nhaehnle
Summary
The given sequence of instructions:

a = fadd(x, y)
b = fadd(x, z)
r = clamp(max(a, b), 0.0, 1.0)

is definitely not equivalent to:
r = fadd(x, y) with clamp enabled

This is because we don't check that source operands of max()
are equivalent before folding the instruction.

This has been reported by Alex Smith (Feral).

Diff Detail

Event Timeline

hakzsam created this revision.Sep 22 2017, 5:04 AM
nhaehnle accepted this revision.Sep 29 2017, 8:14 AM

Maybe you could add a helpful comment on isClamp? Either way, LGTM.

This revision is now accepted and ready to land.Sep 29 2017, 8:14 AM

Maybe you could add a helpful comment on isClamp? Either way, LGTM.

Thanks for accepting this review. What kind of comment do you think should be useful?
Also, could you push the patch for me? I don't have LLVM commit rights.

Mostly it took me a while to grok why the function was looking at MAX instructions. As far as I understand now, it's because the code that deduces the clamp output modifier happens to always produce it on a MAX(x,x) instruction, and not a MIN.

arsenm closed this revision.Oct 4 2017, 5:15 PM
arsenm edited edge metadata.

r314951