Details

Reviewers

arsen
arsenm
tstellar
jvesely

Commits

rGad21f2687dcc: [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics
rL319025: [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics

Summary

Presently the AMDGPU backend fails with an "unsupported call to function" error when it encounters a call to the llvm.log{,10}.{f16,f32} intrinsics. This patch adds custom lowering to avoid that error on both R600 and SI.

Diff Detail

Repository: rL LLVM

Event Timeline

rivanvx created this revision. Feb 14 2017, 7:42 AM

Herald added subscribers: tpr, nhaehnle, arsenm. Feb 14 2017, 7:42 AM

rivanvx edited the summary of this revision. Feb 14 2017, 7:43 AM

Needs tests

lib/Target/AMDGPU/SIISelLowering.cpp
297	Should also handle f16?
3478	I've been meaning to remove the dependence on the math.h constants. Can you define this somewhere locally?

Better version: handle log10 as well, define constants locally, handle f16, do not pretend to handle f64. Tests incoming.

rivanvx retitled this revision from "Add custom lowering for llvm.log.f32 intrinsic" to "Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics". Feb 15 2017, 9:26 AM

rivanvx edited the summary of this revision.

Needs tests

rivanvx updated this revision to Diff 89036. Feb 18 2017, 12:04 PM

rivanvx edited the summary of this revision.

jvesely added a subscriber: jvesely. Feb 20 2017, 9:59 AM

jvesely added inline comments.

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
250 ↗	(On Diff #89036)	v2f32 and v4f32 can be moved to the for (MVT VT : FloatVectorTypes) { block (line ~398)
1891 ↗	(On Diff #89036)	Does FDIV have good enough precision to do this? OCL requires 2.5 ULP, and I'm not sure how good the EG/CM hw is. libclc uses precomputed constants and multiplication, maybe the same can be applied here.

TODO: TESTS NEED TO BE UPDATED

Addressed the comments.

arsenm added inline comments. Mar 9 2017, 10:56 AM

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
16–18 ↗	(On Diff #91190)	Should name something else that doesn't collide with the standard names
test/CodeGen/AMDGPU/llvm.log.f16.ll
2 ↗	(On Diff #91190)	Can you also add gfx900 lines for testing <2 x half>
test/CodeGen/AMDGPU/llvm.log10.ll
3 ↗	(On Diff #91190)	Remove -mcpu=SI. Also should sort r600 lines later

Sorry for the delay. The updated patch now passes the tests and addresses all the comments. I would appreciate it if you could check whether I got all the R_ and A_ prefixes correct in the f16 tests (it's just variable naming, but there is some logic behind it: register vs. address, I presume?).

LGTM with minor test cleanup

test/CodeGen/AMDGPU/llvm.log10.ll
2 ↗	(On Diff #95598)	s/SI/GCN
18 ↗	(On Diff #95598)	Can you name these vars?

This revision is now accepted and ready to land. Apr 18 2017, 1:58 PM

This should be it.

Herald added a subscriber: wdng. Apr 19 2017, 4:54 AM

That was only a partial diff, should be correct now.

jvesely added inline comments. Apr 24 2017, 8:26 AM

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1895 ↗	(On Diff #95718)	You can pass the log2base constant here to avoid the second switch and simplify the code. Just a nitpick.
1914 ↗	(On Diff #95718)	Using FMUL and inverted Log2Base should be both faster and more precise.

@jvesely I would like to get an ACK before I tackle the tests.

looks ok to me. hope you were not waiting for me this whole time.

In D29942#861582, @jvesely wrote:

looks ok to me. hope you were not waiting for me this whole time.

No, absolutely not, it was a combination of limited time for Clover/Clang/LLVM and reluctance to finish things that are almost done but not "interesting" to complete. Thanks for the OK, full patch coming much sooner than this one. :-)

This should be it.

jvesely accepted this revision. Oct 12 2017, 9:37 AM

LGTM

Closed by commit rL319025: [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics (authored by vedranm). Nov 27 2017, 5:26 AM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics
Closed, Public

Revision Contents

Diff 88374

lib/Target/AMDGPU/SIISelLowering.h

lib/Target/AMDGPU/SIISelLowering.cpp