This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Add lowering of i128 params.
ClosedPublic

Authored by denzp on Jun 23 2017, 7:07 AM.

Details

Summary

The patch adds support of i128 params lowering. The changes are quite trivial to support i128 as a "special case" of integer type. With this patch, we lower i128 params the same way as aggregates of size 16 bytes: .param .b8 _ [16].

Currently, NVPTX can't deal with the 128 bit integers:

  • in some cases because of failed assertions like ValVTs.size() == OutVals.size() && "Bad return value decomposition"
  • in other cases emitting PTX with .i128 or .u128 types (which are not valid [1])

[1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types

Diff Detail

Event Timeline

denzp created this revision.Jun 23 2017, 7:07 AM
jlebar edited edge metadata.Jun 23 2017, 9:27 AM

I'd like tra to look at this -- he's much more of an expert in selection DAG than I am -- but he's on vacation for another week. Are you OK waiting?

I'd like tra to look at this -- he's much more of an expert in selection DAG than I am -- but he's on vacation for another week. Are you OK waiting?

Of course! Code review should never be hurried :)

tra requested changes to this revision.Jul 5 2017, 10:47 AM

In general the patch looks OK.
One thing that you need to make sure is that alignment of i128 on nvptx side always matches that on the host side. AFAICT, i128 alignment on x64 is 16 (see D28990) , but this patch shows that it's 8 on nvptx side. That would result in different in-memory representation of aggregates between host and device.

lib/Target/NVPTX/NVPTXAsmPrinter.cpp
403

Nit. I'd group together (Ty->isIntegerTy() && !Ty->isIntegerTy(128))

1428–1434

Please add a test case for this.

lib/Target/NVPTX/NVPTXISelLowering.cpp
1279

Same nit as above.

This revision now requires changes to proceed.Jul 5 2017, 10:47 AM
denzp updated this revision to Diff 105451.Jul 6 2017, 9:27 AM
denzp edited edge metadata.

Fixed i128 params alignment
Added regression test for global variables
Couple minor code readability improvements

denzp marked 3 inline comments as done.Jul 6 2017, 9:30 AM

Thanks for the review, @tra! I fixed alignment and some lines according to your comments.

tra accepted this revision.Jul 6 2017, 9:37 AM

Nice. With alignment=16 we also benefit from 128-bit loads/stores.

This revision is now accepted and ready to land.Jul 6 2017, 9:37 AM
denzp added a comment.Jul 6 2017, 9:43 AM

Could you please land it for me?

This revision was automatically updated to reflect the committed changes.
mkuper added a subscriber: mkuper.Jul 6 2017, 4:25 PM

I've reverted this in r307334 because it breaks a lot of clang tests with:

error: 'error' diagnostics seen but not expected:

(frontend): backend data layout 'e-i64:64-i128:128-v16:16-v32:32-n16:32:64' does not match expected target description 'e-i64:64-v16:16-v32:32-n16:32:64'
denzp reopened this revision.Jul 7 2017, 4:21 AM

Is it safe just to change data layout at clang side? Won't it break backward compatibility?

This revision is now accepted and ready to land.Jul 7 2017, 4:21 AM
tra requested changes to this revision.Jul 7 2017, 9:29 AM

Is it safe just to change data layout at clang side? Won't it break backward compatibility?

Clang and LLVM sides of this change should land together as they have to be in sync.

Other than that, AFAICT it should be safe as it affects only i128 which is not currently supported, so there should be no backward compatibility issues. That's in theory. You may want to ask this question on cfe-dev@.

This revision now requires changes to proceed.Jul 7 2017, 9:29 AM
In D34555#802254, @tra wrote:

Is it safe just to change data layout at clang side? Won't it break backward compatibility?

Clang and LLVM sides of this change should land together as they have to be in sync.

Other than that, AFAICT it should be safe as it affects only i128 which is not currently supported, so there should be no backward compatibility issues. That's in theory. You may want to ask this question on cfe-dev@.

Looks like we can simply update Clang's data layout and it should not break anything: cfe-dev thread.

Do I need to create a "differential" for the clang patch? I attached the diff to my previous comment here, and changes there are quite trivial.

tra accepted this revision.Jul 20 2017, 11:37 AM
This revision is now accepted and ready to land.Jul 20 2017, 11:37 AM
This revision was automatically updated to reflect the committed changes.