This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Promote i24, i40, i48 and i56 to next power-of-two register when passing
ClosedPublic

Authored by kjetilkjeka on Jul 7 2022, 7:29 AM.

Details

Summary

Today llc will crash when attempting to use non-power-of-two integer types as function arguments or returns. This patch enables passing non standard integer values in functions by promoting them before store and truncating after load.

The main motivation of implementing this change is that rust casts small structs (less than pointer size) into an integer of the same size. As an example, if a struct contains three u8 then it will be passed as an i24. This patch is a step towards enabling rust compilation to ptx while retaining the target independent optimizations.

The tests reflects that it is mostly the multiple of 8 integers less than 64 that is of interest. I have locally tested some non-multiple-of-eight integers but decided against writing tests for them as nothing should really rely on them. Let me know if you want a few of those in addition.

More context can be found in my original github issue


This is my first LLVM contribution and I hope I have done everything by your contribution guide. Let me know if I should fix anything and I will do my best to get it done. I'm also not very familiar with the LLVM codebase yet and I have not received any external input on the content of this patch beyond the test passing. If anything looks fishy it is probably something I have misunderstood.

Diff Detail

Event Timeline

kjetilkjeka created this revision.Jul 7 2022, 7:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2022, 7:29 AM
kjetilkjeka requested review of this revision.Jul 7 2022, 7:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2022, 7:29 AM
kjetilkjeka added a comment.EditedJul 11 2022, 8:51 AM

When I run the failing test on my machine with <build>/tools/clang/tools/extra/clangd/unittests/./ClangdTests --gtest_filter=TUSchedulerTests.PreambleThrottle it passes even for the exact same pre-merge revision as the CI uses. Is this some sort of sporadic failure?

(Also, if I understand llvm correctly, the result of this test should not be changed by the nvptx changes I made)
`

tra added a comment.Jul 11 2022, 4:54 PM

General nit: patch should be submitted on phabricator using large context. Please see: https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

All i*-param.ll tests should probably be combined into one test file. They do pretty much the same thing.

Also, there should probably be test cases for integer sizes that are not multiples of 8. E.g i49.

llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp
1500–1503

Size roundup calculation appears to be repeated in many places and could be extracted into a helper function.

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
216

This looks a bit odd -- checking for the same boundary in two places, implicitly skipping power-of-2 sized integers...

I'd restructure it a bit to look like this:

if (VT.isScalarInteger()) {
   auto n = VT.getFixedSizeInBits();
   if (isPowerOf2_32(n))
     return false;
   switch(PowerOf2Ceil(n)){
      default: return false; // Covers i1 and integers larger than i64
      case 2:
      case 4:
      case 8:       *PromotedVT = MVT::i8; break;
      case 16:     *PromotedVT = MVT::i16; break;
      case 32:     *PromotedVT = MVT::i32; break;
      case 64:     *PromotedVT = MVT::i64; break;
    }
    return true
}

Thanks for the review!

I have tried to address all the requested changes and created the new diff with the large context.

I added a couple of non standard tests, but LLVM seem to handle the promote/truncate in many different ways for the different non standard integers so I'm not sure at what level of detail it makes sense to test for here. Let me know if you want more tests.

tra added a comment.Jul 12 2022, 11:08 AM

I have tried to address all the requested changes and created the new diff with the large context.

Thank you. It makes patch reviewing much easier.

I added a couple of non standard tests, but LLVM seem to handle the promote/truncate in many different ways for the different non standard integers so I'm not sure at what level of detail it makes sense to test for here. Let me know if you want more tests.

We want to make sure that param loads/stores are done using correct sizes. We do not really care how LLVM does extension/truncation of non-power-of-2 sized integers.
So, in practice we only need to check for relevant ld.param/st.param instructions.
We don't need to enumerate all possible sizes. Something like i3/i11/i23/i47 should be sufficient to exercise the code in this patch.
We do have test/CodeGen/NVPTX/param-load-store.ll so the new test cases should probably go there as well.

BTW, for IR test cases here it's convenient to get the function to call itself, so you do not need both caller and callee and can test all relevant functionality in one place.

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
217

I think we may simplify it a bit further. Get rid of this check, always set PromotedVT to the appropriate size and return true if the type changed.

switch(PowerOf2Ceil(n)){
      default: llvm_unreachable();
      case 1:      *PromotedVT = MVT::i1; break;
      case 2:
      case 4:
      case 8:       *PromotedVT = MVT::i8; break;
      case 16:     *PromotedVT = MVT::i16; break;
      case 32:     *PromotedVT = MVT::i32; break;
      case 64:     *PromotedVT = MVT::i64; break;
    }
    return EVT(*PromotedVT) != VT;
1327–1330

Use promoteScalarArgumentSize() here, too?

1381–1384

ditto.

1555–1562

This could also use promoteScalarArgumentSize().

if ((VT.isInteger() || VT.isFloatingPoint())
  TypeSize = promoteScalarArgumentSize(TypeSize*8)/8.
1690–1693

ditto.

kjetilkjeka marked an inline comment as done.Jul 13 2022, 4:07 AM

Thanks for being so responsive and helpful! It helps a lot being my first time attempting to contribute to LLVM.

I added two small questions to your comments as inline comments.

We want to make sure that param loads/stores are done using correct sizes. We do not really care how LLVM does extension/truncation of non-power-of-2 sized integers.
So, in practice we only need to check for relevant ld.param/st.param instructions.
We don't need to enumerate all possible sizes. Something like i3/i11/i23/i47 should be sufficient to exercise the code in this patch.

Should I then also remove the attempts of testing the truncating/extending to make all tests the same?

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
217

I will fix this together with the change from the other comment when I have gotten your opinion on where to put the helper from the other comment but first just a question about the case default.

Isn't it better to keep the case default: return false; to allow for attempting to promote the EVT and doing somewhere else if there is not a proper way to do it? I'm thinking more of the future scenario where things like i99 and i1000 is supported and you might first try to promote the integer type and if it fail then instead try to split it up into parts. I guess it doesn't make a big difference for this patch because things like i99 or i1000 will fail to compile both before and after the patch anyway. The only practical difference is the type of error you get if you try to feed these kind of integers to llc.

1327–1330

We would then need to put promoteScalarArgumentSize in a place that would be visible from both files (NVPTXSelLowering.cpp and NVPTXAsmPrinter.cpp). Would it make sense to use NVPTXUtilities.cpp for that?

tra added a comment.Jul 13 2022, 1:30 PM

Should I then also remove the attempts of testing the truncating/extending to make all tests the same?

If you want to. Or it could be cleaned up in a separate patch. Leaving existing tests as is is fine, too.

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
217

Isn't it better to keep the case default: return false; to allow for attempting to promote the EVT and doing somewhere else if there is not a proper way to do it?

If there's a valid use case for it, then sure. On the other hand, if we are only expected to handle specific set of types (i1-i64), then I'd prefer to know right away if we ever end up with something that we can't handle. It's better to fail right away rather than ignore it and then have to debug it the hard way when things go wrong somewhere downstream. If we do grow a legitimate need to pass through other types unpromoted it will be easy enough to change.

1327–1330

I think you could just put it as an inline function into NVPTXUtilities.h

kjetilkjeka updated this revision to Diff 444662.EditedJul 14 2022, 7:57 AM

I have implement your review comments.

Looking at the tests from test/CodeGen/NVPTX/param-load-store.ll helped a lot and I think the quality of the tests are better now.

I also discovered a bug when implementing the i2 and i3 test related to promoting the type to i8 but using a i16 as the container but it should be fixed now.

tra added inline comments.Jul 14 2022, 1:57 PM
llvm/test/CodeGen/NVPTX/param-load-store.ll
559–562 ↗(On Diff #444662)

This, and other tests dealing with multiple loads/stores of parts of the argument may be fragile. First, the order of loads would not necessarily be guaranteed. Second -- the way LLVM reconstructs the value from parts may also change.

I would stick with a set of CHECK-DAG: ld.param.XX {{.*}} [test_i23_param_0+Y]. All we care about here is that we load the right set of bits. We can assume that reconstructing the integer is handled by appropriate tests already.

kjetilkjeka marked 8 inline comments as done.

I would stick with a set of CHECK-DAG: ld.param.XX {{.*}} [test_i23_param_0+Y]. All we care about here is that we load the right set of bits. We can assume that reconstructing the integer is handled by appropriate tests already.

I realize that you say that all we care about is loading the right set of bits. But I assume we would like to check it for returns/args on both the caller and callee side? These four cases is basically what I'm doing now in addition to checking that the function is actually called. I have removed the checking of truncating/promoting as you mentioned.

More specifically I'm checking the reads from the parameter in the call is done for the correct amount of bytes with CHECK-DAG. While the pass of the return is done with the promoted type. I'm also checking the pass and return from the call is being done with the correctly promoted type. I'm not checking anything related to truncating or promoting the integers before or after the load/store to param space.

Let me know if you want even less checks in the tests, like in fact only checking the read is being done with the correct amount of bytes, and I will fix it.

tra accepted this revision.Jul 15 2022, 12:35 PM

LGTM with one test nit.

I would stick with a set of CHECK-DAG: ld.param.XX {{.*}} [test_i23_param_0+Y]. All we care about here is that we load the right set of bits. We can assume that reconstructing the integer is handled by appropriate tests already.

I realize that you say that all we care about is loading the right set of bits. But I assume we would like to check it for returns/args on both the caller and callee side? These four cases is basically what I'm doing now in addition to checking that the function is actually called. I have removed the checking of truncating/promoting as you mentioned.

Let me rephrase -- we only care about loads/stores from the param space. That does apply to both load of parameters from the function arguments and storing them to pass parameters and return values.
I think we are in agreement on what we need to check. The comment was largely about checks for cvt, shl and or instructions we match (e.g a few still remain in test_i24).

llvm/test/CodeGen/NVPTX/param-load-store.ll
567 ↗(On Diff #444947)

we want CHECK-DAG on loads here.

Nit: no need to check logical ops.

This revision is now accepted and ready to land.Jul 15 2022, 12:35 PM
kjetilkjeka updated this revision to Diff 445212.EditedJul 16 2022, 2:37 AM
kjetilkjeka marked 2 inline comments as done.

Seems like I forgot changing the i24 tests. It should be fixed now.

I still do not know why these CI tests are failing? I don't think it should have anything to do with this change though? I assume it's OK since you have not commented on it.

I hope that you will apply the patch to the code-base since I obviously do not have write access.

Thanks for helping and reviewing!

tra added a comment.Jul 20 2022, 11:01 AM

Sanitizer test failures are AFAICT unrelated.

I will land your patch a bit later this week. Thank you for contributing these changes.

This revision was landed with ongoing or failed builds.Jul 22 2022, 2:15 PM
This revision was automatically updated to reflect the committed changes.