This is an archive of the discontinued LLVM Phabricator instance.

[mlir][arith] Add wide integer emulation pass
ClosedPublic

Authored by kuhar on Sep 1 2022, 12:00 PM.

Details

Summary

In this first patch in a series to add wide integer emulation:

  • Set up the initial pass structure
  • Add a custom type converter
  • Handle func ops

The initial implementation supports power-of-two integers types only. We
emulate wide integer operations by splitting original i2N integer types
into two iN halves

My immediate use case is to emulate i64 operations using i32 ones
on mobile GPUs that do not support i64.

Diff Detail

Event Timeline

kuhar created this revision.Sep 1 2022, 12:00 PM
Herald added a project: Restricted Project. · View Herald Transcript
kuhar requested review of this revision.Sep 1 2022, 12:00 PM

As a general question, why limit yourself to the next power of 2? (and also, that limitation didn't feel documented)

Would it be too much bother to allow, for example, i128 -> 4xi32?

kuhar added a comment.Sep 1 2022, 2:20 PM

As a general question, why limit yourself to the next power of 2?

Would it be too much bother to allow, for example, i128 -> 4xi32?

I had a few reasons:

  1. In my use case, I don't need that level of generality.
  2. Specializing for the next power of 2 makes the implementation and testing simpler. I'd expect a lot more corner cases to appear once we allow arbitrarily large integers. For example, what if someone wants to emulate i64 with i1s?
  3. Some ops get quite inefficient when emulated. For example, my conversion pattern for arith.muli` generates ~60 instructions already when splitting the bit width in half.
  4. The current implementation provides a workaround: you can run the emulation pass multiple times, gradually decreasing the widest int type supported.

(and also, that limitation didn't feel documented)

The patch mentions this in the pass description and in the comment by the create function. Do you have some recommendations on how to communicate this better?

kuhar added a comment.Sep 1 2022, 2:25 PM

(and also, that limitation didn't feel documented)

The patch mentions this in the pass description and in the comment by the create function. Do you have some recommendations on how to communicate this better?

Ah, I think I see what you mean. It does mention i2N -> iN but not that N has to be a power of two.

I think it was more that the description gave the impression that this was
the general case when I first saw it.

If this were explicitly a "split wide integer math in half" pass it'd be
less confusing - "wide integer emulation" feels like it'd handle the
general case.

(and if someone wants to autoconvert i64 math to i1, let 'em - I expect the
compiler will translate add with carry in and carry out to the relevant
Boolean operators or tell the user to get lost)

kuhar updated this revision to Diff 457486.Sep 1 2022, 8:07 PM

Update the pass description and comments to be more explicit about supported integer types and how we emulate wide integer operations.

kuhar updated this revision to Diff 457489.Sep 1 2022, 8:09 PM

Fix a comment.

kuhar added a comment.Sep 1 2022, 8:13 PM

Thanks for the feedback, @krzysz00.

In principle, I think that generalizing this pass to use a more general emulation scheme could be useful. For the initial implementation, I'd prefer to keep things simple (and sound) and cover the use case we have in mind.

I updated the comments, documentation, and revision description to be more explicit about the supported types and how we emulate wide operations. Does this address your concerns?

kuhar edited the summary of this revision. (Show Details)Sep 1 2022, 8:15 PM

I have a little question, what about if some target does support different max integer width between scalar and vector type? For example, in RISCV, RV64 + Zve32x supports i64 integer scalar but only max to i32 integer vector type. What should we set for this case?

kuhar added a comment.Sep 6 2022, 7:25 AM

I have a little question, what about if some target does support different max integer width between scalar and vector type? For example, in RISCV, RV64 + Zve32x supports i64 integer scalar but only max to i32 integer vector type. What should we set for this case?

Right now this is not supported, but I think we should be able to handle this. We can add a second option to the pass and type converter for the max vector int width supported so that integers and vectors of integers get converted to different types. As for the types used for emulation, we convert ints to 2-element vectors but don't perform any dynamic operations on them -- we only use vectors to keep two results in one value. If this is not supported by the target, we could instead use a struct with 2 elements or tuple with 2 values.

Mogball requested changes to this revision.Sep 6 2022, 9:17 AM
Mogball added inline comments.
mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.h
15–16
34

Do you need this constructor? I think it's auto-generated by ODS

39

Returning a pointer to a TypeConverter is a little odd. The usual pattern is to subclass TypeConverter (e.g. WideIntTypeConverter) and expose it in the header

mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.td
60

I would put efficiency as a TODO

67
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
12–21
31

This sentence would suggest the type converter is capable of converting multiple integer types, but in fact only one type is supported: i2N since all other types are rejected by the converter.

31–37
48
54
64
68

I would expect vector<4xi64> to become vector<4x2xi32> instead of vector<2x4xi32>

111

Why are your asserting false? The program will abort instead of allowing the pass to gracefully fail.

126

Is this necessary for a partial conversion?

135

Can you outline this lambda?

141

This shouldn't be necessary in a partial conversion

mlir/test/Dialect/Arithmetic/emulate-wide-int.mlir
33

Can you add a test for a scalar i64 conversion?

This revision now requires changes to proceed.Sep 6 2022, 9:17 AM
antiagainst requested changes to this revision.Sep 6 2022, 9:45 AM
antiagainst added inline comments.
mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.h
34

Nit: widestMachineIntBitwidth? widestIntSupported makes me think that the widest integer supported for emulation is 32.. Would also be nice to explain the parameter in the comments.

mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.td
56

... on narrower integer ...

62

.. only power-of two integer bitwidths ...

mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
46

Just capture this?

61

Same here.

63

In MLIR the convention is to do early returns.

68

Hmm, I'd expect this to be vector<...x2xiN> to keep the two halves close. The current way it would require transposing.

142

These shouldn't be unconditionally legal right? Actually the dynamic legality specification at L129 should applicable to all ops.

+1 to progressively build up the functionality. This way we can make sure we build up sufficient amount of testing to cover various cases too.

kuhar updated this revision to Diff 458257.Sep 6 2022, 1:19 PM
kuhar marked 22 inline comments as done.

Addressed non-functional issues.

mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.h
34

Thanks, I didn't realize ODS also automates constructors with pass options.

mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
126

It seems so to me. I added a test case that demonstrates this. Without these materializations, it fails with:

error: failed to legalize un
resolved materialization from 'i64' to 'vector<2xi32>' that remained live after conversion                                          
    return %i : i64                                               
    ^
kuhar updated this revision to Diff 458292.Sep 6 2022, 3:09 PM

Moved type converter to its own header. Changed vector layout.

kuhar marked 3 inline comments as done.Sep 6 2022, 3:12 PM

All comments should be addressed now. Could you take a second look, @Mogball and @antiagainst?

mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
68

Thanks for the suggestion. Initially, I chose vector<2x..iN> because it seemed easy to construct/destruct vectors with this new outer dim. However, your suggestion makes more sense because it essentially allows for bitcasting between original types and emulated ones.

kuhar marked an inline comment as done.Sep 6 2022, 3:12 PM
Mogball added inline comments.Sep 6 2022, 8:34 PM
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
29

Please move these all out of the anonymous namespace and make them static.

kuhar updated this revision to Diff 458461.Sep 7 2022, 8:15 AM

Made helper functions static. Rebased.

kuhar marked an inline comment as done.Sep 7 2022, 8:15 AM
Mogball requested changes to this revision.Sep 7 2022, 8:28 AM
Mogball added inline comments.
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
27

I would just inline this function in addDynamicallyLegalOp

41

Wouldn't this allow i0 or i1?

59

can you split this up with target.addDynamicallyLegalOp<func::funcOp>(...) and target.addDynamicallyLegalOp<func::CallOp, func::ReturnOp>(...) so that isOpLegal doesn't have to switch on func::FuncOp?

73–75
126

This seems like a test case that *should* fail. If an i64 operation could not be emulated (because it lacks an implementation, for example), then this pass should fail.

131
This revision now requires changes to proceed.Sep 7 2022, 8:28 AM
kuhar updated this revision to Diff 458467.Sep 7 2022, 8:45 AM
kuhar marked 3 inline comments as done.

Cleanups. Removed support for unhandled arith ops.

kuhar marked an inline comment as done.Sep 7 2022, 8:45 AM
kuhar added inline comments.
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
41

0 is not considered a power of 2. Emulating i2 with i1 is supported, although I haven't tested it.

59

thanks for the suggestion, this simplifies things

Mogball added inline comments.Sep 7 2022, 3:08 PM
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
41

Right, thanks.

Mogball accepted this revision.Sep 7 2022, 3:11 PM

LGTM

mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
58

Can you drop this since you have using namespace mlir declared at the top of the file?

kuhar updated this revision to Diff 458634.Sep 7 2022, 8:41 PM
kuhar marked an inline comment as done.

Dropped the mlir namespace.

antiagainst accepted this revision.Sep 8 2022, 8:56 AM
antiagainst added inline comments.
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
34

I think you can just return signalPassFailure() here.

This revision is now accepted and ready to land.Sep 8 2022, 8:56 AM
kuhar added inline comments.Sep 8 2022, 10:52 AM
mlir/lib/Dialect/Arithmetic/Transforms/EmulateWideInt.cpp
34

The return type is void, so this seems unusual to me. Not sure what the benefit would be apart from saving one line of code.

This revision was automatically updated to reflect the committed changes.