This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
include/llvm/CodeGen/
-
llvm/
-
CodeGen/
1/1
LowLevelType.h
-
lib/Target/AMDGPU/
-
Target/
-
AMDGPU/
-
AMDGPULegalizerInfo.cpp

Differential D150605

LLT: Add some stub constructors for FP types
ClosedPublic

Authored by arsenm on May 15 2023, 1:10 PM.

Download Raw Diff

Details

Reviewers

aemerson
paquette
dsanders
bogner
qcolombet
Pierre-vh

Summary

This is to start documenting uses to ease a future migration
to supporting different types with the same size.

Diff Detail

Event Timeline

arsenm created this revision.May 15 2023, 1:10 PM

Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 1:10 PM

Herald added subscribers: foad, kerbowa, hiraditya, jvesely. · View Herald Transcript

arsenm requested review of this revision.May 15 2023, 1:10 PM

Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 1:10 PM

Herald added a subscriber: wdng. · View Herald Transcript

Did we decide on different types as the route for this? I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.

Harbormaster completed remote builds in B232095: Diff 522311.May 15 2023, 3:23 PM

I'm a bit neutral regarding this change, I can't comment on whether it's better to have the information in the types or the operations.
Though I'm also wondering if binding the semantics to FP operation wouldn't make more sense?

However if we're going this direction I would prefer to see all relevant uses of LLT::scalar looked at and replaced with ::float where needed (not just within the AMD backend) to avoid further fragmentation.

llvm/include/llvm/CodeGen/LowLevelType.h
74	small nit: I would add that to make it clear those ctors were added with the intention of going that way later.

In D150605#4343994, @aemerson wrote:

Did we decide on different types as the route for this? I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.

I thought previous discussions on this were coalescing around adding something to LLT, but not the details of how that would look. My feeling is adding FP semantics to the operations would require more code on average, plus additional costs from operating differently from the high level IR

In D150605#4344853, @Pierre-vh wrote:

However if we're going this direction I would prefer to see all relevant uses of LLT::scalar looked at and replaced with ::float where needed (not just within the AMD backend) to avoid further fragmentation.

This isn't a complete set for AMDGPU either, and I don't see how it's fragmenting since this is just documenting at this point. Documenting FP uses like this will allow incrementally moving to a state where it's easier to introduce the new types

I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.

I don't think that would be enough because we have to deal with the values at the ABI boundaries. I.e., there may not be an operation to attach the information to.

On a second thought the ABI concerns are probably fine because we can get the information directly from the LLVM IR.

If we were to go with the different opcodes for the different floating types, we would probably need to make RegBankSelect better.
With the proliferation of FP types (I think there are at least 3 different FP8 implementation at this point), we would end up with a lot of opcodes.

In D150605#4344953, @arsenm wrote:

In D150605#4343994, @aemerson wrote:

Did we decide on different types as the route for this? I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.

I thought previous discussions on this were coalescing around adding something to LLT, but not the details of how that would look. My feeling is adding FP semantics to the operations would require more code on average, plus additional costs from operating differently from the high level IR

Yeah it did, but it's going to be a lot of work either way. I'm not opposed to this in principle though.

In D150605#4345232, @qcolombet wrote:

On a second thought the ABI concerns are probably fine because we can get the information directly from the LLVM IR.

Right, we already deal with this correctly with untyped scalars so nothing would get worse.

If we were to go with the different opcodes for the different floating types, we would probably need to make RegBankSelect better.
With the proliferation of FP types (I think there are at least 3 different FP8 implementation at this point), we would end up with a lot of opcodes.

Sure, my idea was something more akin to:
%dst = G_FADD %src1:(s64), %src2, F64 where F64 is a sort of symbolic operand that encodes the semantics of the operation without requiring a whole new class of instructions.

static constexpr LLT bfloat16() {
   return scalar(16);
 }

???

In D150605#4346990, @tschuett wrote:
static constexpr LLT bfloat16() {
   return scalar(16);
 }
???

Someday, when these actually do something besides create a scalar. Currently bfloat just miscompiles

Could you store an enum in LLT for the vast range of current and future floating point kinds?

In D150605#4347050, @tschuett wrote:

Could you store an enum in LLT for the vast range of current and future floating point kinds?

That's what I was thinking. For all practical purposes we have excess bits for size/element count to steal some (although we do break on cases where we can't represent an IR type in LLT. We'd need an EVT equivalent referring back to an IR type to handle 100% of types the DAG can deal with)

The innovative floating types ended in ApFloat and MLIR. I would be surprised if the number of floating point types in LLVM rapidly increases.

I think this is a reasonable thing to do assuming we're going in the direction of having FP info in the LLT rather than the ops (As was my suggestion way back here: https://discourse.llvm.org/t/rfc-globalisel-representing-fp-types-in-llt/57162). The main downside is that since it's just the API without any semantics with this change it could make things hard to debug if someone gets them wrong, but it's probably still better than having to migrate wholesale from having no types at all.

If we're going down this path, (which is ok with me), should we have a migration plan for this change? I'm thinking we start a thread on discourse to talk about intermediate steps. The last time I attempted this it got very hairy, very quickly. I'd like us to have a path where we can make incremental changes over time with robust verification, and get to a place where we can have a switch that we can flip for testing purposes before we enable permanently.

sameerds added a subscriber: sameerds.Jul 1 2023, 9:34 AM

Rebase

Harbormaster completed remote builds in B256095: Diff 555142.Aug 31 2023, 1:26 PM

LGTM. Have you thought about @aemerson's suggestion about a migration plan?

This revision is now accepted and ready to land.Sep 1 2023, 11:17 AM

In D150605#4634972, @bogner wrote:

LGTM. Have you thought about @aemerson's suggestion about a migration plan?

That's sort of the point of this. We can incrementally start documenting where FP types are supposed to be used with these. Later we can have a switch for making these do something different

77c67436d9e7e63c50752d990a271cf860ba9a0e

arsenm mentioned this in rG77c67436d9e7: LLT: Add some stub constructors for FP types.Sep 3 2023, 5:33 AM

Revision Contents

Path

Size

llvm/

include/

llvm/

CodeGen/

LowLevelType.h

16 lines

lib/

Target/

AMDGPU/

AMDGPULegalizerInfo.cpp

41 lines

Diff 555142

llvm/include/llvm/CodeGen/LowLevelType.h

Show First 20 Lines • Show All 65 Lines • ▼ Show 20 Lines

static constexpr LLT vector(ElementCount EC, LLT ScalarTy) {

return LLT{ScalarTy.isPointer(),

/*isVector=*/true,

/*isScalar=*/false,

EC,

ScalarTy.getSizeInBits().getFixedValue(),

ScalarTy.isPointer() ? ScalarTy.getAddressSpace() : 0};

}

/// Get a 16-bit IEEE half value.

Pierre-vhUnsubmitted

Done

ScalarTy.isPointer() ? ScalarTy.getAddressSpace() : 0};

}

/// Get a 16-bit IEEE half value.

+ /// TODO: Add IEEE semantics to type - This currently returns a simple `scalar(16)`.

static constexpr LLT float16() {

small nit: I would add that to make it clear those ctors were added with the intention of going that way later.

Pierre-vh: small nit: I would add that to make it clear those ctors were added with the intention of going…

/// TODO: Add IEEE semantics to type - This currently returns a simple `scalar(16)`.

static constexpr LLT float16() {

return scalar(16);

}

/// Get a 32-bit IEEE float value.

static constexpr LLT float32() {

return scalar(32);

}

/// Get a 64-bit IEEE double value.

static constexpr LLT float64() {

return scalar(64);

}

/// Get a low-level fixed-width vector of some number of elements and element

/// width.

static constexpr LLT fixed_vector(unsigned NumElements,

unsigned ScalarSizeInBits) {

return vector(ElementCount::getFixed(NumElements), ScalarSizeInBits);

}