This is to start documenting uses to ease a future migration
to supporting different types with the same size.
Details
Diff Detail
Event Timeline
Did we decide on different types as the route for this? I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.
I'm a bit neutral regarding this change, I can't comment on whether it's better to have the information in the types or the operations.
Though I'm also wondering if binding the semantics to FP operation wouldn't make more sense?
However if we're going this direction I would prefer to see all relevant uses of LLT::scalar looked at and replaced with ::float where needed (not just within the AMD backend) to avoid further fragmentation.
llvm/include/llvm/CodeGen/LowLevelType.h | ||
---|---|---|
74 | small nit: I would add that to make it clear those ctors were added with the intention of going that way later. |
I thought previous discussions on this were coalescing around adding something to LLT, but not the details of how that would look. My feeling is adding FP semantics to the operations would require more code on average, plus additional costs from operating differently from the high level IR
This isn't a complete set for AMDGPU either, and I don't see how it's fragmenting since this is just documenting at this point. Documenting FP uses like this will allow incrementally moving to a state where it's easier to introduce the new types
I'm still wondering if somehow adding more semantic operands to FP operations would be less intrusive.
I don't think that would be enough because we have to deal with the values at the ABI boundaries. I.e., there may not be an operation to attach the information to.
On a second thought the ABI concerns are probably fine because we can get the information directly from the LLVM IR.
If we were to go with the different opcodes for the different floating types, we would probably need to make RegBankSelect better.
With the proliferation of FP types (I think there are at least 3 different FP8 implementation at this point), we would end up with a lot of opcodes.
Yeah it did, but it's going to be a lot of work either way. I'm not opposed to this in principle though.
Right, we already deal with this correctly with untyped scalars so nothing would get worse.
If we were to go with the different opcodes for the different floating types, we would probably need to make RegBankSelect better.
With the proliferation of FP types (I think there are at least 3 different FP8 implementation at this point), we would end up with a lot of opcodes.
Sure, my idea was something more akin to:
%dst = G_FADD %src1:(s64), %src2, F64 where F64 is a sort of symbolic operand that encodes the semantics of the operation without requiring a whole new class of instructions.
Someday, when these actually do something besides create a scalar. Currently bfloat just miscompiles
Could you store an enum in LLT for the vast range of current and future floating point kinds?
That's what I was thinking. For all practical purposes we have excess bits for size/element count to steal some (although we do break on cases where we can't represent an IR type in LLT. We'd need an EVT equivalent referring back to an IR type to handle 100% of types the DAG can deal with)
The innovative floating types ended in ApFloat and MLIR. I would be surprised if the number of floating point types in LLVM rapidly increases.
I think this is a reasonable thing to do assuming we're going in the direction of having FP info in the LLT rather than the ops (As was my suggestion way back here: https://discourse.llvm.org/t/rfc-globalisel-representing-fp-types-in-llt/57162). The main downside is that since it's just the API without any semantics with this change it could make things hard to debug if someone gets them wrong, but it's probably still better than having to migrate wholesale from having no types at all.
If we're going down this path, (which is ok with me), should we have a migration plan for this change? I'm thinking we start a thread on discourse to talk about intermediate steps. The last time I attempted this it got very hairy, very quickly. I'd like us to have a path where we can make incremental changes over time with robust verification, and get to a place where we can have a switch that we can flip for testing purposes before we enable permanently.
That's sort of the point of this. We can incrementally start documenting where FP types are supposed to be used with these. Later we can have a switch for making these do something different
small nit: I would add that to make it clear those ctors were added with the intention of going that way later.