Describe the allowable values for the least-significant bit and width operands for
G_SBFX/G_UBFX. This is necessary when the values are run-time values (as opposed
to compile-time constants).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks for doing this. I think it's important that we reach agreement on these details.
I would suggest these rules (1):
G_SBFX, G_UBFX: 0 <= lsb < lsb + width <= bitwidth (all values unsigned)
This implies that width is not allowed to be 0.
Or as an alternative (2):
G_SBFX: 0 <= lsb < lsb + width <= bitwidth (all values unsigned) G_UBFX: 0 <= lsb <= lsb + width <= bitwidth (all values unsigned)
This allows 0 width for G_UBFX only. The rationale is that value of 0-width unsigned field is trivially 0, but the value of a 0-width signed field is not well defined -- it could be either 0 or -1.
On balance I prefer (1) because (a) I think it's simpler to have the same rule for both opcodes, and (b) allowing "width" to take any value from 0 to bitwidth inclusive makes it more awkward to expand these operations into shifts and masking.
llvm/include/llvm/Target/GenericOpcodes.td | ||
---|---|---|
1382–1383 | As mentioned I think lsb should be strictly less than bitwidth, here and below. |
Thanks for the suggestions! I updated the patch based upon the first suggestion. I agree that it is better for the reasons that you've described.
LGTM assuming others agree with these rules - maybe wait a day in case they have comments.
Shouldn't we also have verifier checks for these requirements in the statically known cases?
You mean, verify that the operand values are in range if they happen to be G_CONSTANTs? I don't think that's a good idea because constant-folding one of the operands could suddenly make the MIR fail verification.
No because the verifier should not be considering the context of the instruction. The operands are legal or not looking at just the instruction itself
llvm/docs/GlobalISel/GenericOpcode.rst | ||
---|---|---|
259 | What what is the result if this is violated? Poison? |
llvm/docs/GlobalISel/GenericOpcode.rst | ||
---|---|---|
259 | I think poison makes sense. Though, I'm not sure how that concept occurs in the gMIR? |
llvm/docs/GlobalISel/GenericOpcode.rst | ||
---|---|---|
259 |
I think it should be the same as what you get from any out-of-range shift. In IR that would be poison. but...
... I get the impression that codegen does not really distinguish between undef and poison (yet). Given that the description of shifts in this document does not mention the out of range case at all, I think the current patch is good to go as-is. Obviously it would be great if someone could add documentation about the out-of-range / undef / poison cases in future. |
What what is the result if this is violated? Poison?