This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Describe undefined values for G_SBFX/G_UBFX operands
ClosedPublic

Authored by bcahoon on Jun 14 2021, 10:53 AM.

Details

Summary

Describe the allowable values for the least-significant bit and width operands for
G_SBFX/G_UBFX. This is necessary when the values are run-time values (as opposed
to compile-time constants).

Diff Detail

Event Timeline

bcahoon created this revision.Jun 14 2021, 10:53 AM
bcahoon requested review of this revision.Jun 14 2021, 10:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2021, 10:53 AM
foad added a comment.Jun 15 2021, 1:08 AM

Thanks for doing this. I think it's important that we reach agreement on these details.

I would suggest these rules (1):

G_SBFX, G_UBFX: 0 <= lsb < lsb + width <= bitwidth (all values unsigned)

This implies that width is not allowed to be 0.

Or as an alternative (2):

G_SBFX: 0 <= lsb < lsb + width <= bitwidth (all values unsigned)
G_UBFX: 0 <= lsb <= lsb + width <= bitwidth (all values unsigned)

This allows 0 width for G_UBFX only. The rationale is that value of 0-width unsigned field is trivially 0, but the value of a 0-width signed field is not well defined -- it could be either 0 or -1.

On balance I prefer (1) because (a) I think it's simpler to have the same rule for both opcodes, and (b) allowing "width" to take any value from 0 to bitwidth inclusive makes it more awkward to expand these operations into shifts and masking.

llvm/include/llvm/Target/GenericOpcodes.td
1382–1383

As mentioned I think lsb should be strictly less than bitwidth, here and below.

bcahoon updated this revision to Diff 352718.Jun 17 2021, 7:34 AM

Improved description of legal values, based upon review comments.

Thanks for doing this. I think it's important that we reach agreement on these details.

I would suggest these rules (1):

G_SBFX, G_UBFX: 0 <= lsb < lsb + width <= bitwidth (all values unsigned)

This implies that width is not allowed to be 0.

Thanks for the suggestions! I updated the patch based upon the first suggestion. I agree that it is better for the reasons that you've described.

foad accepted this revision.Jun 17 2021, 9:07 AM
foad added reviewers: aemerson, arsenm.

LGTM assuming others agree with these rules - maybe wait a day in case they have comments.

This revision is now accepted and ready to land.Jun 17 2021, 9:07 AM

Shouldn't we also have verifier checks for these requirements in the statically known cases?

foad added a comment.Jun 18 2021, 2:31 AM

Shouldn't we also have verifier checks for these requirements in the statically known cases?

You mean, verify that the operand values are in range if they happen to be G_CONSTANTs? I don't think that's a good idea because constant-folding one of the operands could suddenly make the MIR fail verification.

Shouldn't we also have verifier checks for these requirements in the statically known cases?

No because the verifier should not be considering the context of the instruction. The operands are legal or not looking at just the instruction itself

llvm/docs/GlobalISel/GenericOpcode.rst
259

What what is the result if this is violated? Poison?

bcahoon added inline comments.Jun 18 2021, 2:59 PM
llvm/docs/GlobalISel/GenericOpcode.rst
259

I think poison makes sense. Though, I'm not sure how that concept occurs in the gMIR?

foad added inline comments.Jun 21 2021, 2:05 AM
llvm/docs/GlobalISel/GenericOpcode.rst
259

I think poison makes sense.

I think it should be the same as what you get from any out-of-range shift. In IR that would be poison. but...

Though, I'm not sure how that concept occurs in the gMIR?

... I get the impression that codegen does not really distinguish between undef and poison (yet).

Given that the description of shifts in this document does not mention the out of range case at all, I think the current patch is good to go as-is. Obviously it would be great if someone could add documentation about the out-of-range / undef / poison cases in future.