This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Increase applicability of ffine-grained-bitfield-accesses for targets with limited native integer widths
ClosedPublic

Authored by asb on Apr 30 2020, 12:17 AM.

Details

Summary

As pointed out in PR45708, -ffine-grained-bitfield-accesses doesn't trigger in all cases you think it might for RISC-V. The logic in CGRecordLowering::accumulateBitFields checks OffsetInRecord is a legal integer according to the datalayout. RISC targets will typically only have the native width as a legal integer type so this check will fail for OffsetInRecord of 8 or 16 when you would expect the transformation is still worthwhile.

This patch changes the logic to check for an OffsetInRecord of a at least 1 byte, that fits in a legal integer, and is a power of 2. We would prefer to query whether native load/store operations are available, but I don't believe that is possible.

Diff Detail

Event Timeline

asb created this revision.Apr 30 2020, 12:17 AM
asb added a subscriber: apazos.Apr 30 2020, 12:18 AM
asb updated this revision to Diff 261138.Apr 30 2020, 12:43 AM

Updated patch to include full context.

The -ffine-grained-bitfield-accesses seems to generate weird results in certain cases. For example, on x86 we generate an unaligned load in the following example.

struct S  { long c : 8; long z: 24; long : 0; };
struct S s;
int f() { return s.c+s.z; }

I guess that's not really an issue with this patch specifically, though; the general algorithm is just sort of weird.

clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
414

StartBitOffset is the offset in bits of the beginning of the current run, the comment above explains that. It's not obvious what OffsetInRecord is; I guess it's the size in bits of the current run? The name isn't really intuitive.

This patch seems reasonable; you want to allow any legal integer, and any smaller type that can probably be naturally loaded and stored.

asb added a comment.Jun 10 2020, 10:45 PM

Ping on this. The patch still applies cleanly against current HEAD.

@efriedma: was your comment an LGTM?

Please add a comment explaining what OffsetInRecord means; then LGTM.

asb added a comment.Jun 12 2020, 1:30 AM

Please add a comment explaining what OffsetInRecord means; then LGTM.

Thanks. It's not easy to follow, but having stepped through it I agree with yu that it is the size in bits of the current run. I'll add that comment and commit.

This revision was not accepted when it landed; it landed in state Needs Review.Jun 12 2020, 2:40 AM
This revision was automatically updated to reflect the committed changes.
dmgreen added a subscriber: dmgreen.Apr 7 2021, 1:45 AM

Hello. We've received reports that this is bloating codesize of some code, quite a lot in places. There is an example in https://godbolt.org/z/66TEKa1xK. Essentially the glomming together of reads/writes into i32's (in our case) helps to reduce the total number of loads/stores needed. Splitting that up into individual i8/i16's creates a lot more load/mask/load/mask/or/store sequences.

I suspect it's better in cases, worse in others. It depends on where exactly the bitfield delimitators lie. And whether they are more likely to be on i32/i64 boundaries (which I believe is common in a lot of cases bitfields are used in), or to be more randomly distributed.