This is an archive of the discontinued LLVM Phabricator instance.

SROA: Handle a case of store size being smaller than allocation size
ClosedPublic

Authored by rnk on Aug 21 2014, 2:36 PM.

Details

Summary

In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.

Fixes PR18726.

Diff Detail

Repository
rL LLVM

Event Timeline

rnk updated this revision to Diff 12806.Aug 21 2014, 2:36 PM
rnk retitled this revision from to SROA: Handle a case of store size being smaller than allocation size.
rnk updated this object.
rnk added a reviewer: chandlerc.
rnk added a subscriber: Unknown Object (MLST).

I think this patch might not be the right approach. The underlying issue is
that clang translates:

union U { long m0[3]; long double m1; };

to:

%union.U = type { x86_fp80, [16 x i8] }

The x86_fp80 type is 10 bytes of data aligned to 16 bytes. Is it
permissible to copy only the lower 10 bytes of an x86_fp80? I think the
answer should be yes. This is the behavior of the underlying x87 fldt and
fstpt instructions. If copying only the lower 10 bytes of a x86_fp80 is
permissible, then the C union was not correctly lowered to LLVM IR. I think
the correct resolution is to lower the union to:

%union.U = type { [4 x i64] }

chandlerc accepted this revision.Aug 21 2014, 5:10 PM
chandlerc edited edge metadata.

Looks good with a slightly compactified test.

test/Transforms/SROA/slice-width.ll
65 ↗(On Diff #12806)

Just test memset in the function above? We don't run other optimizations here, it's not going to forward or do dead store elim...

This revision is now accepted and ready to land.Aug 21 2014, 5:10 PM
rnk added inline comments.Aug 21 2014, 5:18 PM
test/Transforms/SROA/slice-width.ll
65 ↗(On Diff #12806)

I don't need to copy anything out, but I do need to load the i64 in order to cause SROA to fire. I guess I'll stick to that in both test cases.

rnk closed this revision.Aug 21 2014, 5:19 PM
rnk updated this revision to Diff 12816.

Closed by commit rL216248 (authored by @rnk).